Agent Frameworks
Some of the most popular AI agent frameworks:
- CrewAI
- AutoGen
- AgentStack - built on top of the previous two frameworks.
Personal favorite:
- Eliza - primarily made for social media and community management, but it has generic capabilities as well, with built-in RAG (Retrieval-Augmented Generation) and a vector DB interface.
Why? It is all in TypeScript, a language I much prefer to Python, which dominates among the data crowd. I have a dream that Python disappears one day, with all the “coders” and data folks who use it being replaced by AI agents that rewrite themselves in a more sensible language.
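For anyone unfamiliar with the RAG pattern mentioned above, here is a minimal TypeScript sketch of the retrieval step: rank stored documents by cosine similarity to a query embedding, then prepend the best matches to the prompt. This is a toy in-memory illustration, not Eliza's API. The names `Doc`, `cosineSimilarity`, `topK`, and `buildPrompt` are made up for this example, and the three-number "embeddings" are hand-written stand-ins for what a real embedding model would produce.

```typescript
// Toy in-memory vector store illustrating the retrieval step of RAG.
// All names are hypothetical; this is not Eliza's API.
type Doc = { id: string; text: string; embedding: number[] };

// Cosine similarity between two equal-length vectors (0 if either is all zeros).
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Return the k documents most similar to the query embedding.
function topK(query: number[], docs: Doc[], k: number): Doc[] {
  return [...docs]
    .sort(
      (x, y) =>
        cosineSimilarity(query, y.embedding) -
        cosineSimilarity(query, x.embedding)
    )
    .slice(0, k);
}

// Build a prompt by prepending the retrieved context to the question.
function buildPrompt(question: string, context: Doc[]): string {
  const ctx = context.map((d) => `- ${d.text}`).join("\n");
  return `Context:\n${ctx}\n\nQuestion: ${question}`;
}

// Hand-written 3-dimensional vectors stand in for real embeddings.
const docs: Doc[] = [
  { id: "a", text: "Santa lives at the North Pole.", embedding: [1, 0, 0] },
  { id: "b", text: "Turkey is served at Thanksgiving.", embedding: [0, 1, 0] },
  { id: "c", text: "Reindeer pull the sleigh.", embedding: [0.9, 0.1, 0] },
];

const queryEmbedding = [1, 0.05, 0];
const retrieved = topK(queryEmbedding, docs, 2);
console.log(buildPrompt("Where does Santa live?", retrieved));
```

In a real agent the vectors would come from an embedding model and live in a vector database, but the flow of embed, rank, and prepend stays the same.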
Models
LLMs accessible via an API are no longer cutting edge; they have been available for several years now. Some popular models and providers are:
- GPT-4o
- Claude 3.5
- Llama Cloud
- Llama Local
- RedPill
Supporting Tools
In addition to frameworks that leverage prebuilt models, the bells-and-whistles products, aka supporting toolchains, are now surprisingly extensive as well. They cover countless bits of functionality, but my favorites are custom voice and video avatar solutions.
Voice
My favorite custom voice service is offered by ElevenLabs. The leading open source alternative is F5-TTS, which I have yet to try hands-on. The convenience of ElevenLabs’ offering is undeniable.
I would be remiss not to mention OpenAI’s Realtime API, which is currently in beta. The natural conversational ability that the API serves is spectacular. It even supports “prompting for voices”, which allows you to dictate a prompt simply by the inflections in your voice. Once they allow custom voice models, it will be the best offering on the market.
Avatar
Simli’s offering is the one I’m most familiar with; however, I’m not terribly impressed by it. I would like to get hands-on experience with the OSS project EchoMimic to allow for more customizability.
Demo
The best way to keep on top of these tools is to use them! So, using Eliza, I cloned myself, my mother-in-law, and Santa into AI agents with knowledge, memories, and uniquely styled communication patterns. They were powered by OpenAI’s GPT-4o, and I used ElevenLabs and Simli for the voices and avatars. Putting it all together, I had AI agents that my family could converse with this Thanksgiving. Check it out.
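To give a sense of what defining such a persona involves, here is a rough TypeScript sketch that bundles knowledge, memories, and style rules into a single system prompt for a chat model. The `Persona` shape and the `toSystemPrompt` helper are hypothetical and only illustrate the idea; they are not Eliza's character file schema, and the Santa details are made-up sample data.

```typescript
// Hypothetical persona shape; field names are illustrative, not Eliza's schema.
interface Persona {
  name: string;
  knowledge: string[]; // facts the agent should know
  memories: string[]; // personal anecdotes it can draw on
  style: string[]; // rules for how it communicates
}

// Flatten a persona into a system prompt, skipping any empty sections.
function toSystemPrompt(p: Persona): string {
  const section = (title: string, items: string[]): string =>
    items.length > 0
      ? `${title}:\n${items.map((item) => `- ${item}`).join("\n")}`
      : "";
  return [
    `You are ${p.name}.`,
    section("Knowledge", p.knowledge),
    section("Memories", p.memories),
    section("Style", p.style),
  ]
    .filter((s) => s.length > 0)
    .join("\n\n");
}

// Made-up sample persona.
const santa: Persona = {
  name: "Santa",
  knowledge: ["Lives at the North Pole"],
  memories: [],
  style: ["Jolly and warm", "Opens with a hearty 'Ho ho ho!'"],
};

console.log(toSystemPrompt(santa));
```

The resulting string would be sent as the system message to the model, with the voice and avatar services layered on top of the text it returns.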