Agno, a lightweight multimodal agent framework: build a private AI platform as easily as building with Lego

Written by
Caleb Hayes
Updated on: July 11, 2025
Recommendation

A new option for building a private AI middle platform: build high-performance AI agents as easily as stacking Lego bricks.

Core content:
1. Comprehensive functions and easy-to-use methods of the Agno framework
2. Seamless integration of large language models, management of agent state and memory
3. Practical cases and application scenarios of using Agno to build AI agents in specific fields

Agno is an open source framework designed to create advanced AI agents with capabilities such as memory, knowledge retention, tool integration, and complex reasoning. It enables developers to seamlessly integrate any large language model (LLM), manage the state and memory of the agent, and even coordinate multiple agents working together.
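
To give a flavor of the multi-agent side, here is a rough sketch (assuming the Team class and "coordinate" mode available in recent Agno releases; the agent roles and prompts are invented for illustration):

from agno.agent import Agent
from agno.models.openai import OpenAIChat
from agno.team import Team
from agno.tools.duckduckgo import DuckDuckGoTools

researcher = Agent(name="Researcher", model=OpenAIChat(id="gpt-4o"), tools=[DuckDuckGoTools()])
writer = Agent(name="Writer", model=OpenAIChat(id="gpt-4o"), instructions="Write a short, clear summary.")

# A coordinating team routes the task between members and merges their output
team = Team(mode="coordinate", members=[researcher, writer], model=OpenAIChat(id="gpt-4o"))
team.print_response("Research Agno and summarize what it is in three sentences.")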

The real power of Agno lies in building high-performance, domain-specific AI agents, along with the tools it provides to monitor and optimize them in production. Because the framework is so lightweight, integrating Agno into your business scenario adds almost no overhead.

Any scenario you can think of can be easily implemented using Agno, such as:

  • Automatic approval and reporting in OA (office automation) systems
  • Automatically refining and processing requirements in a project
  • Order tracking in logistics
  • General data processing, BI, and so on

AI Engineering is Software Engineering

From an official perspective, what scenarios is Agno suitable for?

If you are building an AI product:

  • 80% of the solution will be standard Python code,
  • the remaining 20% will be automated with agents.
    • Even 20% feels like an overestimate to me; after using Agno, I found I only needed to write one or two lines of code at the points where agent capability was required.

Agno is designed for this type of use case.

Building agents with Agno is simple: you write AI logic using familiar programming constructs (if, else, while, for) instead of complex abstractions such as graphs and chains.
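
To make that concrete, here is a minimal sketch of an agent call dropped into ordinary Python control flow (the ticket texts and the routing logic are invented for illustration):

from agno.agent import Agent
from agno.models.openai import OpenAIChat

# A plain agent used as a classification step inside normal Python code
classifier = Agent(
    model=OpenAIChat(id="gpt-4o"),
    description="You classify support tickets as 'bug' or 'feature request'.",
)

tickets = ["The export button crashes the app", "Please add dark mode"]
for ticket in tickets:
    # agent.run() returns a RunResponse; its .content holds the model's answer
    result = classifier.run(f"Classify this ticket: {ticket}")
    if "bug" in result.content.lower():
        print("-> route to engineering:", ticket)
    else:
        print("-> route to product:", ticket)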

Here is a simple agent that can search the web (using the DuckDuckGo search tool provided by the framework):

from agno.agent import Agent
from agno.models.openai import OpenAIChat
from agno.tools.duckduckgo import DuckDuckGoTools

agent = Agent(
    model=OpenAIChat(id="gpt-4o"),
    tools=[DuckDuckGoTools()],
    markdown=True,
)
agent.print_response("What's the latest news in Beijing?", stream=True)

For a more complex example, you can set the agent's description, instructions, and expected_output:

from datetime import datetime
from textwrap import dedent

from agno.agent import Agent
from agno.models.openai import OpenAIChat
from agno.tools.exa import ExaTools

today = datetime.now().strftime("%Y-%m-%d")

agent = Agent(
    model=OpenAIChat(id="gpt-4o"),
    tools=[ExaTools(start_published_date=today, type="keyword")],
    description=dedent(""" """),      # persona: who the agent is
    instructions=dedent(""" """),     # how the agent should work
    expected_output=dedent(""" """),  # the output format you expect
    markdown=True,
    show_tool_calls=True,
    add_datetime_to_instructions=True,
)
  • description is prepended to the system prompt.

Structured output

Another particularly useful feature is structured output, which is much more convenient than hand-writing the output format in the prompt or in expected_output.

from typing import List

from pydantic import BaseModel, Field

from agno.agent import Agent
from agno.models.openai import OpenAIChat


class MovieScript(BaseModel):
    setting: str = Field(..., description="Provide a wonderful scene setting for a blockbuster movie.")
    ending: str = Field(..., description="The ending of the movie. If not specified, provide a comedy ending.")
    genre: str = Field(
        ..., description="The genre of the movie. If there is nothing special, you can choose action, thriller, or romantic comedy."
    )
    name: str = Field(..., description="Give this movie a name.")
    characters: List[str] = Field(..., description="Names of characters in the movie.")
    storyline: str = Field(..., description="Describe the storyline of the movie in 3 sentences. Make it exciting!")


json_mode_agent = Agent(
    model=OpenAIChat(id="gpt-4o"),
    system_message="You are a movie script writer. All answers must be in Chinese.",
    description="You are a movie script writer. Please answer all questions in Chinese.",
    response_model=MovieScript,
    references_format="json",
    debug_mode=True,
)

Multimodal Agents

Agno agents support text, image, audio, and video input, and can generate text, image, audio, and video output.

For example, generating an image:

from agno.agent import Agent
from agno.models.openai import OpenAIChat
from agno.tools.dalle import DalleTools

image_agent = Agent(
    model=OpenAIChat(id="gpt-4o"),
    tools=[DalleTools()],
    description="You are an AI agent that can generate images using DALL-E.",
    instructions="When the user asks you to create an image, use the `create_image` tool to create the image.",
    markdown=True,
    show_tool_calls=True,
)

image_agent.print_response("Generate an image of a white siamese cat")

# get_images() returns the images produced during the run
images = image_agent.get_images()
if images and isinstance(images, list):
    for image_response in images:
        image_url = image_response.url
        print(image_url)
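
Image input works in a similar way. A minimal sketch (assuming the agno.media.Image helper and a publicly reachable example URL):

from agno.agent import Agent
from agno.media import Image
from agno.models.openai import OpenAIChat

vision_agent = Agent(model=OpenAIChat(id="gpt-4o"), markdown=True)
vision_agent.print_response(
    "Describe what is in this picture.",
    # images= accepts agno.media.Image objects built from a URL or a local file
    images=[Image(url="https://example.com/cat.jpg")],
)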

Tools

Tools are the soul of an agent, and customizing a tool in Agno is very easy: just define a function and write a docstring:

import json

import httpx


def get_top_hackernews_stories(num_stories: int = 10) -> str:
    """Use this function to get the top stories on Hacker News.

    Args:
        num_stories (int): Number of stories to return. Defaults to 10.

    Returns:
        str: JSON string of the top stories.
    """
    # Fetch top story IDs
    response = httpx.get("https://hacker-news.firebaseio.com/v0/topstories.json")
    story_ids = response.json()

    # Fetch details for each story and return them as a JSON string
    stories = []
    for story_id in story_ids[:num_stories]:
        story = httpx.get(f"https://hacker-news.firebaseio.com/v0/item/{story_id}.json").json()
        stories.append(story)
    return json.dumps(stories)

agent = Agent(
    model=OpenAIChat(id="gpt-4o"),
    tools=[get_top_hackernews_stories],
    debug_mode=True,
    show_tool_calls=True,
    markdown=True,
    description="You are a helper that helps users obtain and summarize the hot stories on Hacker News.",
    instructions=["Please answer all questions in Chinese", "Provide a concise and clear summary"],
)

For more information, see:

  • Documentation: https://docs.agno.com/introduction [1]
  • Examples: https://docs.agno.com/examples/introduction [2]

Agno also provides the Playground application [3], which allows you to integrate monitoring into your applications.
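
As a rough sketch of how that looks (assuming the agno.playground module and the serve_playground_app helper from recent Agno releases, and a file named playground.py), you register your agents and serve a local app:

from agno.agent import Agent
from agno.models.openai import OpenAIChat
from agno.playground import Playground, serve_playground_app

web_agent = Agent(name="Web Agent", model=OpenAIChat(id="gpt-4o"), markdown=True)

# Bundle the agents you want to inspect into a Playground app
app = Playground(agents=[web_agent]).get_app()

if __name__ == "__main__":
    # "playground:app" assumes this file is named playground.py
    serve_playground_app("playground:app", reload=True)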