A simple general-purpose LLM Agent in less than 100 lines of code

Written by Jasper Cole
Updated on: June 27, 2025
Recommendation

Use less than 100 lines of Python code to create your own intelligent LLM Agent and explore the infinite possibilities of AI.

Core content:
1. Innovative ideas and practical applications of LLM Agent
2. Analysis of core principles and function calling mechanism
3. Complete code reproduction from design to implementation

— Yang Fangxian, Founder of 53A, Tencent Cloud Most Valuable Expert (TVP)

1. Introduction: New ideas of LLM Agent

In the field of artificial intelligence, large language models (LLMs) such as the GPT series have demonstrated excellent performance across a wide range of applications thanks to their powerful text generation capabilities. However, traditional LLMs are mainly good at generating text: they lack the ability to interact with the external environment and cannot perform concrete tasks such as calculation, web crawling, or file parsing. To overcome this limitation, the concept of the LLM Agent came into being: by combining an LLM with external tools, the model can not only hold conversations but also accomplish more complex tasks through function calls.

This article will share how to use less than 100 lines of Python code to implement a simple LLM Agent with general intelligence potential. You will see the entire implementation process - from core principles, prompt tuning, tool interface design to main loop interaction, and get a detailed explanation of the complete reproducible code. I hope this article can inspire you to develop more agents that meet actual needs.

When building an LLM Agent, prompt engineering is a key step. Through carefully designed prompts, the model can be guided to generate more accurate and useful outputs.

Through the practice of this article, you will learn how to build a feature-rich LLM Agent within a limited amount of code, providing new ideas and methods for the development of artificial intelligence applications.

2. Detailed explanation of core principles

Before we start implementing it, we need to understand several key concepts and the overall operation logic of the Agent. Mastering these core mechanisms will help you deeply understand the design philosophy behind the LLM Agent and build your own intelligent assistant more flexibly in actual development.

2.1 Capabilities and Limitations of Large Language Models (LLMs)

LLMs such as GPT-4 and Qwen are trained on massive amounts of text data and generate natural, fluent content by predicting the probability of the next word, demonstrating impressive language understanding and generation capabilities. However, an LLM is essentially a language predictor, and this mechanism comes with the following inherent limitations:

Insufficient calculation and logical reasoning

Although the model can recognize and generate mathematical expressions, it often makes mistakes and lacks accurate reasoning capabilities when it comes to more complex operations (such as large number multiplication, date calculation, and precise numerical calculation).

Knowledge is of limited timeliness

An LLM's knowledge is frozen at its training data cutoff and is not automatically updated afterwards. For example, the original GPT-4's knowledge cutoff was September 2021, so on its own it cannot answer real-time questions or retrieve the latest information.

No direct access to external resources

An LLM by itself cannot directly execute code, query databases, crawl web pages, or process files. Pure text prediction alone cannot complete tasks that require cooperation from the external environment.

To make up for these limitations, we can introduce a "Function Call" mechanism, which allows the model to enhance its own capabilities by calling external tools, indirectly giving the model similar "hands-on" execution capabilities.

2.2 Function Call: The key mechanism that enables LLM

Function Call is one of the most important means to improve the practicality of LLM. This mechanism allows the model to issue clear instructions in the form of structured data in addition to generating text, calling predefined external functions or tools.

The operation mechanism of Function Call can be broken down into the following steps:

Requirements Identification

When the LLM realizes that the current task is beyond its capabilities during interaction, it will actively trigger an external function call request. For example, when faced with a difficult math problem or a web page query task, the model will determine on its own that external assistance is needed.

Call parameter generation

The model specifies the name of the function to be called and the specific parameters required in a structured data format (usually JSON). For example, the model may output something like:

{ "function_name": "calculate_date_diff", "parameters": { "start_date": "2024-02-12", "end_date": "2025-03-15" }}

External Execution

After receiving the function call request provided by the model, the system executes predefined external tools or functions (such as executing Python code, calling a network interface, accessing a database, etc.) and obtains the actual execution results.

Results Feedback

The execution results are then passed back to the model, which will generate the natural language answer expected by the user based on the context. This two-way interaction expands the capabilities of LLM from plain text output to actual task execution.

The Function Call mechanism not only breaks through the limitations of the model itself, but also enables the model to have a "self-decision-making" ability, realizing a truly "hands-on" and "self-correcting" mechanism, greatly improving its practicality and application value.
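To make these four steps concrete, the following is a minimal, self-contained sketch of one such round trip using an OpenAI-compatible Python client. The calculate_date_diff tool, its schema, and the prompt are illustrative placeholders rather than part of this article's final agent code; the model is assumed to support function calling.

import json
from datetime import date
from openai import OpenAI

client = OpenAI()  # assumes an API key (and, if needed, base_url) is configured in the environment

def calculate_date_diff(start_date: str, end_date: str) -> int:
    # Illustrative tool: number of days between two ISO dates
    return (date.fromisoformat(end_date) - date.fromisoformat(start_date)).days

tools = [{
    "type": "function",
    "function": {
        "name": "calculate_date_diff",
        "description": "Compute the number of days between two ISO dates.",
        "parameters": {
            "type": "object",
            "properties": {
                "start_date": {"type": "string"},
                "end_date": {"type": "string"},
            },
            "required": ["start_date", "end_date"],
        },
    },
}]

messages = [{"role": "user", "content": "How many days are there between 2024-02-12 and 2025-03-15?"}]

# Steps 1-2: the model recognizes the need and emits a structured call
completion = client.chat.completions.create(model="qwen-plus", messages=messages, tools=tools)
tool_call = completion.choices[0].message.tool_calls[0]
args = json.loads(tool_call.function.arguments)

# Step 3: the external environment executes the function
result = calculate_date_diff(**args)

# Step 4: feed the result back so the model can phrase the final answer
messages.append(completion.choices[0].message)
messages.append({"role": "tool", "tool_call_id": tool_call.id, "content": str(result)})
final = client.chat.completions.create(model="qwen-plus", messages=messages, tools=tools)
print(final.choices[0].message.content)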

2.3 The complete operation mechanism of Agent

After understanding LLM and function calling mechanism, we can better understand the workflow of a complete LLM Agent. This process generally includes five key links:

(1) Task perception

The agent first receives the user's input request and works out the user's true intent. Good prompt engineering at this stage improves the model's recognition accuracy.

(2) Task decomposition and planning

For complex tasks, the Agent will automatically decompose the task into several small and easy-to-perform steps. For example, the task of "querying the latest weather" can be divided into several subtasks: "get the current location → query the weather API → organize the results → reply to the user".

(3) Decision making and tool invocation

Based on the task planning results, the agent generates a specific function call request, decides when to call which external tool, and the specific call parameters. This step tests the prompt design, model reasoning ability, and interface clarity of tool design.

(4) Tool execution and result feedback

The external environment executes the tool specified by the model (such as the Python code execution function exec_python_code) and returns the execution result to the model. For example, the model may request code execution for a mathematical calculation or for web data crawling, and the output is passed back to the model as-is once execution finishes.

(5) Cycle and complete tasks

The agent continuously adjusts the plan and makes decisions based on the returned execution results, and enters a new round of tool call loop until the task assigned by the user is fully completed and the final answer is given.

Through the cyclic interaction of the above five steps, the LLM Agent can solve problems far more complex than a single model prediction could handle, and begins to show the rudiments of general intelligence.

(It is recommended that readers draw this interaction flow chart by themselves during the actual development process to facilitate understanding of the overall architecture of the Agent.)
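In the same spirit, the whole cycle can be summarized in a few lines of pseudocode. All helper names below (decide, is_final_answer, tool_name and so on) are purely illustrative; the real implementation follows in the next chapter.

# Illustrative pseudocode of the agent loop; all helper names are hypothetical.
def run_agent(user_request, llm, tools):
    context = [user_request]                       # (1) task perception
    while True:
        decision = llm.decide(context, tools)      # (2) plan the task and (3) choose a tool (or answer)
        if decision.is_final_answer:
            return decision.answer                 # (5) task completed
        result = tools[decision.tool_name](**decision.arguments)  # (4) tool execution
        context.append(result)                     # feed the result back and loop again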

After understanding these core mechanisms, we will gradually enter the practical stage in the next chapter, discussing in detail Prompt design, tool interface construction, and how to develop an efficient and general LLM Agent with less than 100 lines of code.

3. Step-by-step deconstruction: Implementing a simple general intelligent agent

Next, we explain each step of the implementation code in detail so that readers can fully reproduce this Agent from scratch.

3.1 Environment preparation and library import

First, we need to install the necessary dependencies (taking the openai library as an example) to ensure that the Python environment can call the API correctly.

Installation dependencies:

pip install openai

Import necessary libraries and initialize the model client:

import os
import json
from openai import OpenAI

# Set the model name. Here we use qwen-plus; replace it as needed.
MODEL_NAME = "qwen-plus"
client = OpenAI(
    api_key=os.environ.get("OPENAI_API_KEY"),
    base_url="https://dashscope.aliyuncs.com/compatible-mode/v1",
)

The above code is responsible for establishing a connection with the API and preparing for subsequent conversation interactions.

Here I use the Alibaba Cloud Bailian platform with qwen-plus as the model; readers can replace it with their own platform and model, as long as the chosen model supports function calling.
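Before wiring up the agent itself, it can help to verify the connection with a one-off call. This optional sanity check assumes the API key and endpoint above are configured correctly:

# Optional sanity check: a plain chat completion with no tools attached
resp = client.chat.completions.create(
    model=MODEL_NAME,
    messages=[{"role": "user", "content": "Reply with the single word: ready"}],
)
print(resp.choices[0].message.content)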

3.2 Define a general utility function: exec_python_code

In order to achieve general intelligence, we must enable the model to "execute" any Python code. The exec_python_code function here is the core, which allows the model to generate code and return the result after the environment executes it.

def exec_python_code(code: str):
    print(code)  # Print the code for easier debugging
    try:
        exec_env = {}
        exec(code, exec_env, exec_env)
        return {"result": exec_env, "error": None}
    except Exception as e:
        print("Execution failed, reason: %s" % str(e))
        return {"result": None, "error": str(e)}
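As a quick illustration of what the returned dictionary looks like (note that exec also injects a __builtins__ entry into exec_env, so you may want to filter it out before handing the result back to the model):

# Example call; the snippet and variable names are purely illustrative
out = exec_python_code("x = 6 * 7\nmsg = f'answer is {x}'")
print(out["error"])          # None
print(out["result"]["x"])    # 42
print(out["result"]["msg"])  # answer is 42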

Security Tip: Using exec is risky. In a production environment, be sure to consider using sandboxing to strictly isolate the executed code.
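As one hedged illustration of this, and not a full sandbox, the snippet can be run in a separate Python process with a timeout. This protects the agent process from hangs and crashes, but it is not a security boundary; real isolation still calls for containers, restricted users, or a dedicated sandboxing service.

import subprocess
import sys

def exec_python_code_in_subprocess(code: str, timeout: int = 10):
    # Run the snippet in a child interpreter so that crashes or infinite loops
    # cannot take down the agent process. Unlike the in-process version above,
    # this variant returns captured stdout rather than the variable environment.
    try:
        proc = subprocess.run(
            [sys.executable, "-c", code],
            capture_output=True, text=True, timeout=timeout,
        )
        return {"result": proc.stdout, "error": proc.stderr or None}
    except subprocess.TimeoutExpired:
        return {"result": None, "error": f"Execution timed out after {timeout} seconds"}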

3.3 Tool interface definition: telling the model which tools can be called

In order to let LLM know that it can call exec_python_code, we need to define the tool interface (Tool List) and specify the parameter requirements.

tools = [
    {
        "type": "function",
        "function": {
            "name": "exec_python_code",
            "description": "Execute arbitrary Python code and return execution results or error information.",
            "parameters": {
                "type": "object",
                "properties": {
                    "code": {
                        "type": "string",
                        "description": "The Python code snippet to be executed."
                    }
                },
                "required": ["code"]
            }
        },
        "strict": True
    }
]

This tool definition not only includes the function name and description, but also specifies the JSON Schema of the parameters. In this way, when the model decides to call the function, a structured call request will be generated according to this interface.

3.4 Well-designed Prompt: Guiding the Model to Make Autonomous Decisions

In practice, I spent a lot of time tuning Prompt to ensure that the model can accurately break down tasks, make step-by-step plans, and call tools as required. Here is an example of Prompt that has been repeatedly optimized:

messages = [{
    "role": "system",
    "content": (
        "You are an efficient, flexible and persistent intelligent assistant who can independently analyze and break down the tasks proposed by the user. "
        "Always provide effective solutions by executing Python code.\n\n"
        "The tools you have and must call are as follows:\n"
        "1. exec_python_code: execute Python code and return execution results or error information.\n\n"
        "When executing tasks, you must strictly follow these principles:\n"
        "A. Before each tool call, give a clear step-by-step execution plan (step title and purpose).\n"
        "B. Python code is powerful and can access system information, network resources, and the file system. "
        "Prefer the standard library to complete tasks and try to avoid relying on third-party libraries.\n"
        "C. If you need to access a specific web page or piece of information, you must first execute Python code that calls a search engine (such as Baidu or Google) to search for and determine the real URL of the target page. "
        "Guessing URLs yourself is forbidden.\n"
        "D. If the environment lacks a specific library or a method fails, you must actively try other alternatives instead of abandoning the task.\n"
        "Special attention:\n"
        "- You must ensure that the returned results are highly relevant to the user's task, and never use unverified URLs or irrelevant information.\n"
        "- You must proactively and thoroughly solve problems, and never give vague statements or claim that the task cannot be completed.\n"
    )
}]

This detailed and rigorous instruction guides the model to maintain high relevance and accuracy during task breakdown and execution.

3.5 Main loop: model interaction and tool calling

Finally, we implement a main loop that continuously interacts with the model to complete the closed-loop process from user input to task execution and then to feedback.

while True:
    # Get user input
    user_question = input("$ ")
    messages.append({"role": "user", "content": user_question})

    # Call the model. The model may generate a function call request
    completion = client.chat.completions.create(
        model=MODEL_NAME, messages=messages, tools=tools
    )

    while True:
        # Add the message generated by the model to the conversation history
        messages.append(completion.choices[0].message)
        if completion.choices[0].message.content:
            print(completion.choices[0].message.content)
        if completion.choices[0].finish_reason == "stop":
            break
        # If the model requests a tool call
        for tool_call in completion.choices[0].message.tool_calls:
            name = tool_call.function.name
            args = json.loads(tool_call.function.arguments)
            if name == "exec_python_code":
                result = exec_python_code(args["code"])
            messages.append({
                "role": "tool",
                "tool_call_id": tool_call.id,
                "content": str(result)
            })
        # Call the model again so it can continue based on the results returned by the tool
        completion = client.chat.completions.create(
            model=MODEL_NAME, messages=messages, tools=tools
        )

In this code, we continuously update the messages list so that the model sees the complete context on every call. The model decides whether a tool call is needed; if so, the program executes the corresponding exec_python_code and feeds the return value back to the model, looping until the task is completed.

4. Complete running example display

To verify the general intelligent capability of Agent, let's take a practical example - getting the capacity of all hard disks on my computer.

Sample request:

$ Get all the hard disk capacity of my computer

### Step 1: Import necessary modules
Purpose: To obtain hard disk capacity information, we need to use the `psutil` library, which provides many system information access interfaces.

### Step 2: Get hard disk partition and capacity information
Purpose: Get information about all hard disk partitions through the `psutil.disk_partitions()` function, and get the usage of each partition through the `psutil.disk_usage()` function.

Now I will execute the Python code to get this information.

import psutil

# Get information about all disk partitions
disk_partitions = psutil.disk_partitions()
all_disks_info = []

for partition in disk_partitions:
    try:
        # Get the usage of a single partition
        usage = psutil.disk_usage(partition.mountpoint)
        disk_info = {
            'device': partition.device,
            'mountpoint': partition.mountpoint,
            'fstype': partition.fstype,
            'total': usage.total,
            'used': usage.used,
            'free': usage.free,
            'percent': usage.percent
        }
        all_disks_info.append(disk_info)
    except PermissionError:
        # In some cases, you may encounter permission issues, which are simply ignored here
        continue

all_disks_info

Failed with No module named 'psutil'

It seems that the `psutil` library is not installed in the current environment, which prevents us from directly obtaining the hard disk capacity information. However, I can try to use other standard library methods to obtain relevant information, although it may not be as direct as `psutil`.

### Step 3: Try to use the standard library to obtain system disk information
Purpose: To obtain disk space information indirectly through standard library functions such as `os` and `shutil`.

Next I will try to execute the Python code to achieve this.

import os
import sys
import shutil

def get_drives():
    if sys.platform == 'win32':
        import string
        from ctypes import windll

        drives = []
        bitmask = windll.kernel32.GetLogicalDrives()
        for letter in string.ascii_uppercase:
            if bitmask & 1:
                drives.append(letter + ':\\')
            bitmask >>= 1
        return drives
    else:
        # For non-Windows systems, we simplify the process and only list the root directory
        return ['/']

all_disks_info = []
for drive in get_drives():
    try:
        total, used, free = shutil.disk_usage(drive)
        disk_info = {
            'drive': drive,
            'total': total,
            'used': used,
            'free': free
        }
        all_disks_info.append(disk_info)
    except Exception as e:
        print(f"Error accessing {drive}: {e}")

all_disks_info

I have successfully obtained the hard drive capacity information for this machine. Here is the total capacity, used space, and available space for each drive:

1. Drive C:\
   - Total capacity: 1023 GB
   - Used space: 918 GB
   - Available space: 105 GB

2. Drive D:\
   - Total capacity: 3791 GB
   - Used space: 3470 GB
   - Available space: 320 GB

3. Drive E:\
   - Total capacity: 8001 GB
   - Used space: 1551 GB
   - Available space: 6449 GB

Please note that these values are reported in bytes, and I converted them to GB for easier reading. If you have further needs or questions, please feel free to let me know!

This example shows how the agent can autonomously plan, generate code, perform operations, automatically correct errors, and provide feedback. It demonstrates the potential of general intelligence. I believe you can immediately feel the difference between it and directly using a large model.

5. 200 lines of code to implement a more general Agent

Based on the simple version, I also implemented a complex version of Agent, adding the following tools:

  • google_search : use search engines to obtain web links;

  • fetch_webpage_content : fetch and parse the webpage content of the specified URL;

  • parse_pdf : Parse text from a local PDF file.

These extensions allow Agent to handle more complex tasks, such as:

  • Automatically search for the latest information from the Internet and summarize the content;

  • Capture and analyze long articles or reports;

  • Perform full-text parsing and summary generation on PDF documents.

The complex version follows the same implementation idea as the simple version: it only adds more entries to the tool interface and more branches to the execution logic, while maintaining a high degree of generality and flexibility. Readers can gradually extend and debug more functions starting from the simple version's code.
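The complex version's code is not reproduced in this article, but as an illustration only, one of the added tools, fetch_webpage_content, might be implemented and registered roughly as follows. The parameter names and the crude tag stripping are assumptions, not the actual implementation; google_search and parse_pdf would follow the same pattern (one function, one entry in the tools list, one extra branch in the main loop).

import re
import urllib.request

def fetch_webpage_content(url: str, max_chars: int = 5000):
    # Fetch a page and return roughly readable text (very naive HTML stripping)
    try:
        req = urllib.request.Request(url, headers={"User-Agent": "Mozilla/5.0"})
        html = urllib.request.urlopen(req, timeout=15).read().decode("utf-8", errors="ignore")
        text = re.sub(r"<script.*?</script>|<style.*?</style>", " ", html, flags=re.S | re.I)
        text = re.sub(r"<[^>]+>", " ", text)      # drop any remaining tags
        text = re.sub(r"\s+", " ", text).strip()
        return {"result": text[:max_chars], "error": None}
    except Exception as e:
        return {"result": None, "error": str(e)}

tools.append({
    "type": "function",
    "function": {
        "name": "fetch_webpage_content",
        "description": "Fetch the web page at the given URL and return its text content.",
        "parameters": {
            "type": "object",
            "properties": {"url": {"type": "string", "description": "The URL to fetch."}},
            "required": ["url"],
        },
    },
})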

Sample request:

$ Download the DeepSeek R1 paper released by DeepSeek to E:\output and briefly evaluate it.

To complete this task, I will follow these steps:

### Step 1: Search for download links for the DeepSeek R1 paper
- **Purpose**: Find the official or trusted source download link for the DeepSeek R1 paper.
- **Tool**: google_search
- **Keyword**: "DeepSeek R1 paper download"

### Step 2: Download the paper
- **Purpose**: Use the found link to download the DeepSeek R1 paper to the specified directory E:\output.
- **Tool**: This step may require Python code to download the file. If the search result points directly to a PDF file, I can try to fetch the file directly; otherwise, I will write a script to perform the download.

### Step 3: Read and evaluate the paper
- **Purpose**: Give a brief overview and preliminary evaluation of the downloaded paper's content.
- **Tool**: parse_pdf is used to parse the downloaded PDF document, and then I give my opinion based on the parsed content.

Now let's start with the first step, searching for download links for the DeepSeek R1 paper.

From the search results, we find several possible links pointing to the DeepSeek R1 paper. The most directly relevant are the first link, a [PDF file on GitHub](https://github.com/deepseek-ai/DeepSeek-R1/blob/main/DeepSeek_R1.pdf), and the second link, an [arXiv page](https://arxiv.org/abs/2501.12948), which also provides a PDF version.

### Step 2: Download the paper
I will try to download the PDF document directly to the specified location E:\output using the provided GitHub link. If this method is not feasible, we will turn to the other provided links as an alternative. Now, I will write a piece of Python code to accomplish this task.

import requests, os

url = 'https://github.com/deepseek-ai/DeepSeek-R1/raw/main/DeepSeek_R1.pdf'
output_path = 'E:/output/'
file_name = url.split('/')[-1]

if not os.path.exists(output_path):
    os.makedirs(output_path)

response = requests.get(url)
with open(os.path.join(output_path, file_name), 'wb') as f:
    f.write(response.content)
print(f'File saved to {os.path.join(output_path, file_name)}')

File saved to E:/output/DeepSeek_R1.pdf

The paper has been successfully downloaded to `E:\output\DeepSeek_R1.pdf`.

### Step 3: Read and evaluate the paper
Next, I will use the `parse_pdf` tool to parse the downloaded PDF document and give a brief overview and preliminary evaluation based on its content. Let's start parsing the document.

### Paper Overview
This paper introduces the first generation of reasoning models developed by the DeepSeek team, namely DeepSeek-R1-Zero and DeepSeek-R1. DeepSeek-R1-Zero is a model trained directly by large-scale reinforcement learning (RL) without supervised fine-tuning (SFT). It demonstrates excellent reasoning ability and naturally develops many powerful and interesting reasoning behaviors. However, the model also suffers from problems such as poor readability and language mixing. To address these problems and further improve reasoning performance, the researchers introduced DeepSeek-R1, which incorporates multi-stage training and cold-start data before RL. DeepSeek-R1 performs comparably to OpenAI-o1-1217 on reasoning tasks. In addition, to support the research community, they open-sourced DeepSeek-R1-Zero, DeepSeek-R1, and six dense models based on Qwen and Llama (1.5B, 7B, 8B, 14B, 32B, 70B).

### Brief Evaluation

#### Advantages
- **Innovation**: This work demonstrates that pure reinforcement learning methods can effectively motivate language models to develop strong reasoning capabilities without relying on supervised fine-tuning data.
- **Open Source Contribution**: The open source release includes not only the final version of the model, but also the intermediate version and a series of base models of different sizes, which greatly promotes research and development in related fields.
- **Performance Improvement**: Through a carefully designed training process, DeepSeek-R1 not only achieves excellent results in logical reasoning tasks such as mathematics and programming, but also performs well in other areas such as writing and question answering.

#### Room for Improvement
- **Language Consistency**: Although a reward mechanism for language consistency is introduced to alleviate the problem of language mixing, there is still room for improvement, especially in how to maintain language consistency when handling problems that are in neither English nor Chinese.
- **General capabilities**: Compared with the previous version DeepSeek-V3, DeepSeek-R1 is slightly weaker in some functions such as function calls and multi-turn dialogue. In the future, long chain-of-thought (CoT) could be used to enhance performance in these areas.

Overall, DeepSeek R1 represents an important advancement in the current large model field, especially in exploring how to use reinforcement learning to enhance the reasoning ability of language models. This work is not only of great significance to the academic community, but also provides valuable reference value for practical applications in the industry.

This example demonstrates the capabilities and potential of a more powerful and general Agent.

6. Summary and Future Outlook

In this article, we systematically introduce how to build a simple LLM Agent with general intelligence potential using less than 100 lines of Python code. From theoretical explanation to code practice, we emphasize the following three key points:

A general code execution mechanism

By using core functions such as exec_python_code, the model is given the ability to call and execute any Python code. This design breaks through the limitation of traditional LLM that can only generate text, giving the Agent real "hands-on" capabilities and achieving a wider range of task processing.

A carefully tuned prompt strategy

By designing precise and detailed prompts, we can guide the model to clarify the task objectives, reasonably decompose complex tasks, and accurately generate tool call instructions and parameters. Good prompt engineering not only improves the accuracy of model generation decisions, but also greatly reduces the risk of "hallucination" and invalid execution of the model.

Complete interactive loop mechanism

The agent and external tools exchange information through continuous dialogue and feedback loops, constantly updating and iterating the task context. Relying on this closed-loop interaction, the agent can steadily advance the task process and effectively handle complex multi-step tasks.

The integration of the above mechanisms not only reflects the current technical route for building efficient LLM Agents, but also lays a solid foundation for building Agents with more general intelligence in the future.

Future exploration directions:

Strengthening security and isolation mechanisms

The currently implemented Agent can execute arbitrary code, which is flexible but carries security risks. Building a safer code sandbox and identifying and preventing malicious code execution will be one of the core issues that future Agent development must overcome. This requires strict permission control, resource limits, and exception handling to prevent the Agent's capabilities from being abused or misused.

More tool extensions and deep integration

The ability of Agent will depend on the richness of the tools that can be called. In the future, it is possible to integrate rich tool modules such as database query, API service interaction, machine vision, speech recognition, and automated testing to further expand the potential for Agent to be applied in various industries and gradually realize true general intelligence.

Continue to optimize prompt and reasoning capabilities

Prompt engineering is one of the keys to the stable and efficient operation of Agent. With the continuous expansion of actual application scenarios, we can continue to optimize the design of Prompt based on task feedback data, so that the model can understand user intentions more accurately and perform decision-making and task decomposition more efficiently. In addition, we can also consider combining few-shot learning and chain-of-thought strategies to further improve the stability and reliability of model reasoning.

Last words

The birth of LLM Agent provides us with a technical route with great imagination. It combines the powerful language understanding ability of traditional models with the practical hands-on ability of external tools to create an infinite application space. I hope this article can inspire you to further explore and innovate, use limited code to create a more powerful, efficient and safe intelligent agent, and promote the real implementation of artificial intelligence in more scenarios and benefit more people.

PS: Most of this article is generated by a large model