A 10,000-word deep dive: the past and present of MCP, with a RAGFlow integration example

Written by Iris Vance · Updated on July 9, 2025

Recommendation

An in-depth analysis of the MCP protocol and its applications, exploring the transformation of large-model API calls.

Core content:
1. The development history and role of the MCP protocol
2. Practical examples of combining RAG with MCP
3. Analysis of large-model API calls and multi-turn dialogue techniques

— Yang Fangxian, founder of 53AI and Tencent Cloud Most Valuable Expert (TVP)

In the last article, I previewed my research on RAG + MCP (Model Context Protocol). After writing on and off for four days, I finally finished this piece. It attempts to make two things clear:

1. The shift from complex prompts coaxing the model into tool calls to MCP as a unified protocol standard;

2. A trial demonstration that, building on traditional RAG, combines several MCP functional extensions for machining scenarios.

Below, enjoy:

1. Let's talk about large-model API calls first

Let's first briefly review the simplest large-model chat application development: making requests directly against the target LLM's official API documentation. For example, to call a DeepSeek model for question answering from Python, the official documentation gives the following example:

from openai import OpenAI

client = OpenAI(api_key="<DeepSeek API Key>", base_url="https://api.deepseek.com")

response = client.chat.completions.create(
    model="deepseek-chat",
    messages=[
        {"role": "system", "content": "You are a helpful assistant"},
        {"role": "user", "content": "Hello"},
    ],
    stream=False
)

print(response.choices[0].message.content)

Because most model vendors are compatible with the OpenAI specification, you can keep using the OpenAI SDK and simply swap the base_url above for another vendor's endpoint. For example, to call the Qwen series of models, replace base_url with https://dashscope.aliyuncs.com/compatible-mode/v1. Furthermore, to implement multi-turn dialogue and give the large model "memory" across questions, taking Alibaba's QwQ model as an example, you append {'role': 'assistant', 'content': <the concatenated streaming output>} to the context. (Note: there is no need to add the reasoning_content field.)
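To make that concrete, here is a minimal multi-turn sketch against the DashScope-compatible endpoint; the model name qwq-32b and the second user turn are illustrative assumptions, not from the original text.

from openai import OpenAI

client = OpenAI(
    api_key="<DashScope API Key>",
    base_url="https://dashscope.aliyuncs.com/compatible-mode/v1",
)

messages = [{"role": "user", "content": "Hello"}]

# QwQ is served in streaming mode, so concatenate the streamed chunks
completion = client.chat.completions.create(
    model="qwq-32b",  # illustrative model name
    messages=messages,
    stream=True,
)

answer = ""
for chunk in completion:
    if not chunk.choices:
        continue
    delta = chunk.choices[0].delta
    if delta.content:  # append only content; reasoning_content is NOT kept
        answer += delta.content
print(answer)

# Give the model "memory": feed the concatenated reply back as context
messages.append({"role": "assistant", "content": answer})
messages.append({"role": "user", "content": "Summarize what you just said."})
# ...then call client.chat.completions.create(...) again with the longer messages list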

2. The origins of complex prompt engineering

The chat application above draws only on the LLM's built-in knowledge, and in some scenarios its answers fall short of expectations. After all, the knowledge an LLM has acquired is frozen at the moment of training, so it obviously cannot answer questions that are time-sensitive, that postdate its training cutoff, or that involve private information.

In short, introducing external tools lets an LLM interact with the outside world. OpenAI officially launched Function Calling in June 2023, initially on the GPT-3.5 and GPT-4 models. But before formally introducing Function Calling, let's look at the earlier attempt to guide models to call tools through complex prompts.

Taking the two tools get_current_time and get_current_weather as examples, below is a comparison of the relevant implementation approaches.

# System prompt design
SYSTEM_PROMPT = """You are an intelligent assistant with the following tools:
1. Time tool (TIME)
   - Format: TIME: Get the current time
   - Function: Returns the current full date and time
2. Weather tool (WEATHER)
   - Format: WEATHER: city name
   - Function: Gets the current weather conditions of the specified city
3. Calculation tool (CALCULATE)
   - Format: CALCULATE: mathematical expression
   - Function: Performs mathematical calculations
Important rules:
- You must strictly follow the formats above
- Use tools only when you really need them
- A tool call should be the only thing you output
- After using the tool, also provide a natural-language explanation of the result
Examples:
User: What time is it in Beijing today?
Assistant: TIME: Get the current time
User: I want to know the weather in Beijing
Assistant: WEATHER: Beijing
User: Help me calculate 15 times 23
Assistant: CALCULATE: 15 * 23"""

# Mock tool implementations
def mock_tool_execution(tool_call):
    import datetime

    if tool_call.startswith("TIME:"):
        return datetime.datetime.now().strftime("%Y-%m-%d %H:%M:%S")
    elif tool_call.startswith("WEATHER:"):
        city = tool_call.split("WEATHER:")[1].strip()
        # Simulated weather API
        return f"{city}: sunny, temperature 22°C, humidity 45%"
    elif tool_call.startswith("CALCULATE:"):
        expression = tool_call.split("CALCULATE:")[1].strip()
        try:
            return str(eval(expression))
        except Exception as e:
            return f"Calculation error: {str(e)}"
    return "Unrecognized tool call"

# Simulated dialogue interaction
def chat_interaction():
    print("Smart Assistant has started. Enter 'exit' to end the conversation.")
    while True:
        user_input = input("User: ")
        if user_input == 'exit':
            break

        # Simulated model processing
        # In a real scenario, a large language model would do this step
        tool_match = None

        # Simple rule matching
        if "time" in user_input:
            tool_match = "TIME: Get the current time"
        elif "weather" in user_input:
            city_match = user_input.replace("weather", "").strip()
            tool_match = f"WEATHER: {city_match}" if city_match else None
        elif "calculate" in user_input or "*" in user_input or "+" in user_input:
            # Try to extract the calculation expression
            import re
            match = re.search(r'(\d+\s*[\+\-\*\/]\s*\d+)', user_input)
            if match:
                tool_match = f"CALCULATE: {match.group(1)}"

        if tool_match:
            print("Assistant:", tool_match)
            result = mock_tool_execution(tool_match)
            print(f"Tool execution result: {result}")
        else:
            print("Assistant: Sorry, I didn't find the right tool to handle your request.")

# Run the interaction
if __name__ == "__main__":
    chat_interaction()

The example runs as follows:

Smart Assistant has started. Enter 'exit' to end the conversation.
User: What time is it now
Assistant: TIME: Get the current time
Tool execution result: 2024-03-18 16:45:30
User: What's the weather like in Beijing
Assistant: WEATHER: Beijing
Tool execution result: Beijing: sunny, temperature 22°C, humidity 45%
User: Calculate 15 times 23
Assistant: CALCULATE: 15 * 23
Tool execution result: 345
User: exit

In summary, this complex-prompt approach has the following four characteristics:

  • Specify the output format in the system prompt

  • Use specific delimiters and markers (such as XML tags)

  • Extract "function calls" from model output using regular expressions

  • Include tool usage examples in conversation history

The limitation of this approach is that the format is unreliable: large models may not follow instructions consistently, since the Transformer's underlying logic is still next-token prediction. It is even less reliable for complex parameter structures. There are other problems too, such as difficult parameter validation and heavy consumption of prompt space, which I won't enumerate one by one.

That said, the practice deserves some defense. For simple scenarios, this prompt-based method is worth considering: it has low implementation cost and does not depend on specific model capabilities, though you must always stay vigilant about the uncertainty of the model's output.

3. Function Calling shows its potential

The prompt-driven approach to guiding models into tool calls described above was standardized by OpenAI and offered as part of its API; this is what we now call Function Calling.

3.1 Innovations

Compared with using prompts to coax the model into calling tools, the main innovations are the following:

Structured output: ensures the model emits function-call parameters in a predefined JSON format, improving parsability and reliability

Clear function definitions: names, descriptions, parameter types, and so on are defined explicitly through a function schema

Reduced parsing complexity: developers no longer need to write brittle text-parsing logic

Improved accuracy: fewer "hallucinated" function calls

Simplified development: standardizes how large models interact with external tools

By the way, not all LLMs support Function Calling: supporting it usually requires special training or fine-tuning. Specifically, function-call examples must be introduced during the pre-training or fine-tuning phase so the model learns to understand function schemas, generate JSON output in the expected format, and respect parameter types and constraints.

3.2 Implementation mechanism

Concretely, what are the key mechanisms by which a large model selects and calls a tool? Here is a simple example using the Qwen model. The implementation breaks down into roughly two steps:

1. Intention Reasoning and Tool Matching

  • The big model will semantically understand the descriptions and parameters of the available tools based on user input.

  • The model will use natural language understanding and reasoning to determine whether a tool needs to be called and which specific tool to call.

This process is automatic; developers do not need to hand-code each piece of decision logic.

2. Intelligent parameter filling

  • The model is able to extract necessary parameter information from the conversation context

  • For functions that require specific parameters, the model can intelligently fill in those parameter values

import os
from openai import OpenAI

# Initialize the OpenAI client, pointed at Alibaba Cloud's DashScope service
client = OpenAI(
    # If no environment variable is configured, replace the line below with your Bailian API key: api_key="sk-xxx",
    api_key=os.getenv("DASHSCOPE_API_KEY"),  # Read the API key from the environment variable
    base_url="https://dashscope.aliyuncs.com/compatible-mode/v1",
)

# Define the list of available tools
tools = [
    # Tool 1: get the current time
    {
        "type": "function",
        "function": {
            "name": "get_current_time",
            "description": "Useful when you want to know what time it is.",
            "parameters": {}  # No parameters required
        }
    },
    # Tool 2: get the weather for a specified city
    {
        "type": "function",
        "function": {
            "name": "get_current_weather",
            "description": "Useful when you want to check the weather in a specific city.",
            "parameters": {
                "type": "object",
                "properties": {
                    "location": {
                        "type": "string",
                        "description": "A city or district, such as Beijing, Hangzhou, or Yuhang District."
                    }
                },
                "required": ["location"]  # Required parameter
            }
        }
    }
]

# To save space, the subsequent code is omitted here

For example, if the user asks:

"What time is it now?" → the model may call get_current_time

"What is the weather like in Beijing today?" → the model may call get_current_weather and automatically fill in location="Beijing"

(A sketch of the omitted request-and-dispatch step follows below.)
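Since the official continuation is omitted above, here is a hedged sketch of what it roughly looks like, reusing the client and tools objects from the snippet: the model name qwen-plus and the two local helper implementations are assumptions for illustration.

import json
from datetime import datetime

# Local implementations backing the two declared tools (stand-ins)
def get_current_time():
    return datetime.now().strftime("%Y-%m-%d %H:%M:%S")

def get_current_weather(location):
    return f"{location}: sunny, 22°C"  # a real call would hit a weather API

completion = client.chat.completions.create(
    model="qwen-plus",  # assumption: any Function Calling-capable Qwen model
    messages=[{"role": "user", "content": "What is the weather like in Beijing today?"}],
    tools=tools,
)

message = completion.choices[0].message
if message.tool_calls:
    call = message.tool_calls[0]
    args = json.loads(call.function.arguments or "{}")
    if call.function.name == "get_current_weather":
        print(get_current_weather(**args))  # the model fills in location="Beijing"
    elif call.function.name == "get_current_time":
        print(get_current_time())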

In summary, the key implementations include:

  • Tool description provides semantic clues

  • Parameter definition (parameters) guides parameter filling

  • Model’s contextual understanding and reasoning capabilities

Powerful as it looks, Function Calling relies heavily on the model's comprehension, and the quality of the tool descriptions directly affects call accuracy, so misunderstandings and incorrect calls remain possible.

3.3 Function Calling vs. traditional hard coding

Seeing this, some readers may wonder how Function Calling differs from the traditional, manually coded logic for calling external tools. Using the two tools above, here is an example of the traditional hard-coded approach:

import datetime
import requests

def get_current_time():
    """Manually retrieve the current time"""
    return datetime.datetime.now().strftime("%Y-%m-%d %H:%M:%S")

def get_current_weather(location):
    """Manually fetch the weather for a specific location"""
    # Simulate a weather API call
    try:
        # Assume this is a real weather API call; in practice, replace it
        # with a real endpoint and response handling
        response = requests.get(f"https://weather-api.example.com/current?city={location}")
        weather_data = response.json()
        return f"Current weather at {location}: {weather_data['temperature']}°C, {weather_data['condition']}"
    except Exception as e:
        return f"Failed to get {location} weather: {str(e)}"

def process_user_query(query):
    """Manually route the query with hard-coded rules"""
    if "time" in query:
        return get_current_time()
    elif "weather" in query:
        # Additional parameter-extraction logic is required
        import re
        # The original pattern matches a Chinese city name followed by 天气 ("weather")
        location_match = re.search(r'([\u4e00-\u9fff]+)天气', query)
        if location_match:
            location = location_match.group(1)
            return get_current_weather(location)
        else:
            return "Please provide a specific city name"
    else:
        return "Unable to understand your request"

# Simulate user interaction
def main():
    while True:
        user_input = input("Please enter your question (enter 'exit' to end): ")
        if user_input == 'exit':
            break
        response = process_user_query(user_input)
        print("System reply:", response)

if __name__ == "__main__":
    main()

Example running results:

# A possible interaction
Please enter your question (enter 'exit' to end): What time is it
System reply: 2024-03-18 15:30:45
Please enter your question (enter 'exit' to end): How is the weather in Beijing?
System reply: Current weather in Beijing: 15°C, sunny
Please enter your question (enter 'exit' to end): exit

This example clearly demonstrates the limitations of the traditional manual triggering method: complex code logic, poor scalability, and limited ability to understand user input. In contrast, Function Calling provides a smarter and more flexible tool calling paradigm.

4. GPTs attempt to unify

Before we talk about MCP, we need the story of GPTs. Function Calling certainly has many advantages, but hand-writing everything felt cumbersome, so OpenAI introduced the concept of GPTs at its developer conference in early November 2023. Users need no programming knowledge at all: they can customize their own personalized GPTs through natural-language instructions and uploaded knowledge bases. (P.S. The move also had a practical side effect: it got developers worldwide actively building Function Calling tools for ChatGPT.)

Setting aside the boardroom drama that followed, the GPT Store officially opened to the public in January 2024. The data I found suggests that, as of January 2024, users had reportedly created more than 3 million GPTs.

I remember releasing my first GPTs on November 11, 2023, and seeing navigation sites that already listed more than 4,000 GPTs. I was impressed that so much new content appeared within a week of launch, but I never expected the explosion in popularity that followed.

The following shows an example of getting the current time in GPTs and explains its working mechanism.

# 1. manifest.json
{
    "schema_version": "v1",
    "name_for_human": "Time Assistant",
    "name_for_model": "time_tool_gpt",
    "description_for_human": "A GPT that can retrieve current time information across different time zones.",
    "description_for_model": "You are a helpful time assistant capable of retrieving current time information precisely and efficiently.",
    "auth": {
        "type": "none"
    },
    "api": {
        "type": "openapi",
        "url": "https://your-domain.com/openapi.yaml"
    },
    "logo_url": "https://example.com/time-assistant-logo.png",
    "contact_email": "support@example.com",
    "privacy_policy_url": "https://example.com/privacy"
}

# 2. openapi.yaml
openapi: 3.0.0
info:
  title: Time Assistant API
  version: 1.0.0
  description: API for retrieving current time information
servers:
  - url: https://your-domain.com/api/v1

paths:
  /current-time:
    get:
      summary: Get current time
      operationId: getCurrentTime
      parameters:
        - name: timezone
          in: query
          required: false
          schema:
            type: string
            default: "UTC"
          description: Timezone to retrieve current time (e.g., 'UTC', 'America/New_York', 'Asia/Shanghai')
      responses:
        '200':
          description: Successful time retrieval
          content:
            application/json:
              schema:
                type: object
                properties:
                  current_time:
                    type: string
                    example: "2024-03-18T15:30:45Z"
                  timezone:
                    type: string
                    example: "UTC"
                  timestamp:
                    type: integer
                    example: 1710770445

4.1 Core file analysis

1. manifest.json

Defines the GPT's basic metadata

Specifies how the API is invoked

Provides human- and model-readable descriptions

Configures authentication and API access points

2. openapi.yaml

Defines the concrete API interface specification

Describes available endpoints, parameters, and response structures

Provides the tool-call interface conventions (a minimal backend sketch follows below)
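As one way to satisfy that contract, here is a minimal backend sketch for the /current-time endpoint declared in openapi.yaml. FastAPI is my assumption; the article does not prescribe a server framework.

from datetime import datetime
from zoneinfo import ZoneInfo

from fastapi import FastAPI, HTTPException

app = FastAPI(title="Time Assistant API", version="1.0.0")

@app.get("/current-time")
def get_current_time(timezone: str = "UTC"):
    # Mirror the response schema declared in openapi.yaml
    try:
        now = datetime.now(ZoneInfo(timezone))
    except Exception:
        raise HTTPException(status_code=400, detail=f"Unknown timezone: {timezone}")
    return {
        "current_time": now.isoformat(),
        "timezone": timezone,
        "timestamp": int(now.timestamp()),
    }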

4.2 Example usage scenarios

"What time is it now?", "What time is it in Beijing now?", "Tell me the current time in New York"

Based on these questions, GPT will:

Identify the time-query intent

Select the /current-time endpoint

Pass in the corresponding timezone parameter

4.3 Limitations

GPTs are bound entirely to the OpenAI ecosystem: incompatible with other large-model platforms (Anthropic, Google, etc.) and impossible to port across platforms. Functionally, the tool-calling method is rigidly standardized, lacking a flexible parameter-passing mechanism, complex state management, and any way to orchestrate complex workflows. And in terms of hosting, GPTs depend completely on OpenAI's cloud platform: no private deployment, and no deep customization or secondary development.

In summary, GPTs is essentially a closed and restricted ecosystem, suitable for quickly building simple interactive tools, but its limitations are very obvious for scenarios that require high customization and flexibility. For enterprises and developers who pursue technological autonomy and control, GPTs is not a good choice.

5. The accidental yet inevitable launch of MCP

The Model Context Protocol (MCP) was released by Anthropic in November 2024. In short, two points stand out: first, it is a protocol meant to be compatible with all AI applications; second, it adds resources and prompts on top of Function Calling-style tools.

5.1 Tool call example

Let's first look at an MCP tool-call example, taking the get_current_time tool:

User question: What time is it now?

MCP execution process:

1. Request phase:

{ "messages": [ {"role": "user", "content": "What time is it now?"} ], "context": { "tools": [ { "type": "function", "function": { "name": "get_current_time", "description": "Useful when you want to know the current time.", "parameters": {} } }, {...} // Another tool definition ] }}

2. Model response stage

{ "response": { "model": "DeepSeek-R1:32b", "content": [ { "type": "tool_use", "id": "call_1", "tool_use": { "name": "get_current_time", "parameters": {} } } ] }}

3. Tool execution phase: The system executes the get_current_time function and returns the result

{ "messages": [ {"role": "user", "content": "What time is it now?"}, { "role": "assistant", "content": [ { "type": "tool_use", "id": "call_1", "tool_use": { "name": "get_current_time", "parameters": {} } } ] }, { "role": "tool", "tool_use_id": "call_1", "content": {"time": "2025-03-18T14:30:00Z"} } ]}

4. Final Response

{ "response": { "model": "DeepSeek-R1:32b", "content": [ { "type": "text", "text": "The current time is 22:30 Beijing time on March 18, 2025 (14:30 UTC)." } ] }}

5.2 Execution steps breakdown

User query analysis:

The model analyzes the user's problem and determines which tool needs to be called

For tools that require parameters, extract the necessary parameters (such as city name)

Tool call request generation:

The model generates standardized tool call requests, including tool names and parameters

Each tool call has a unique ID for easy tracking

Tool execution:

The system receives a tool call request

Execute the corresponding function and get the result

Return the results in a standard format, associated with the corresponding tool call ID

Results integration and response generation:

The model receives the tool-call results

The results are placed into context

A natural-language response is generated for the user (a condensed sketch of the whole loop follows below)
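To tie the four phases together, here is a condensed sketch using the OpenAI-compatible tool-calling API as a stand-in for the host-side loop (the JSON above is schematic, and real MCP traffic is framed differently); the endpoint, the model name qwen-plus, and the key placeholder are assumptions.

from datetime import datetime
from openai import OpenAI

client = OpenAI(api_key="<API Key>",
                base_url="https://dashscope.aliyuncs.com/compatible-mode/v1")

def get_current_time():
    return datetime.now().strftime("%Y-%m-%d %H:%M:%S")

tools = [{
    "type": "function",
    "function": {"name": "get_current_time",
                 "description": "Useful when you want to know the current time.",
                 "parameters": {}},
}]

# Phase 1: user query analysis happens inside the model
messages = [{"role": "user", "content": "What time is it now?"}]

# Phase 2: the model responds with a tool-call request carrying a unique ID
reply = client.chat.completions.create(
    model="qwen-plus", messages=messages, tools=tools
).choices[0].message

if reply.tool_calls:
    messages.append(reply)
    for call in reply.tool_calls:
        # Phase 3: execute the tool and return the result under the call's ID
        messages.append({"role": "tool",
                         "tool_call_id": call.id,
                         "content": get_current_time()})
    # Phase 4: the model integrates the result into a natural-language answer
    final = client.chat.completions.create(model="qwen-plus",
                                           messages=messages, tools=tools)
    print(final.choices[0].message.content)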

5.3 MCP resource access control

MCP's resource access control lets you define precisely how a large model may safely access internal enterprise resources. Take a machining company deploying a RAG system that reads a MySQL database as a demonstration:

{ "messages": [ {"role": "user", "content": "Query the company's round steel hardness standard"} ], "context": { "resources": [ { "type": "database", "id": "materials_db", "config": { "engine": "mysql", "connection": { "host": "192.168.1.100", "database": "materials_specs", "schema": "standards" }, "permissions": ["read"], "tables": ["material_standards", "quality_tests"], "access_level": "restricted" } } ], "tools": [ { "type": "function", "function": { "name": "query_database", "description": "Query the company's round steel hardness standard", "parameters": { "type": "object", "properties": { "resource_id": {"type": "string"}, "query_type": {"type": "string", "enum": ["standard", "test", "material"]}, "material_category": {"type": "string"}, "standard_type": {"type": "string"} }, "required": ["resource_id", "query_type"] } } } ] }}

Execution process

Permission verification layer: the system first verifies the current user's access rights to the materials database

Resource boundary definition: MCP restricts access to the specific tables (material_standards and quality_tests)

Query restriction: only read operations are allowed; no writes or modifications

Data filtering: the resource can be configured to return only non-confidential standard data

A minimal code sketch of enforcing such a gate follows.
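This sketch assumes hypothetical names (ResourceConfig, check_access); MCP itself only defines the declaration, so the enforcement shown here is illustrative.

from dataclasses import dataclass, field

@dataclass
class ResourceConfig:
    resource_id: str
    permissions: list = field(default_factory=lambda: ["read"])
    tables: list = field(default_factory=list)

def check_access(config, operation, table):
    # Query restriction: only operations the resource explicitly allows
    if operation not in config.permissions:
        raise PermissionError(f"'{operation}' not permitted on {config.resource_id}")
    # Resource boundary: only whitelisted tables are reachable
    if table not in config.tables:
        raise PermissionError(f"table '{table}' is outside the resource boundary")

materials_db = ResourceConfig(
    resource_id="materials_db",
    permissions=["read"],
    tables=["material_standards", "quality_tests"],
)

check_access(materials_db, "read", "material_standards")    # passes
# check_access(materials_db, "write", "material_standards") # raises PermissionError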

5.4 Practical application advantages

Data isolation: ensures the LLM can only access authorized tables and never sees sensitive data such as customer records or financial information

Audit trail: logs all database queries to facilitate compliance audits

Fine-grained control: different data-access scopes can be set for users in different departments

SQL injection prevention: MCP can validate and sanitize query requests to enhance security

5.5 Prompt enhancements

MCP's prompt enhancement capability gives the LLM additional context and professional guidance, so it handles domain-specific questions better. Here, again, is an example from the machining company:

{ "messages": [ {"role": "user", "content": "How to troubleshoot the S-02 error code in the CNC machining center?"} ], "context": { "prompts": [ { "id": "mechanical_expertise", "content": "You are a professional mechanical processing technical consultant who is familiar with CNC equipment operation and troubleshooting. Always prioritize safety factors and follow ISO 9001 quality standard processes. Before answering, confirm the severity of the error, and then organize your answer in the order of 'preliminary inspection → possible causes → solution steps → preventive measures'." }, { "id": "company_terminology", "content": "Use company standard terminology: 'machining center' refers to HAAS VF-2SS equipment, 'S series error' refers to spindle system failure, and 'technical processing form' refers to the fault report form in the company's internal ERP system. Use the 'HC-' prefix when referring to the equipment code." }, { "id": "safety_guidelines", "content": "Any equipment maintenance proposal must include the following safety reminder: 'Before maintenance, ensure that the equipment is completely de-energized and locked out, wear appropriate PPE equipment, and follow the maintenance procedures specified in the company's MP-S-001 safety manual.'" } ], "resources": [ { "type": "vector_database", "id": "equipment_manual_db", "config": { "collection": "equipment_manuals", "filters": {"equipment_type": "CNC"} } } ] }}

5.6 Application scenarios and advantages

Unification of professional terms

The system understands and uses company-specific terminology (e.g. "HC-1500" refers to a specific model of machining center)

Ensure answers comply with corporate standards and naming conventions

Process Standardization

Force the model to answer questions according to the company's process standards

For example: CNC fault diagnosis must first check the safety status, and then conduct electrical and mechanical system troubleshooting

Knowledge Fusion

Combining industry-general knowledge with company-specific knowledge

For example: combining ISO standards with internal process specifications

Security and Compliance

Automatically add safety warnings for high-risk operations

For example: When processing special materials, special protective measures and environmental protection requirements are required

5.7 Specific application examples

Mechanical equipment fault diagnosis

Prompt enhancement: equipment-failure answers must include a fault-code explanation, possible causes (sorted by probability), diagnostic steps (from simple to complex), the tools required for repair, and an estimated repair time

Process parameter recommendations

Prompt enhancement: answers about material-processing parameters must take the company's existing equipment capabilities into account and reference historical success cases in the company's process database

Quality control assistance

Prompt enhancement: when analyzing quality issues, automatically associate the enterprise's SPC data, quote the relevant ISO standard clauses, and recommend applicable enterprise-standard test methods

6. RAGFlow and MCP integration example

To give the enterprise access to multi-source data, here is a demonstration of early practice from a recent project: integrating MySQL, EMS, and other system data into the RAGFlow framework through MCP, for reference.

6.1 Architecture design

┌────────────────────────────────────────────────────┐
│        RAGFlow + MCP integration architecture      │
├──────────────────┬─────────────────────────────────┤
│                  │ LLM call layer                  │
│                  ├─────────────────────────────────┤
│ Original RAGFlow │ Vector retrieval layer          │
│                  ├─────────────────────────────────┤
│                  │ Document processing layer       │
├──────────────────┼─────────────────────────────────┤
│                  │ MCP adapter layer ◄─────────────┼──► MCP protocol definition
│                  ├─────────────────────────────────┤
│ MCP extension    │ Resource access control layer   │
│                  ├─────────────────────────────────┤
│                  │ Multi-source data connection    │
└──────────────────┴─────────────────────────────────┘
         ▲                  ▲                ▲
         │                  │                │
    ┌─────────┐        ┌─────────┐    ┌───────────────┐
    │  MySQL  │        │   EMS   │    │ Other systems │
    └─────────┘        └─────────┘    └───────────────┘
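A skeletal sketch of the "MCP adapter layer" box above: one interface in front of heterogeneous enterprise sources, so the RAGFlow side never talks to MySQL or the EMS directly. All class names here are illustrative, not RAGFlow APIs.

from abc import ABC, abstractmethod

class DataSourceAdapter(ABC):
    """Uniform interface the MCP layer exposes upward to RAGFlow."""
    @abstractmethod
    def query(self, request: dict) -> dict: ...

class MySQLAdapter(DataSourceAdapter):
    def query(self, request):
        # Real code would run a parameterized SQL query over a pooled connection
        return {"source": "mysql", "rows": []}

class EMSAdapter(DataSourceAdapter):
    def query(self, request):
        # Real code would call the EMS system's API
        return {"source": "ems", "records": []}

class MCPAdapterLayer:
    def __init__(self):
        self.sources = {"production_db": MySQLAdapter(), "ems": EMSAdapter()}

    def handle(self, resource_id, request):
        # The resource access control layer would vet the request before dispatch
        return self.sources[resource_id].query(request)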

6.2 Reference case

Scenario: Engineers query equipment failures and related maintenance history

(Advertisement: for the relevant sample source code, please visit my Knowledge Planet.)

1. User query

"The CNC lathe HC-2000 has been reporting E-35 errors frequently recently. Are there any related maintenance records?"

2. System processing flow

MCP request generation


{  "response": {    "model": "DeepSeek-R1:32b",    "content": [      {        "type": "tool_use",        "id": "call_1",        "tool_use": {          "name": "query_equipment_status",          "parameters": {            "resource_id": "production_db",            "equipment_id": "HC-2000",            "query_type": "maintenance_history"          }        }      }    ]  }}

Database query execution: the system queries the HC-2000 maintenance records from the MySQL database. Tool execution result:

{ "data": [ { "date": "2025-02-15", "issue": "E-35 error: spindle encoder signal abnormal", "solution": "Replace the spindle encoder and adjust the encoder gap" }, { "date": "2025-01-20", "issue": "E-35 error: occurs intermittently", "solution": "Clean the cable connector and re-secure the connector" } ]}

Vector search supplement: The system also searches the vector database for device manuals and technical documents related to "E-35 error"

Final response generation: The system integrates database query results and vector search results to generate a complete answer

User gets answer

According to the maintenance records, the HC-2000 CNC lathe has had two E-35 errors (abnormal spindle encoder signal):

1. February 15, 2025: replaced the spindle encoder and adjusted the encoder gap
2. January 20, 2025: cleaned the cable connector and re-secured it

According to the equipment manual, the E-35 error is usually caused by the following:
- A damaged or poorly contacting encoder cable (about 70% of cases)
- A faulty encoder itself (about 20% of cases)
- A spindle drive circuit problem (about 10% of cases)

The following steps are recommended:
1. Check whether the encoder cable is worn or loose
2. Measure the resistance between the cable connector and the encoder interface (should be < 0.2 Ω)
3. Check the encoder signal quality with an oscilloscope

According to the company's SOP-M1001 standard, such issues require filling out the "Equipment Abnormality Tracking Form", and the maintenance department must evaluate whether the preventive-maintenance plan needs adjustment.

7. Integration technology points

Keep the design modular

Add the MCP adapter as a standalone module without disrupting existing RAGFlow core functionality

Use the adapter pattern to connect RAGFlow and enterprise data sources

Unified resource interface

Create a unified resource access interface for all data sources (vector library, MySQL, EMS, etc.)

Ensure consistent access control

Query routing mechanism

Develop intelligent query routing that automatically decides between vector retrieval and structured data queries

Support a hybrid query mode that merges results from multiple data sources (a minimal routing sketch follows below)
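This sketch assumes an OpenAI-compatible Function Calling model; run_sql and run_vector_search are hypothetical stand-ins for the MySQL connector and the RAGFlow vector retrieval layer.

import json
from openai import OpenAI

def run_sql(args):
    return f"[SQL result for {args}]"        # stand-in for the MySQL connector

def run_vector_search(args):
    return f"[vector hits for {args}]"       # stand-in for RAGFlow retrieval

def route_query(client, query, tools):
    """Let a tool-calling model decide which data source(s) to hit."""
    message = client.chat.completions.create(
        model="qwen-plus",  # assumption: any tool-calling model
        messages=[{"role": "user", "content": query}],
        tools=tools,        # declares the query_database and vector_search schemas
    ).choices[0].message
    if not message.tool_calls:               # no retrieval needed
        return [message.content]
    results = []
    for call in message.tool_calls:          # hybrid mode: several sources per query
        args = json.loads(call.function.arguments or "{}")
        if call.function.name == "query_database":
            results.append(run_sql(args))
        else:
            results.append(run_vector_search(args))
    return results                           # merged downstream into one answer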

Caching and performance optimization

Cache frequently queried database results

Use batch processing to reduce database connection overhead

Deployment considerations

Ensure that the database connector is correctly configured in the enterprise intranet environment

Dealing with connectivity issues in isolated network environments


8. Some thoughts on the evolution of LLM tool-calling technology

The technical evolution from complex prompts → Function Calling → GPTs plugins → the MCP protocol closely mirrors several classic standardization stories, especially those of Web technology, Internet protocols, and APIs. Using the standardization of Web APIs as an analogy:

Web API evolution                                        | Evolution of AI tool calling
---------------------------------------------------------|-------------------------------------------------------------
Early days: ad-hoc RPC (Remote Procedure Call) code      | Early days: tool calls via prompt engineering
Development stage: frameworks such as SOAP and XML-RPC   | Development stage: Function Calling's structured JSON calls
Maturity: REST APIs become mainstream                    | Maturity: platform-specific implementations such as GPTs
Unification: new-generation standards such as GraphQL    | Unification: MCP as a unified protocol

Observing the development of LLM tool calling alongside these standards, we can summarize the general pattern of technical standardization:

Exploration period:

Each party solves the problem in its own way (prompt engineering)

Initial standardization:

Function Calling

Platform-dominant period:

Major platforms launch their own solutions (GPTs plugins)

Open Standard Period:

Form a unified open standard (MCP protocol)

From historical experience, the emergence of unified standards usually brings about explosive growth in the technology field, such as the Internet boom after the unification of Web standards. The emergence of the MCP protocol is likely to mark the entry of the LLM tool ecosystem into a new stage of rapid development and widespread application.

Looking ahead from 2025, more best practices in vertical scenarios are worth anticipating.