A Deep Understanding of the MCP Principle in Large Model Applications

An in-depth analysis of the role MCP plays in large model applications, and of how MCP achieves engineering decoupling and improves development efficiency.
Core content:
1. The definition of MCP and how it differs from Tool Calling
2. The interaction mechanism and data format between the MCP Client and the MCP Server
3. The engineering advantages and practical value of MCP in large model applications
1. What is MCP?
Recently there has been a great deal of discussion about MCP (Model Context Protocol), and many people are watching it closely. This is mainly because large model application technology is iterating very quickly at this stage: many companies and practitioners worry that missing a key technology upgrade will leave their applications behind, so they chase every hot topic. From a technology research and development perspective, however, MCP is essentially an iterative upgrade of Tool Calling. The most intuitive difference is where the code lives. With Tool Calling, the code that calls external services is written on the large model application side; with MCP, that code is written on the MCP Server side, which can run as an independent process providing services. The large model application then interacts with the MCP Server through an MCP Client to call external services. Simply put, the old Tool Calling is split into two parts: an MCP Server and an MCP Client. So how do the MCP Client and MCP Server interact after the split? That is where the MCP protocol comes in. The most important thing MCP does is standardize the data format exchanged between the MCP Client and the MCP Server, so that the large model application side hosting the MCP Client no longer needs to adapt its interface format for every tool it wants to call. Of course, this works because the MCP Server carries the heavy load: the task of adapting to external services now falls on it.
What should the transmission format between the MCP Client and the MCP Server look like so that their interaction is universal and needs no per-tool adaptation, i.e. only the MCP Server adapts to external interfaces while the MCP Client never changes its code? The interaction between them adopts the JSON-RPC 2.0 transmission format. The data format defines the method name to be invoked, the parameters to pass, and their descriptions; once the MCP Client sends this information to the MCP Server, the server has everything it needs to call the corresponding service. Therefore, if a large model application using an MCP Client wants new service functionality, all of the iteration work happens on the MCP Server side, and the MCP Client side does not need to change at all. This achieves excellent decoupling in engineering: the large model application side can focus on business code, while the MCP Server side focuses on tool services. This is the biggest engineering difference between MCP and Tool Calling, as shown in the following figure:
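To make this format concrete, here is a representative JSON-RPC 2.0 request for invoking a tool through MCP's tools/call method, as defined in the MCP specification (the tool name and argument are borrowed from the example we will build later in this article):

```json
{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "tools/call",
  "params": {
    "name": "getWeather",
    "arguments": { "cityName": "Shenzhen" }
  }
}
```

Because every tool invocation travels in this same envelope, the MCP Client stays generic: only the name and arguments values change from tool to tool.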
From the above, we can see that what the MCP protocol mainly brings us is engineering decoupling, much like the evolution in Web engineering from JSP to separated front ends and back ends. That decoupling genuinely accelerated the evolution of the front-end and back-end ecosystems, so the emergence of MCP is of course a good thing. However, in terms of what a large model application can actually do, anything achievable through MCP can also be achieved with Tool Calling; MCP simply makes functionality easier to access through existing MCP Servers. You can therefore ignore the exaggerated descriptions in many articles and videos: claims that MCP itself improves the experience level of large model applications are not credible.
Of course, MCP brings more than engineering progress. Its emergence also means that some SaaS providers can finally enjoy the dividends of large model technology landing in production; there is a dawn of profitability in this field. As I wrote in the 2024 article "Deconstruction and Reorganization of Existing Services by Large Model Applications", the entrance of user traffic is very likely to shift from individual apps to large model RAG applications, and the rise of RAG will make fine-grained RESTful API services increasingly important. MCP packages this kind of RESTful API service into an out-of-the-box toolkit, making the monetization channels for such APIs smoother. That is why, soon after MCP appeared, applications such as Amap and Alipay quickly launched their own exclusive MCP services; after all, merchants are the fastest to sniff out potential business opportunities.
2. Building MCP Services
An MCP Server can be built with the help of the Spring AI framework, which makes constructing an MCP service much easier and more convenient. The resulting service can be used directly with MCP clients such as Cherry Studio, or with Spring AI's own MCP Client. Spring AI released its official 1.0.0 version recently, and the framework is quite pleasant to develop with. First, Maven needs to bring in the necessary Spring AI packages:
```xml
<!-- Introduce the 1.0.0 release of the Spring AI framework -->
<dependencyManagement>
    <dependencies>
        <dependency>
            <groupId>org.springframework.ai</groupId>
            <artifactId>spring-ai-bom</artifactId>
            <version>1.0.0</version>
            <type>pom</type>
            <scope>import</scope>
        </dependency>
    </dependencies>
</dependencyManagement>

<!-- This starter enables an MCP Server based on an SSE connection -->
<dependencies>
    <dependency>
        <groupId>org.springframework.ai</groupId>
        <artifactId>spring-ai-starter-mcp-server-webmvc</artifactId>
    </dependency>
</dependencies>
```
After importing these packages, you can create MCP tool functions on top of the framework. Below are two example tool methods: one returns the local weather for a given city name, and the other looks up an employee's position by name.
```java
import org.springframework.ai.tool.annotation.Tool;
import org.springframework.ai.tool.annotation.ToolParam;
import org.springframework.stereotype.Service;

@Service
public class ToolService {

    @Tool(description = "Get the weather conditions according to the city name")
    public String getWeather(@ToolParam(required = true, description = "City Name") String cityName) {
        System.out.println("Get the weather conditions according to the city name: " + cityName);
        if (cityName.equals("Zhaoqing")) {
            return "Sunny";
        } else if (cityName.equals("Shenzhen")) {
            return "Raining";
        }
        return "Don't know";
    }

    @Tool(description = "Get the position information according to the employee name")
    public String getPosition(@ToolParam(required = true, description = "Employee Name") String name) {
        System.out.println("Get the position information according to the employee name: " + name);
        if (name.equalsIgnoreCase("Arain")) {
            return "Data Mining and Algorithm Engineer";
        } else if (name.equalsIgnoreCase("Judy")) {
            return "Product Manager";
        }
        return "Don't know";
    }
}
```
Notice that each MCP method in this class is annotated with @Tool, which lets the Spring AI framework recognize the methods we wrote and perform the conversion processing. The description field of @Tool describes the purpose of the method; these descriptions are received by the MCP Client and sent to the large model by the application, so that the model can decide, based on them, which method should be used to satisfy the user's request. The @ToolParam annotation describes each method parameter and its meaning, so the model can better determine which argument values to supply. After writing this service class, you only need to initialize its tools and expose a ToolCallbackProvider:
```java
import org.springframework.ai.tool.ToolCallbackProvider;
import org.springframework.ai.tool.method.MethodToolCallbackProvider;
import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;
import org.springframework.context.annotation.Bean;

@SpringBootApplication
public class McpServerApplication {

    public static void main(String[] args) {
        SpringApplication.run(McpServerApplication.class, args);
    }

    // Register the @Tool methods of ToolService so the MCP server can expose them
    @Bean
    public ToolCallbackProvider initTools(ToolService toolService) {
        return MethodToolCallbackProvider.builder().toolObjects(toolService).build();
    }
}
```
At this point, the MCP service is complete; the whole thing is just a Spring Boot application. So how do you use these two tools? For the most intuitive experience, you can use the Cherry Studio client: configure an MCP server in Cherry Studio, select SSE as the type, and fill in the URL of the Spring Boot application you just started (the default path is /sse). Once configured, the two tools you just wrote will appear in the "Tools" column.
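How does the client know which tools exist? When an MCP Client connects, it fetches the tool metadata through the protocol's tools/list method. Per the MCP specification, the server's reply for the service class above would look roughly like this (a sketch abridged to the first tool; the exact JSON Schema is generated by the framework from the @Tool and @ToolParam annotations):

```json
{
  "jsonrpc": "2.0",
  "id": 1,
  "result": {
    "tools": [
      {
        "name": "getWeather",
        "description": "Get the weather conditions according to the city name",
        "inputSchema": {
          "type": "object",
          "properties": {
            "cityName": { "type": "string", "description": "City Name" }
          },
          "required": ["cityName"]
        }
      }
    ]
  }
}
```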
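If you prefer code to a GUI client, the same discovery and invocation can also be done programmatically. Below is a rough sketch based on the MCP Java SDK (io.modelcontextprotocol.sdk:mcp) that Spring AI's client support builds on; the exact class and builder names vary across SDK versions, and the localhost URL and port are assumptions:

```java
import io.modelcontextprotocol.client.McpClient;
import io.modelcontextprotocol.client.McpSyncClient;
import io.modelcontextprotocol.client.transport.HttpClientSseClientTransport;
import io.modelcontextprotocol.spec.McpSchema;

import java.util.Map;

public class McpClientDemo {

    public static void main(String[] args) {
        // SSE transport pointing at the Spring Boot app started above
        // (http://localhost:8080 is an assumption; adjust to your port)
        var transport = HttpClientSseClientTransport.builder("http://localhost:8080").build();

        McpSyncClient client = McpClient.sync(transport).build();
        client.initialize();

        // Discover the tools the server exposes (the tools/list call shown above)
        client.listTools().tools()
              .forEach(t -> System.out.println(t.name() + " : " + t.description()));

        // Invoke a tool directly (a tools/call request under the hood)
        McpSchema.CallToolResult result = client.callTool(
                new McpSchema.CallToolRequest("getWeather", Map.of("cityName", "Shenzhen")));
        System.out.println(result.content());

        client.close();
    }
}
```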
3. MCP dialogue process
After configuring the MCP server, you can use it in a conversation. During a conversation, the large model decides, based on the user's question, whether tools need to be called and with which request parameters. When the large model application receives this tool-invocation information from the model, it sends the corresponding call requests to the MCP Server through the MCP Client. Once the application obtains the content returned by the tools, it carries that content in another request to the large model, so the model can give the final answer. Therefore, a large model application with MCP support requests the model service twice: the first time to have the model decide whether tools are needed and with what parameters, and the second time to have the model produce the final answer based on the tool results. For example, in the screenshot below I asked: "Please answer my question about the weather in Shenzhen and Zhaoqing, and the occupations of Arain and Judy." The Cherry Studio question-and-answer page shows that it executed 4 tool calls, and before and after those 4 tool calls there were in fact two large model calls.
To get a deeper sense of what the large model application does through MCP, here are the curl parameters of the first model call. The most important part is the tools field, which describes in detail the calling conventions of the tools provided by the MCP Server; this is what lets the model understand the parameters and produce a correct tool invocation:
```bash
curl --location 'https://dashscope.aliyuncs.com/compatible-mode/v1/chat/completions' \
--header 'Authorization: xxxxxxxxxxxxxxx' \
--header 'Content-Type: application/json' \
--data '{
    "model": "qwen-plus",
    "messages": [
        {
            "role": "user",
            "content": "Please answer me the weather information of Zhaoqing and Shenzhen, as well as the job information of Arain and Judy."
        }
    ],
    "tools": [
        {
            "type": "function",
            "function": {
                "name": "getWeather",
                "description": "Get the weather conditions according to the city name",
                "parameters": {
                    "type": "object",
                    "properties": {
                        "cityName": { "type": "string", "description": "City Name" }
                    },
                    "required": ["cityName"]
                }
            }
        },
        {
            "type": "function",
            "function": {
                "name": "getPosition",
                "description": "Get the position information according to the employee name",
                "parameters": {
                    "type": "object",
                    "properties": {
                        "name": { "type": "string", "description": "Employee Name" }
                    },
                    "required": ["name"]
                }
            }
        }
    ],
    "tool_choice": "auto",
    "parallel_tool_calls": true
}'
```
After the large model receives this request, it returns the content below. Notice that the response is not text content like an ordinary answer, but tool-calling parameters, and that the model's finish_reason is tool_calls. This is an output format that many large models had already adopted for Tool Calling; it did not appear only after the emergence of MCP.
{ "choices": [ { "message": { "content": "", "role": "assistant", "tool_calls": [ { "index": 0, "id": "call_xxxxxxxxxxxxxxxx", "type": "function", "function": { "name": "getWeather", "arguments": "{\"cityName\": \"Zhaoqing\"}" } }, { "index": 1, "id": "call_xxxxxxxxxxxxxxxx", "type": "function", "function": { "name": "getWeather", "arguments": "{\"cityName\": \"Shenzhen\"}" } }, { "index": 2, "id": "call_xxxxxxxxxxxxxxxxxx", "type": "function", "function": { "name": "getPosition", "arguments": "{\"name\": \"arain\"}" } }, { "index": 3, "id": "call_xxxxxxxxxxxxxxxx", "type": "function", "function": { "name": "getPosition", "arguments": "{\"name\": \"judy\"}" } } ] }, "finish_reason": "tool_calls", "index": 0, "logprobs": null } ], "object": "chat.completion", "usage": { "prompt_tokens": 288, "completion_tokens": 70, "total_tokens": 358, "prompt_tokens_details": { "cached_tokens": 0 } }, "created": 1748011412, "system_fingerprint": null, "model": "qwen-plus", "id": "xxxxx-xxxxxx-xxxx-xxxx-xxxx-xxxxxxx"}
Then, based on the model's output, the large model application converts each requested call into the JSON-RPC 2.0 transmission format and sends it to the MCP Server through the MCP Client. The MCP Server actually executes the corresponding tool method and returns the result to the MCP Client.
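For instance, the first getWeather invocation reaches the MCP Server as a tools/call request like the one sketched in section 1, and per the MCP specification the server sends the tool's output back in a JSON-RPC result of roughly this shape:

```json
{
  "jsonrpc": "2.0",
  "id": 2,
  "result": {
    "content": [
      { "type": "text", "text": "Sunny" }
    ],
    "isError": false
  }
}
```

After collecting all four tool results, the large model application requests the model a second time. The curl parameters are shown below; note that each tool's return value is attached to messages under the tool role: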
```bash
curl --location 'https://dashscope.aliyuncs.com/compatible-mode/v1/chat/completions' \
--header 'Authorization: xxxxxxxxxxxxxxx' \
--header 'Content-Type: application/json' \
--data '{
    "model": "qwen-plus",
    "messages": [
        {
            "role": "user",
            "content": "Please answer me the weather information of Zhaoqing and Shenzhen, as well as the job information of Arain and Judy."
        },
        {
            "role": "assistant",
            "tool_calls": [
                { "id": "call_xxxxxxxxxxxxxxx1", "type": "function", "function": { "name": "getWeather", "arguments": "{\"cityName\": \"Zhaoqing\"}" } },
                { "id": "call_xxxxxxxxxxxxxxx2", "type": "function", "function": { "name": "getWeather", "arguments": "{\"cityName\": \"Shenzhen\"}" } },
                { "id": "call_xxxxxxxxxxxxxxx3", "type": "function", "function": { "name": "getPosition", "arguments": "{\"name\": \"arain\"}" } },
                { "id": "call_xxxxxxxxxxxxxxx4", "type": "function", "function": { "name": "getPosition", "arguments": "{\"name\": \"judy\"}" } }
            ]
        },
        { "role": "tool", "tool_call_id": "call_xxxxxxxxxxxxxxx1", "content": "Sunny" },
        { "role": "tool", "tool_call_id": "call_xxxxxxxxxxxxxxx2", "content": "Raining" },
        { "role": "tool", "tool_call_id": "call_xxxxxxxxxxxxxxx3", "content": "Data Mining and Algorithm Engineer" },
        { "role": "tool", "tool_call_id": "call_xxxxxxxxxxxxxxx4", "content": "Product Manager" }
    ],
    "tools": [
        {
            "type": "function",
            "function": {
                "name": "getWeather",
                "description": "Get the weather conditions according to the city name",
                "parameters": {
                    "type": "object",
                    "properties": {
                        "cityName": { "type": "string", "description": "City Name" }
                    },
                    "required": ["cityName"]
                }
            }
        },
        {
            "type": "function",
            "function": {
                "name": "getPosition",
                "description": "Get the position information according to the employee name",
                "parameters": {
                    "type": "object",
                    "properties": {
                        "name": { "type": "string", "description": "Employee Name" }
                    },
                    "required": ["name"]
                }
            }
        }
    ],
    "tool_choice": "auto",
    "parallel_tool_calls": true
}'
```
At this point, the large model returns the final answer; note that finish_reason is now stop. The application side uses this field to distinguish model responses that require tool invocation from those that are the final output.
{ "choices": [ { "message": { "content": "The weather in Zhaoqing is sunny, and the weather in Shenzhen is raining. Arain's job information is data mining and algorithm engineer, and Judy's job information is product manager.", "role": "assistant" }, "finish_reason": "stop", "index": 0, "logprobs": null } ], "object": "chat.completion", "usage": { "prompt_tokens": 393, "completion_tokens": 37, "total_tokens": 430, "prompt_tokens_details": { "cached_tokens": 0 } }, "created": 1748011592, "system_fingerprint": null, "model": "qwen-plus", "id": "xxxxx-xxxxxx-xxxx-xxxx-xxxx-xxxxxxx"}
At this point, we should understand the overall flow and underlying principles of using MCP and Tool Calling in a large model application. In fact, large model applications, Tool Calling, and MCP services are not as magical as many imagine. As long as you analyze their principles by following the call flow, you will find that AI applications are, in the end, a combination and optimization of algorithms and engineering.