Plain-Language MCP: Dismantling Another Hot Concept in the AI Field

Written by
Audrey Miles
Updated on: June 9, 2025
Recommendation

Explore MCP, a new hot spot in the AI field, and see through the technological essence and future trends behind it. Core content: 1. What the MCP concept is and where AI products are heading; 2. The flaws of language models and ways to enhance their capabilities; 3. How agents operate, and what permissions they hold, in the virtual world.

 
Yang Fangxian, 53A founder, Tencent Cloud TVP (Most Valuable Expert)

Have you heard of MCP (Model Context Protocol)? This was my experience:

  • I noticed the whole internet discussing MCP;
  • I searched for tutorials, only to be overwhelmed by terms like "server" and "client";
  • I closed the page, none the wiser.

If you don't care about technical details but want to understand the current state and direction of AI product development, read this article to understand why something called MCP exists. Here is what I will try to do:

  • Avoid jargon, or explain it fully when it is unavoidable;
  • Avoid strained analogies: comparing MCP to a kitchen or to USB Type-C does not actually help you understand it;
  • Explore the essence and prospects of this new concept.

MCP background

Flaws of language models

You like to chat with AI very much. One day, you asked DeepSeek:

How many letters "r" are there in the word Strawberry?

Don't underestimate this question: not long ago, most language models could not answer it correctly. If the technology cannot be improved for now, must the problem remain? Not necessarily. A question like this can be answered with one line of code, so if the AI could write that code itself and run it, wouldn't the answer just fall out?
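The one line of code in question might look like this, using Python as an illustration:

```python
# Counting letters is trivial for code, but historically tricky for
# token-based language models, which do not "see" individual letters.
print("strawberry".count("r"))  # 3
```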

It can be seen that allowing AI to call external tools (such as running code) is one of the ways to enhance the model's capabilities at this stage.

In addition to tools (Tools), another "plug-in" that can arm an AI is resources (Resources).

  • Real-time resources. If you ask how tariffs between China and the United States have changed, the AI has to read news and documents;
  • Personalized resources. You dozed off during a meeting, but if your office software has built-in AI, it can summarize the meeting minutes for you.

This requires RAG (Retrieval-Augmented Generation). The most common form of RAG is, of course, web search.
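As a minimal sketch of the idea (not any particular product's implementation), RAG boils down to fetching relevant text first and stuffing it into the prompt. Here the "document store" and naive keyword retrieval are toy stand-ins; real systems use embeddings and search engines:

```python
# Toy RAG sketch: keyword retrieval over an in-memory document store,
# then building an augmented prompt for the language model.
DOCUMENTS = [
    "2025-05: Tariff schedules between the two countries were revised again.",
    "Meeting minutes: the Q3 roadmap was approved.",
]

def retrieve(query: str) -> list[str]:
    """Return documents sharing at least one word with the query."""
    words = set(query.lower().split())
    return [d for d in DOCUMENTS if words & set(d.lower().split())]

def build_prompt(query: str) -> str:
    context = "\n".join(retrieve(query))
    return f"Answer using this context:\n{context}\n\nQuestion: {query}"

prompt = build_prompt("How were tariff schedules changed?")
```

The model never "learns" the retrieved facts; they are simply pasted into its context for that one conversation.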


Back to the first sentence of this section, "You like to chat with AI very much." But can AI really only chat?

If you are a programmer, letting the AI read the code in your directory saves you from copying and pasting it into a chat box. Can you go further and let the AI write code directly into your editor, or even run it?

This requires the model not only to read but also to "operate". In fact, we have quietly introduced another important concept: the Agent. As the name suggests, an agent can not only chat but also has permission to act on your behalf, which is why in Chinese it goes by the cool name "intelligent agent".

The most common agents are AI programming tools such as GitHub Copilot and Cursor: just by chatting, you can have the AI write and run code for you.

So the agent is no longer all talk; it can also act. Of course, its scope of action is the virtual world: it cannot help you smash a pot over your boyfriend's head.

Summary: today's language-model-based AI is still a "caged bird". At heart it is a dialogue program sealed into its own world the moment it goes live: it learns no new knowledge and does nothing beyond dialogue. So we:

  • link external resources to broaden its horizons (RAG);
  • grant it operation permissions so it can take initiative (Agent).

Every language model is a code writer

After all this talk, doesn't the language model still only chat? With nothing but the ability to output text, how does it operate your computer?

This question also bothered me, until I arrived at a plain understanding: it all works by giving the AI a new role, that of a programmer, and a "pure writer" at that, responsible only for writing code, never for running it.

We can use a practical example to explain the meaning of this passage.

There is a weather forecasting agent that can check the weather through dialogue. I ask it, "What will the weather be in Beijing tomorrow?" It replies, "Beijing will be sunny tomorrow, 15-30 degrees." What happened behind the scenes?

  1. I enter the question into the agent (note that it has not yet reached the language model);
  2. The agent attaches a note to my question: "There is a pre-written program; running it returns the weather forecast from the Meteorological Bureau";
  3. My question, along with this extra note, is fed into the language model;
  4. The model understands my intent and also knows that a program exists for fetching a forecast;
  5. The model returns a request indicating that the program should be called, with parameters {Time: 20250530, Location: Beijing};
  6. The agent runs the program with these parameters and gets the result {Weather: Sunny, Minimum temperature: 15, Maximum temperature: 30};
  7. This result is sent back to the model together with the original question;
  8. The model polishes the question and result into an answer: "Beijing will be sunny tomorrow, 15-30 degrees.";
  9. The agent sends me this sentence.
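The nine steps above can be condensed into a few lines of Python. Everything here is mocked: the "model" is a fake that first emits a tool request and then phrases the final answer, and the forecast program returns canned data. It only illustrates the two round trips through the language model:

```python
import json

# Toy stand-in for the pre-written forecast program (step 6).
def forecast_program(time, location):
    return {"Weather": "Sunny", "Min": 15, "Max": 30}

# Fake language model: first pass emits a call request, second pass
# turns raw data into natural language (steps 4-5 and 7-8).
def model(prompt, tool_result=None):
    if tool_result is None:
        return '{"Time": "20250530", "Location": "Beijing"}'
    return (f"Beijing will be sunny tomorrow, "
            f"{tool_result['Min']}-{tool_result['Max']} degrees.")

def agent(question):
    note = "There is a pre-written program that returns the forecast."
    request = model(question + "\n" + note)      # model run #1
    args = json.loads(request)
    result = forecast_program(args["Time"], args["Location"])
    return model(question, tool_result=result)   # model run #2

print(agent("What will the weather be in Beijing tomorrow?"))
```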

In this process, the language model is run twice:

  1. Before the program runs, to understand the question and decide which program to call with which parameters;
  2. After the program finishes, to convert its output back into natural language.

This is the predecessor of MCP: Function Calling. By running the language model several times, the model not only talks with the user but also asks the agent to call programs on its behalf.

When I first figured this out, my reaction was: that's it? Then we need to revise the earlier statement "give the AI operation permissions": the agent is actually software, and it is the software that has permission to run its built-in programs according to the model's instructions. At this point it becomes hard to say whether the user is facing a language model or a new form of software with a natural-language interface. We will return to this discussion later; first, let's talk about MCP.

The value of MCP

Now, let's assume a scenario like this:

  • You are a weather forecasting service provider whose forecast accuracy beats ECMWF and kicks Microsoft Weather around (gratuitous advertising);
  • Users could see the forecast by opening an app and tapping a few times, but they prefer to get information by asking an AI.

How do you get people who ask AI about the weather to receive the forecast you provide? There are two basic approaches:

  • Negotiate with major AI vendors, demonstrate your unparalleled forecasting, and have them build your service into their products;
  • Build your own AI-powered weather app: you don't have an AI model yourself, but you can purchase one from an AI vendor.

The former is not very realistic, while the problems with the latter are:

  • Users can only get your forecast from your own software, but they are used to opening ChatGPT, not your app;
  • You have to pay for the AI vendor's services;
  • The format of AI call requests may differ between vendors, or even between versions, so you must modify your code frequently;
  • Your product is an isolated island, hard to integrate into "pan-weather" services. For example, users who want to plan an itinerary around the weather also need calendars, maps, and similar functions.

How do we solve these problems? Let's step back and look at what happened in the last "technology wave": the mobile internet.

Fifteen years ago, you were also a weather forecast service provider, hoping users would see your forecast on their phones:

  • If users held feature phones, you had to go to Nokia or Motorola and get your service built into their products;
  • If users held smartphones, you only needed to develop an app and let users who liked you install it.

What is the key to the second point? The platform. Under the AI wave, a similar beautiful vision emerges: can we make AI a platform like iOS or Android, and treat "function calling" as apps on that platform? That would solve all the problems:

  • When talking with an AI, the user chooses the AI's information sources, such as your weather forecast service;
  • You only need to optimize your forecasts, without paying for AI services or worrying about how your forecasts connect to any particular AI;
  • The AI side is responsible for integrating weather, calendar, map, and other "apps" to provide "pan-weather" services.

Principles of MCP

To realize the vision above, one core requirement is standardization, in the spirit of the old Chinese ideal "same tracks for all carriages, same script for all books". For example:

  • How does an "app" tell the software what functions it offers and what parameters they need?
  • How do we ensure that requests issued by the software can successfully call the "app"?

Standardization is exactly what a protocol is for. MCP, the Model Context Protocol, as the name suggests, specifies how these "apps" are supplied to the model as context to enhance its capabilities.

The benefit is obvious: you really don't have to write several copies of the code.

But something is still not right... So far, the only difference between MCP and function calling is that function calling is implemented separately by each AI vendor, while MCP wants to unify the world. The problem raised above remains unsolved: you still have to write all the code together, producing an "island". How do we truly decouple things technically, separating what the AI platform does from what the "app" does, so that any AI can call your forecast?


Finally, we can talk about the core concepts of MCP, which in fact we have already met:

  • The Server in MCP is responsible for providing resources and tools;
  • The Client in MCP is responsible for sending messages to the model and executing the model's call requests.

Think carefully: both of these elements were already present in function calling above. Why does MCP let us decouple them? Because of the protocol!

  • You are a weather forecast server. If every request you receive arrives in a unified format, why would you care who sent it?
  • You are an AI model. If a Client is responsible for interpreting your requests, why would you care who ultimately executes them?

In this way, each side can write its own code, and the user assembles them. The two words "server" and "client" turn out to be quite vivid: the server provides a standardized service; the client procures services on the customer's behalf. This also matches how the computing field generally understands these two concepts.

In practice, at runtime:

  • The user configures the servers and sends a question to the client;
  • The client sends the question, together with the tool and resource information from the servers, to the AI;
  • The AI tells the client which tools and resources it needs;
  • The client calls those tools and resources through the servers and sends the results to the AI;
  • The AI polishes the answer and returns it to the user through the client.
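The runtime steps above can be sketched as a greatly simplified client loop. This is illustrative only: the names are made up, and a real MCP client uses the official SDK and JSON-RPC messages rather than ad-hoc strings:

```python
import json

# Simplified sketch of an MCP-style client loop. "servers" maps tool names to
# callables; a real MCP client talks JSON-RPC to separate server processes.
def run_client(question, servers, model):
    tool_list = ", ".join(servers)                    # gather tool info
    reply = model(f"Tools: {tool_list}\nQuestion: {question}")
    try:
        call = json.loads(reply)                      # model requested a tool?
    except json.JSONDecodeError:
        return reply                                  # no tool needed
    result = servers[call["tool"]](**call["arguments"])
    return model(f"Question: {question}\nTool result: {result}")

# Fake model standing in for a real language model, for illustration only.
def fake_model(prompt):
    if "Tool result" in prompt:
        return "Beijing will be sunny tomorrow."
    return '{"tool": "weather", "arguments": {"location": "Beijing"}}'

answer = run_client("Weather in Beijing?",
                    {"weather": lambda location: "Sunny"},
                    fake_model)
```

Notice that the client never needs to know how the "weather" server works internally, only the agreed call format; that is the decoupling the protocol buys.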

At this point, as a weather forecasting provider, you only need to write the MCP server code and then advertise so that users install your MCP server into their AI client, just as you now advertise so that users install your app on their phones.

In short, making the client an independent "middleman" is the key shift from function calling to MCP: it relays messages between the AI and the servers, and a "platform" takes shape. Now we can understand this picture:

One more concept appears here: the "Host", which can be understood as the software that carries a Client. For example, the VS Code editor is a Host, and it can create a Client that follows MCP. For understanding MCP, we need not strictly distinguish Host from Client.

MCP Implementation

MCP was released late last year by the AI company Anthropic. If you hope to unify the world, you must of course lay some groundwork first: Anthropic provides a set of SDKs to help you develop MCP Clients and Servers. Let's briefly look at the server code example from the official website.

from mcp.server.fastmcp import FastMCP

mcp = FastMCP("weather")

@mcp.tool()
async def get_alerts(state: str) -> str:
    """Get weather alerts for a US state.

    Args:
        state: Two-letter US state code (e.g. CA, NY)
    """
    # Code that reads alert information from the weather service goes here
    ...

This function is the tool to be called: given a US state, it returns the weather alerts issued by the Weather Service for that state. What deserves attention is the "decorator" @mcp.tool(). It accomplishes at least two things:

  • It tells the MCP framework that this server contains a tool named get_alerts;
  • It automatically extracts the function name, parameter list, and docstring (the "documentation" inside triple quotes).

So, with one simple decorator, the Model Context is generated automatically, and the only thing left for you to do is implement the core function.
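To demystify what such a decorator extracts, here is a hypothetical sketch (not the real MCP SDK) using Python's standard inspect module; the describe_tool helper is made up for illustration:

```python
import inspect

# Hypothetical sketch of what a decorator like @mcp.tool() can extract
# automatically from a plain Python function, using only introspection.
def describe_tool(func):
    sig = inspect.signature(func)
    return {
        "name": func.__name__,                # tool name
        "parameters": list(sig.parameters),   # parameter list
        "description": inspect.getdoc(func),  # cleaned-up docstring
    }

async def get_alerts(state: str) -> str:
    """Get weather alerts for a US state."""
    ...

spec = describe_tool(get_alerts)
print(spec["name"], spec["parameters"])  # get_alerts ['state']
```

This is why good names and docstrings matter in MCP servers: they are not just comments, they become the context the model reads.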

As for MCP in action, Microsoft's recent Build 2025 conference showed an example; you can watch it at this link: https://www.bilibili.com/video/BV1KjJFz3EL4?t=3955.3. Specifically:

  • Windows supports MCP and can start an MCP Client, forming the "Host with MCP Client" described above.
  • Two servers are registered: WSL (a tool for running Linux inside Windows) and Figma (an interface design tool).
  • In a code editor, a web page is created simply by chatting with the AI:
    • a Linux environment is installed through the WSL server;
    • the AI writes a simple web page inside Linux;
    • the interface drawn in Figma is read through the Figma server and converted into code by the AI.

It is clear that the MCP concept offers great convenience to a Windows that aims to become "intelligent" at the operating-system level.

MCP limitations

MCP is not a technological breakthrough

First, two "bold claims" (which are, on reflection, obvious):

  • From function calling to MCP, there is no breakthrough: purely technically, anything MCP can do could already be done before;
  • MCP does not even leave the realm of prompt engineering; its essence of "getting the AI to write code" has not changed.

As for the first point, MCP is a specification rather than a technology. In popular terms, it merely optimizes M×N into M+N: you implement M AI integrations and N services and combine them freely, rather than writing a separate set of code for each of the M×N combinations of AIs and services.

As for the second point, so-called prompt engineering means using high-quality text input to make the AI output the desired information. Let's look at the "magic" in a client example Anthropic provides on GitHub:

system_message = (
    "You are a helpful assistant with access to these tools:\n\n"
    f"{tools_description}\n"
    "Choose the appropriate tool based on the user's question. "
    "If no tool is needed, reply directly.\n\n"
    "IMPORTANT: When you need to use a tool, you must ONLY respond with "
    "the exact JSON object format below, nothing else:\n"
    "{\n"
    '    "tool": "tool-name",\n'
    '    "arguments": {\n'
    '        "argument-name": "value"\n'
    "    }\n"
    "}\n\n"
    "After receiving a tool's response:\n"
    "1. Transform the raw data into a natural, conversational response\n"
    "2. Keep responses concise but informative\n"
    "3. Focus on the most relevant information\n"
    "4. Use appropriate context from the user's question\n"
    "5. Avoid simply repeating the raw data\n\n"
    "Please use only the tools that are explicitly defined above."
)

This prompt sent to the AI clearly reveals the essence: the Client merges the tool descriptions into text for the AI and requires the AI to output structured text in a specified form (JSON), so that pre-written code can parse it and call the right program. Notice the premise this implementation rests on: that the AI always follows our instructions to the letter. Although writing code is one of generative AI's few relatively reliable capabilities, traditional software is already riddled with bugs; inserting AI into the process will inevitably add another order of magnitude of them.
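The parsing side of that premise can be sketched in a few lines. This is a hypothetical illustration (names like get_weather are made up, not from the Anthropic example): the client must parse whatever the model emits and cope with the case where the model ignores the format instructions:

```python
import json

# Hypothetical tool registry; in a real client these calls go to MCP servers.
TOOLS = {
    "get_weather": lambda arguments: {"Weather": "Sunny", "Min": 15, "Max": 30},
}

def dispatch(model_reply):
    """Parse the JSON the model was instructed to emit and call the tool."""
    try:
        request = json.loads(model_reply)
    except json.JSONDecodeError:
        return None  # the model ignored the format: no tool call to make
    tool = TOOLS.get(request.get("tool"))
    if tool is None:
        return None  # the model invented a tool that does not exist
    return tool(request.get("arguments", {}))

reply = '{"tool": "get_weather", "arguments": {"Location": "Beijing"}}'
result = dispatch(reply)  # {"Weather": "Sunny", "Min": 15, "Max": 30}
```

The two early returns are exactly where the "AI obeys instructions" assumption can fail, which is why real clients need this defensive handling.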

What's more, everything is fine while we have only a handful of tools, but what if there are thousands, iterated many times over? Don't forget that these tool descriptions are the "Model Context", and the amount of context an AI can read at once is limited. This has prompted some thinking in the industry, such as organizing tools hierarchically, like folders.

Speaking of which, I can't help asking: MCP standardizes communication between AI and the tools of computers and the internet, all of them "silicon-based life". Why should they communicate in human language?

The underlying logic of MCP is AI-centricism

In any case, in the eyes of AI "believers", MCP is the best solution currently available, and its storyline holds together. But what if we step outside that storyline and think again?

You are still the weather forecast provider, and MCP can connect your service to all kinds of AI models... wait, why would you do that? If users go to ChatGPT or Claude for the weather, who sees the advertisements in your app? Providers like The Weather Channel, or Moji and Caiyun in China, earn revenue mainly from advertising and subscription services, and behind that revenue is traffic. If all that traffic is eaten by AI, the consequences are easy to imagine.

Notice that MCP is AI-centric: the purpose of everything is to build a more powerful AI. GitHub is willing to run MCP services because they essentially just perform Git operations for you, and all the code still lives on GitHub. But Google or Bing may have little motivation: would you rather users open your website and occasionally click a sponsored link, or have ChatGPT call your search engine and filter and summarize the results?

In fact, in China the general-purpose search engine has declined and closed ecosystems prevail: different users search for content on Douyin, WeChat official accounts, or Xiaohongshu. By the same token, will future AI applications put AI at the core and the entrance, with every resource as a backdrop, or will each independent product try to integrate AI into its own features? Who embraces whom? I suspect that, for now, most services that care about traffic and entry points will choose the latter.

So although MCP appears to save third-party developers work, it is really an "open conspiracy" that shifts pressure onto third parties, placing before every product the big question of whether to open an MCP service and contribute to the AI ecosystem. The AI field is so hot that once a concept takes off and everyone rushes in, whether the concept itself is reasonable hardly matters; what matters is not falling behind. Just like the crazy subsidies when food delivery and ride-hailing first launched: the cost of embracing AI has been lowered for now. Do you want in?

From MCP to Agent

Interestingly, since we compared MCP Servers to apps in an AI ecosystem, enterprising programmers have built many "MCP Stores" modeled on the App Store. As expected, most entries are services for documents, databases, and cloud storage. This shows that, for now, MCP mainly serves the workhorses who write code, analyze data, and produce designs every day; no new business model or product form has emerged.

Besides the reasons in the previous section, another reason for this is that AI is only relatively reliable when doing simple, repetitive work.

At this point we are no longer really discussing MCP, but AI and Agents. The so-called Agent can be understood from two perspectives:

  • From the AI-centric perspective, an Agent is an AI with plug-ins, taking action by calling external tools and resources.
  • From the traditional perspective, an Agent is software glued together by natural language. If we regard the external tools and resources themselves as the subject, then the Agent is just a new kind of service that uses the language model as its "interface".

Which perspective to choose depends on whether the software's function or the language model contributes more. For example:

  • For weather forecasting, the model is only a mouthpiece; the core is still the weather data and forecasting methods;
  • For code generation, the model's programming ability is the core, and automatically operating inside the editor is only added value (hence the AI vendor OpenAI acquiring the code-editor maker Windsurf, not the other way around).

Given AI's limited capabilities, the former may be more common for now (which is why the weather forecast is the favorite example of everyone popularizing MCP...).

However, if one day AI capabilities improve further, for example:

  • multimodal abilities advance enough to analyze weather data in depth, then an Agent acting as a forecaster is just around the corner;
  • or it can even independently organize data and develop and optimize forecast models, then a fully AI-staffed meteorological observatory could operate.

Having come this far, and considering that the Agent is still confined to the virtual world, it still needs human assistance: