Technical Detailed Explanation: In-depth Analysis of the Working Principle of MCP, with Code Implementation (Part 2)

Written by
Silas Grey
Updated on: July 11, 2025
Recommendation

Deeply understand the working principle of MCP and improve the capabilities of AI agents.

Core content:
1. Technical in-depth analysis of the working principle of MCP
2. Code implementation: the practice of multi-server system processing different types of work
3. Why large models should not access application APIs directly, and the advantages of MCP

Yang Fangxian
Founder of 53AI/Most Valuable Expert of Tencent Cloud (TVP)


This article analyzes how MCP works from a technical perspective and demonstrates in code how an MCP multi-server system handles different types of work over different transport protocols.


01 How MCP works: A technical in-depth analysis


The working principle of MCP mainly focuses on three aspects: dynamic context window management, context embedding and compression, and stateful workflow.


1. Context Window Management: MCP uses a dynamic context window that expands with each interaction and stores the following information:


  1. User preferences (e.g., language, tone).

  2. Conversation history (previous queries/responses).

  3. Environmental data (e.g. device type, geographic location).


2. Contextual embedding and compression: To avoid data overload, MCP compresses non-critical data into an embedded form (e.g. summarizing a chat history of 10 messages into one intent vector) while retaining critical details.


3. Stateful Workflows: MCP supports multi-step workflows, where agents can:


  1. Remember past actions (e.g. "user uploaded their ID card").

  2. Adjust policies (e.g. switch from email to SMS if the user is offline).

  3. Self-correct based on feedback (e.g. “Users don’t like option A; prioritize option B”).


The following code defines a travel agent class, TravelAgent, which books flights for users based on their historical interactions and preferences. If the user prefers economy class, budget is taken into account when searching for flights; otherwise a standard flight search is performed.


# Hypothetical MCP stateful workflow (from official docs)
class TravelAgent(MCPAgent):
    def __init__(self, user_id):
        self.context = load_context(user_id)  # Load past interactions
        self.preferences = self.context.get("preferences", {})

    def book_flight(self, query):
        # Prefer budget searches for users who favor economy class
        if self.preferences.get("class") == "economy":
            return search_flights(query, budget=True)
        else:
            return search_flights(query)
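Points 1 and 2 above (the dynamic context window and context compression) can also be illustrated with a small sketch. The class below is hypothetical and not part of any MCP SDK: it keeps recent turns verbatim and compresses older ones into short summary records, which stand in for real embedding or summarization.

```python
from collections import deque

class ContextWindow:
    """Hypothetical sketch of a dynamic context window: recent messages are
    kept verbatim, older ones are compressed into short summary records."""

    def __init__(self, max_verbatim=10):
        self.max_verbatim = max_verbatim
        self.messages = deque()          # recent, uncompressed turns
        self.compressed_summary = []     # stand-ins for embedded/summarized history
        self.preferences = {}            # e.g. {"language": "en", "tone": "formal"}
        self.environment = {}            # e.g. {"device": "mobile", "geo": "US"}

    def add_message(self, role, text):
        self.messages.append({"role": role, "text": text})
        # Compress the oldest turn once the verbatim window is full.
        while len(self.messages) > self.max_verbatim:
            old = self.messages.popleft()
            self.compressed_summary.append(self._compress(old))

    @staticmethod
    def _compress(message):
        # Placeholder for real embedding/summarization: keep role + a truncated gist.
        return {"role": message["role"], "gist": message["text"][:40]}

window = ContextWindow(max_verbatim=3)
for i in range(5):
    window.add_message("user", f"message number {i}")

print(len(window.messages))            # 3 verbatim turns retained
print(len(window.compressed_summary))  # 2 older turns compressed
```

A real implementation would replace `_compress` with an embedding model or an LLM-generated summary, but the bookkeeping is the same.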


02 Why not let the big model access the application API directly?



A common question about the Model Context Protocol (MCP) is: “Why do we need a custom protocol? Can’t large language models learn to use the API themselves?”


In theory, perhaps. Most public APIs ship with documentation that explains what they do. One could feed that documentation into a large language model and have it infer the steps needed to achieve a goal.


However, in practice this approach is often inefficient. As developers committed to improving user experience, we must prioritize speed and responsiveness.


By providing tools in a format that large language models can readily consume, we can significantly reduce latency and streamline the entire process. This approach ensures a smoother interactive experience and faster results for users.


03 MCP vs Traditional API Calls



Why did MCP emerge?


Traditional application programming interfaces (APIs) are like snapshots, suitable for static tasks, while the Model Context Protocol (MCP) is like a video, fully capturing the context of user intent.


A question I often get about MCP is “How is it different from a tool call?”


Tool calling refers to the mechanism by which the large language model (LLM) calls functions to perform tasks in the real world. In this mechanism, the large language model works with a tool executor, which calls the specified tool and returns the result. The typical process is as follows:


  1. The large language model describes the tool to be called.

  2. The tool executor calls the specified tool.

  3. The executor sends the result back to the large language model.

However, this interaction usually occurs within the same environment, whether on a single server or within a specific desktop application.
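This in-process flow can be sketched in a few lines. Everything here is illustrative rather than a standard: the descriptor shape and tool names are made up, but the sketch shows the key point that the model's output and the tool executor live in the same environment.

```python
# Minimal sketch of in-process tool calling: the "LLM" emits a tool-call
# descriptor, and a local executor dispatches it and returns the result.
# The descriptor shape and tool names are illustrative, not a standard.

TOOLS = {
    "add": lambda args: args["a"] + args["b"],
    "upper": lambda args: args["text"].upper(),
}

def execute_tool_call(call: dict):
    """Tool executor: look up the named tool and run it with its arguments."""
    tool = TOOLS.get(call["name"])
    if tool is None:
        raise ValueError(f"unknown tool: {call['name']}")
    return tool(call["arguments"])

# The model's (simulated) output describing the tool it wants called:
llm_tool_call = {"name": "add", "arguments": {"a": 3, "b": 5}}
result = execute_tool_call(llm_tool_call)
print(result)  # 8
```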


In contrast, the MCP framework allows large language models to access tools from a separate process, which can be either local or hosted on a remote server. Its structure is as follows:


  1. The large language model describes the tool to be called to the MCP client.

  2. The MCP client forwards the request to the MCP server over the MCP protocol.

  3. The server's tool executor calls the tool and sends the result back to the client.

  4. The MCP client returns the result to the large language model.



The key difference is the complete decoupling of the MCP server from the client. This separation provides greater flexibility and scalability, optimizing how large language models interact with external tools.
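One minimal way to picture this decoupling: in the sketch below, the tool executor lives behind a "server" that only sees serialized messages, so it could equally run in another process or on another host. The message shapes are illustrative, and the transport boundary is simulated with a plain function call; in real MCP it would be stdio or SSE.

```python
import json

def mcp_server_handle(raw_request: str) -> str:
    """Server side: owns the tools; sees only serialized requests."""
    tools = {"multiply": lambda a: a["x"] * a["y"]}
    request = json.loads(raw_request)
    result = tools[request["tool"]](request["arguments"])
    return json.dumps({"result": result})

def mcp_client_call(tool: str, arguments: dict):
    """Client side: serializes the call, crosses the transport boundary,
    and deserializes the response. Here the 'transport' is a function call;
    in real MCP it would be stdio or a network connection."""
    raw = json.dumps({"tool": tool, "arguments": arguments})
    return json.loads(mcp_server_handle(raw))["result"]

print(mcp_client_call("multiply", {"x": 8, "y": 12}))  # 96
```

Because client and server communicate only through serialized messages, either side can be swapped out or moved without the other noticing.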


04 Why MCP is revolutionary for intelligent agents


Intelligent agent frameworks require AI not only to respond but also to act autonomously. The importance of MCP is reflected in the following aspects:


1. Achieving true autonomy. Agents are able to:


  1. Make decisions based on historical data (e.g., a medical agent can call up a patient’s list of allergies).

  2. Chain tasks without human intervention (e.g., completing the research→draft→edit→publish process for a blog post).


2. Collaborative intelligence. MCP allows agents to share contextual information with:


  1. Other agents (e.g., a customer service bot that forwards inquiries to a human agent).

  2. External tools (e.g., incorporating real-time stock data into financial advisor responses).


3. Ethical standards protection


  1. Auditability: Full contextual history helps track down biased or inaccurate outputs.

  2. Privacy protection: Sensitive data (such as medical records) is stored separately.


Without MCP, the agent would lack continuity, like a chef forgetting a step in a recipe midway through cooking.


4. Achieving long-term autonomy


  1. Persistent memory: agents can remember user preferences.

  2. Goal chaining: perform multi-step tasks (e.g., "research → negotiate → book business travel").


05 Connection Lifecycle in MCP


The connection lifecycle in MCP is critical to managing the states and transitions of interactions between clients and servers, ensuring stable communication and proper functionality throughout the process.


This structured handling of components in MCP provides a clear framework for efficient communication and integration, helping large language model (LLM) applications flourish.



1. Initialization: During the initialization phase, the following steps are performed between the server and the client:


  1. The client sends an initialization request containing the protocol version and its capabilities.

  2. The server replies with its own protocol version and capabilities.

  3. The client sends an initialization notification to confirm that the connection is successfully established.

  4. At this point the connection is ready and normal message exchange can begin.


2. Message Exchange: After the initialization phase, MCP supports the following communication modes:


  1. Request-Response: Either the client or the server can send a request, and the other side will respond.

  2. Notifications: Either party can send a one-way message without waiting for a reply.


3. Termination: Either party can terminate the connection in the following situations:


  1. Normal closure: achieved by calling the close() method.

  2. Transport disconnect: occurs when the communication channel is lost.

  3. Error conditions: Encountering an error may also cause the connection to terminate.
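The initialization handshake in step 1 can be sketched as JSON-RPC 2.0 messages. The shapes below follow the MCP specification in broad strokes, but exact capability contents vary by protocol version and SDK, so treat them as illustrative rather than authoritative.

```python
# Approximate JSON-RPC shapes for the MCP initialization handshake.
# Field names follow the MCP specification, but capability contents
# vary by protocol version and SDK.

initialize_request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "initialize",
    "params": {
        "protocolVersion": "2024-11-05",
        "capabilities": {"sampling": {}},
        "clientInfo": {"name": "example-client", "version": "0.1.0"},
    },
}

initialize_response = {
    "jsonrpc": "2.0",
    "id": 1,
    "result": {
        "protocolVersion": "2024-11-05",
        "capabilities": {"tools": {}},
        "serverInfo": {"name": "example-server", "version": "0.1.0"},
    },
}

# One-way confirmation from the client; note the missing "id" (a notification).
initialized_notification = {
    "jsonrpc": "2.0",
    "method": "notifications/initialized",
}
```

After the notification is sent, the connection is ready and normal request/response and notification traffic can begin.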


Error handling


MCP defines a set of standard error codes to efficiently handle problems that may arise.


The following code defines an enumeration type ErrorCode, which is a set of standard JSON-RPC error codes. When encountering corresponding problems in the MCP system, these error codes can be used to accurately represent the error type, thereby facilitating error handling and debugging.


enum ErrorCode {
  // Standard JSON-RPC error codes
  ParseError = -32700,
  InvalidRequest = -32600,
  MethodNotFound = -32601,
  InvalidParams = -32602,
  InternalError = -32603
}


Additionally, the SDK and applications can define their own custom error codes, starting from -32000.
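In practice these codes travel inside JSON-RPC error responses. The helper below is an illustrative sketch, not part of the MCP SDK; the custom code shown is an invented example of an application-defined code.

```python
METHOD_NOT_FOUND = -32601     # standard JSON-RPC code from the enum above
CUSTOM_TOOL_TIMEOUT = -32000  # hypothetical application-defined code

def error_response(request_id, code, message):
    """Build a JSON-RPC 2.0 error object for a failed request."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "error": {"code": code, "message": message},
    }

resp = error_response(7, METHOD_NOT_FOUND, "Method 'tools/cal' not found")
print(resp["error"]["code"])  # -32601
```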


Error Propagation:


Errors are communicated via:


  1. Error response: returned in response to a problematic request.

  2. Error Events: Triggered on transports to notify of errors.

  3. Protocol-level error handler: manages errors at the MCP level.


This structured lifecycle ensures reliable and efficient communication between client and server while gracefully handling errors when they occur.


06 Code Implementation


Let's take a look at a representative case that demonstrates a complex multi-server setup that handles different types of work through different transport protocols.


Here is a visual representation of the system:





1. Install required dependencies


pip install mcp httpx langchain langchain-core langchain-community langchain-groq langchain-ollama langchain_mcp_adapters


2. Set the Groq API key in the .env file


import os
from dotenv import load_dotenv

load_dotenv()


3. Create a server


3.1 Math server (math_server.py): Use the FastMCP class from the mcp.server.fastmcp module to create a server named "Math", and define two tool functions, add and multiply, for adding and multiplying two integers. Finally, run the server via mcp.run(transport="stdio").


# math_server.py
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("Math")

@mcp.tool()
def add(a: int, b: int) -> int:
    """Add two numbers"""
    return a + b

@mcp.tool()
def multiply(a: int, b: int) -> int:
    """Multiply two numbers"""
    return a * b

if __name__ == "__main__":
    mcp.run(transport="stdio")


3.2 Weather server (weather.py): Create a server named "weather" based on the FastMCP class, and set the base URL and user agent for the National Weather Service (NWS) API. Define the asynchronous function make_nws_request to send requests to the NWS API and handle errors, and the format_alert function to format alert features into readable strings. Also define two tool functions, get_alerts and get_forecast, which obtain weather alerts for a specified US state and the weather forecast for a specified latitude/longitude, respectively. Finally, run the server via mcp.run(transport='sse').


# weather.py
from typing import Any

import httpx
from mcp.server.fastmcp import FastMCP

# Initialize FastMCP server
mcp = FastMCP("weather")

# Constants
NWS_API_BASE = "https://api.weather.gov"
USER_AGENT = "weather-app/1.0"

async def make_nws_request(url: str) -> dict[str, Any] | None:
    """Make a request to the NWS API with proper error handling."""
    headers = {"User-Agent": USER_AGENT, "Accept": "application/geo+json"}
    async with httpx.AsyncClient() as client:
        try:
            response = await client.get(url, headers=headers, timeout=30.0)
            response.raise_for_status()
            return response.json()
        except Exception:
            return None

def format_alert(feature: dict) -> str:
    """Format an alert feature into a readable string."""
    props = feature["properties"]
    return f"""Event: {props.get('event', 'Unknown')}
Area: {props.get('areaDesc', 'Unknown')}
Severity: {props.get('severity', 'Unknown')}
Description: {props.get('description', 'No description available')}
Instructions: {props.get('instruction', 'No specific instructions provided')}"""

@mcp.tool()
async def get_alerts(state: str) -> str:
    """Get weather alerts for a US state.

    Args:
        state: Two-letter US state code (e.g. CA, NY)
    """
    url = f"{NWS_API_BASE}/alerts/active/area/{state}"
    data = await make_nws_request(url)
    if not data or "features" not in data:
        return "Unable to fetch alerts or no alerts found."
    if not data["features"]:
        return "No active alerts for this state."
    alerts = [format_alert(feature) for feature in data["features"]]
    return "\n---\n".join(alerts)

@mcp.tool()
async def get_forecast(latitude: float, longitude: float) -> str:
    """Get weather forecast for a location.

    Args:
        latitude: Latitude of the location
        longitude: Longitude of the location
    """
    # First get the forecast grid endpoint
    points_url = f"{NWS_API_BASE}/points/{latitude},{longitude}"
    points_data = await make_nws_request(points_url)
    if not points_data:
        return "Unable to fetch forecast data for this location."

    # Get the forecast URL from the points response
    forecast_url = points_data["properties"]["forecast"]
    forecast_data = await make_nws_request(forecast_url)
    if not forecast_data:
        return "Unable to fetch detailed forecast."

    # Format the periods into a readable forecast
    periods = forecast_data["properties"]["periods"]
    forecasts = []
    for period in periods[:5]:  # Only show next 5 periods
        forecast = f"""{period['name']}:
Temperature: {period['temperature']}°{period['temperatureUnit']}
Wind: {period['windSpeed']} {period['windDirection']}
Forecast: {period['detailedForecast']}"""
        forecasts.append(forecast)
    return "\n---\n".join(forecasts)

if __name__ == "__main__":
    # Initialize and run the server
    mcp.run(transport='sse')


4. Create the client (langchain_mcp_multiserver.py):


Load environment variables and initialize the ChatOllama model. Create a MultiServerMCPClient instance with connection information for the "weather" and "math" servers. Define an asynchronous function run_app that uses create_react_agent to create an agent, calls agent.ainvoke to handle the user's question, and prints the response content. Finally, set the user question in the if __name__ == "__main__" block and run run_app.


# langchain_mcp_multiserver.py
import asyncio

from dotenv import load_dotenv
from langchain_mcp_adapters.client import MultiServerMCPClient
from langchain_ollama import ChatOllama
from langgraph.prebuilt import create_react_agent

load_dotenv()

# from langchain_groq import ChatGroq
# model = ChatGroq(model="llama-3.3-70b-versatile", temperature=0.5)
model = ChatOllama(model="llama3.2:1b", temperature=0.0, num_predict=500)

async def run_app(user_question):
    async with MultiServerMCPClient(
        {
            "weather": {
                "url": "http://localhost:8000/sse",
                "transport": "sse",
            },
            "math": {
                "command": "python",
                # Make sure to update to the full absolute path to your math_server.py file
                "args": ["math_server.py"],
                "transport": "stdio",
            },
        }
    ) as client:
        agent = create_react_agent(model, client.get_tools())
        agent_response = await agent.ainvoke({"messages": user_question})
        # (For token-by-token output, agent.astream(...) can be used instead of ainvoke.)
        print(agent_response["messages"][-1].content)
        return agent_response["messages"][-1].content

if __name__ == "__main__":
    # user_question = "what is the weather in california?"
    # user_question = "what's (3 + 5) x 12?"
    # user_question = "what's the weather in seattle?"
    user_question = "what's the weather in NYC?"
    response = asyncio.run(run_app(user_question=user_question))
    print(response)


Before calling the client, make sure the weather server is up and running.


python weather.py


Calling the client


python langchain_mcp_multiserver.py


Query: "What is (3 + 5) x 12?"

Response: The result of (3 + 5) is 8, and 8 x 12 is 96.


Query: "What's the weather like in New York City?"

Response:

It appears you've provided a list of weather alerts from the National Weather Service (NWS) for various regions in New York State, Vermont, and parts of Massachusetts. Here's a breakdown of what each alert is saying:

Flooding Alerts

* The NWS has issued several flood watches across New York State, including: Northern St. Lawrence; Northern Franklin; Eastern Clinton; Southeastern St. Lawrence; Southern Franklin; Western Clinton; Western Essex; Southwestern St. Lawrence; Grand Isle; Western Franklin; Orleans; Essex; Western Chittenden; Lamoille; Caledonia; Washington; Western Addison; Orange; Northern Herkimer; Hamilton; Southern Herkimer; Southern Fulton; Montgomery; Northern Saratoga; Northern Warren; Northern Washington; Northern Fulton; Southeast Warren; Southern Washington; Bennington; Western Windham; Eastern Windham.
* The NWS has also issued a flood watch for parts of northern New York and northern and central Vermont.

Ice Jam Alerts

* The NWS has warned about the possibility of ice jams in several areas, including: Bennington, Western Windham, and Eastern Windham; southern Vermont (Bennington and Windham Counties); central New York (Herkimer County); and northern New York (Hamilton, Montgomery, Fulton, Herkimer, Warren, and Washington Counties).

Other Alerts

* The NWS has issued several warnings about heavy rainfall and snowmelt leading to minor river flooding.
* There are also alerts for isolated ice jams that could further increase the flood risk.

It's essential to stay informed about weather conditions in your area and follow the instructions of local authorities. If you're planning outdoor activities, be prepared for changing conditions and take necessary precautions to stay safe.


The MCP client is thus able to route each question to the appropriate server.


If there are any errors in this analysis, please feel free to point them out. Readers interested in MCP are also welcome to contact me privately to join the MCP & Agent exchange group (note "MCP").


Last but not least


The MCP implementation above demonstrates a powerful, scalable, and flexible architecture for building complex AI applications. Its modular design and support for multiple transport protocols give AI agents better options, especially for multimodal AI applications, complex workflow orchestration, distributed AI systems, and real-time data processing.


For intelligent agents, proactive actions are what matter most, not reactive actions.