Technical Detailed Explanation: In-depth Analysis of the Working Principle of MCP, with Code Implementation (Part 2)

Deeply understand the working principle of MCP and improve the capabilities of AI agents.
Core content:
1. A technical deep dive into how MCP works
2. Code implementation: a multi-server system handling different types of work
3. Why large models should not call application APIs directly, and what MCP offers instead
This article analyzes how MCP works from a technical perspective and shows, through code, how an MCP multi-server system handles different types of work over different transport protocols.
01 How MCP works: A technical deep dive
The working principle of MCP mainly focuses on three aspects: dynamic context window management, context embedding and compression, and stateful workflow.
1. Context Window Management: MCP uses a dynamic context window that expands with each interaction and stores the following information:
User preferences (e.g., language, tone).
Conversation history (previous queries/responses).
Environmental data (e.g. device type, geographic location).
2. Contextual embedding and compression: To avoid data overload, MCP compresses non-critical data into an embedded form (e.g. summarizing a chat history of 10 messages into one intent vector) while retaining critical details.
3. Stateful Workflows: MCP supports multi-step workflows, where agents can:
Remember past actions (e.g. "user uploaded their ID card").
Adjust policies (e.g. switch from email to SMS if the user is offline).
Self-correct based on feedback (e.g. “Users don’t like option A; prioritize option B”).
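The compression idea in point 2 can be pictured with a minimal sketch. This is hypothetical, not MCP's actual implementation: a real system would summarize older messages into an intent vector or model-generated summary, whereas here a placeholder string stands in for that summary.

```python
def compress_history(messages: list[str], keep_last: int = 3) -> list[str]:
    """Collapse older messages into one summary entry, keep recent ones verbatim.

    The placeholder summary stands in for a real embedding or
    model-generated summary of the compressed messages.
    """
    if len(messages) <= keep_last:
        return messages
    older, recent = messages[:-keep_last], messages[-keep_last:]
    summary = f"[summary of {len(older)} earlier messages]"
    return [summary] + recent

history = [f"message {i}" for i in range(10)]
print(compress_history(history))
# ['[summary of 7 earlier messages]', 'message 7', 'message 8', 'message 9']
```

The critical recent turns stay verbatim, so the agent keeps precise details where they matter most while the older context shrinks to a fixed-size summary.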
The following code defines a travel agent class TravelAgent, which can book flights for users based on their historical interaction information and preferences. If the user prefers economy class, the budget factor will be taken into account when searching for flights; otherwise, the flight search will be normal.
# Hypothetical MCP stateful workflow (from official docs)
class TravelAgent(MCPAgent):
    def __init__(self, user_id):
        self.context = load_context(user_id)  # Load past interactions
        self.preferences = self.context.get("preferences", {})

    def book_flight(self, query):
        # Prefer a budget search when the user favors economy class
        if self.preferences.get("class") == "economy":
            return search_flights(query, budget=True)
        return search_flights(query)
02 Why not let the big model access the application API directly?
A common question about the Model Context Protocol (MCP) is: “Why do we need a custom protocol? Can’t large language models learn to use the API themselves?”
In theory, maybe. Most public APIs come with documentation that explains what they do. One could feed that documentation into a large language model and have it infer the steps needed to achieve a goal.
In practice, however, this approach is often inefficient. Developers committed to a good user experience must prioritize speed and responsiveness.
By providing tools in a format that is easy to use with large language models, we were able to significantly reduce latency and streamline the entire process. This approach ensures a smoother interactive experience and faster results for users.
03 MCP vs Traditional API Calls
Why does MCP appear?
Traditional application programming interfaces (APIs) are like snapshots, suitable for static tasks, while the Model Context Protocol (MCP) is like a video, fully capturing the context of user intent.
A question I often get about MCP is “How is it different from a tool call?”
Tool calling refers to the mechanism by which the large language model (LLM) calls functions to perform tasks in the real world. In this mechanism, the large language model works with a tool executor, which calls the specified tool and returns the result. The typical process is as follows:
Large Language Model —(describes the tool to be called)→ Tool Executor —(sends results)→ Large Language Model
However, this interaction usually occurs within the same environment, whether on a single server or within a specific desktop application.
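This in-process tool-calling loop can be sketched as follows. The tool registry and dispatch function here are a hypothetical illustration, not any particular framework's API:

```python
# Minimal sketch of in-process tool calling: the LLM emits a structured
# tool call, and an executor in the same process runs it and returns the result.

def add(a: int, b: int) -> int:
    """Add two numbers."""
    return a + b

# Hypothetical tool registry the executor dispatches to
TOOLS = {"add": add}

def tool_executor(tool_name: str, arguments: dict):
    """Look up the named tool and execute it with the given arguments."""
    tool = TOOLS[tool_name]
    return tool(**arguments)

# Simulated LLM output: a structured tool call
llm_tool_call = {"name": "add", "arguments": {"a": 3, "b": 5}}

result = tool_executor(llm_tool_call["name"], llm_tool_call["arguments"])
print(result)  # 8
```

Everything happens in one process: the model, the registry, and the executor share the same environment, which is exactly the coupling that MCP removes.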
In contrast, the MCP framework allows large language models to access tools from a separate process, which can be either local or hosted on a remote server. Its structure is as follows:
Large Language Model ↔ MCP Client: describes the tool to be called / returns results
MCP Client ↔ MCP Server (over the MCP protocol): calls tools / sends results
MCP Server → Tool Executor: executes the requested tool
The key difference is the complete decoupling of the MCP server from the client. This separation provides greater flexibility and scalability, optimizing how large language models interact with external tools.
04 Why MCP is revolutionary for intelligent agents
The intelligent agent framework requires AI to not only respond but also act autonomously. The importance of MCP is reflected in the following aspects:
1. Achieving true autonomy. Today's agents are able to:
Make decisions based on historical data (e.g., a medical agent can call up a patient’s list of allergies).
Chain tasks without human intervention (e.g., completing the research→draft→edit→publish process for a blog post).
2. Collaborative intelligence. MCP allows agents to share contextual information with:
Other agents (e.g., a customer service bot that forwards inquiries to a human agent).
External tools (e.g., incorporating real-time stock data into financial advisor responses).
3. Ethical standards protection
Auditability: Full contextual history helps track down biased or inaccurate outputs.
Privacy protection: Sensitive data (such as medical records) is stored separately.
Without MCP, the agent would lack continuity, like a chef forgetting a step in a recipe midway through cooking.
4. Achieving long-term autonomy
Persistent memory: agents can remember user preferences.
Goal concatenation: Perform a multi-step task (e.g., “research → negotiate → book business travel”).
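Goal concatenation can be pictured as a simple pipeline in which each step consumes the previous step's output. The step functions below are hypothetical stand-ins for real tools:

```python
# Hypothetical sketch of goal concatenation: each step consumes the
# previous step's output, so the agent carries state across the chain.

def research(topic: str) -> str:
    return f"notes on {topic}"

def negotiate(notes: str) -> str:
    return f"agreed terms based on {notes}"

def book(terms: str) -> str:
    return f"booking confirmed under {terms}"

def run_chain(topic: str) -> str:
    """Run research -> negotiate -> book as one multi-step task."""
    result = topic
    for step in (research, negotiate, book):
        result = step(result)
    return result

print(run_chain("business travel"))
# booking confirmed under agreed terms based on notes on business travel
```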
05 Connection Lifecycle in MCP
The connection lifecycle in MCP is critical to managing the states and transitions of interactions between clients and servers, ensuring stable communication and proper functionality throughout the process.
This structured handling of connection states gives MCP a clear framework for efficient communication and integration, helping large language model (LLM) applications flourish.
1. Initialization: During the initialization phase, the following steps are performed between the server and the client:
The client sends an initialization request containing the protocol version and its capabilities.
The server replies with its own protocol version and capabilities.
The client sends an initialization notification to confirm that the connection is successfully established.
At this point the connection is ready and normal message exchange can begin.
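The handshake above can be sketched as JSON-RPC messages, shown here as Python dicts. The method names (`initialize`, `notifications/initialized`) follow the MCP specification; the capability and info payloads are illustrative assumptions:

```python
# Sketch of the MCP initialization handshake as JSON-RPC messages.

initialize_request = {           # 1. client -> server
    "jsonrpc": "2.0",
    "id": 1,
    "method": "initialize",
    "params": {
        "protocolVersion": "2024-11-05",
        "capabilities": {"sampling": {}},
        "clientInfo": {"name": "example-client", "version": "1.0.0"},
    },
}

initialize_response = {          # 2. server -> client (same id as the request)
    "jsonrpc": "2.0",
    "id": 1,
    "result": {
        "protocolVersion": "2024-11-05",
        "capabilities": {"tools": {}},
        "serverInfo": {"name": "example-server", "version": "1.0.0"},
    },
}

initialized_notification = {     # 3. client -> server (no id: a notification)
    "jsonrpc": "2.0",
    "method": "notifications/initialized",
}
```

Note that the final message carries no `id`: it is a one-way notification, the same pattern used for all notifications in the message-exchange phase described next.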
2. Message Exchange: After the initialization phase, MCP supports the following communication modes:
Request-Response: Either the client or the server can send a request, and the other side will respond.
Notifications: Either party can send a one-way message without waiting for a reply.
3. Termination: Either party can terminate the connection in the following situations:
Normal closure: achieved by calling the close() method.
Transport disconnect: occurs when the communication channel is lost.
Error conditions: Encountering an error may also cause the connection to terminate.
Error handling
MCP defines a set of standard error codes to efficiently handle problems that may arise.
The following code defines an enumeration type ErrorCode, which is a set of standard JSON-RPC error codes. When encountering corresponding problems in the MCP system, these error codes can be used to accurately represent the error type, thereby facilitating error handling and debugging.
enum ErrorCode {
  // Standard JSON-RPC error codes
  ParseError = -32700,
  InvalidRequest = -32600,
  MethodNotFound = -32601,
  InvalidParams = -32602,
  InternalError = -32603
}
Additionally, the SDK and applications can define their own custom error codes, starting from -32000.
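A Python sketch of the same idea: the standard JSON-RPC codes plus one hypothetical application-defined code in the custom range, and a helper that builds an error response. The `TOOL_TIMEOUT` code and the `error_response` helper are illustrative assumptions, not part of any SDK:

```python
from enum import IntEnum

class ErrorCode(IntEnum):
    # Standard JSON-RPC error codes
    PARSE_ERROR = -32700
    INVALID_REQUEST = -32600
    METHOD_NOT_FOUND = -32601
    INVALID_PARAMS = -32602
    INTERNAL_ERROR = -32603
    # Application-defined codes start at -32000 (hypothetical example)
    TOOL_TIMEOUT = -32000

def error_response(request_id, code: ErrorCode, message: str) -> dict:
    """Build a JSON-RPC error response for a failed request."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "error": {"code": int(code), "message": message},
    }

print(error_response(7, ErrorCode.METHOD_NOT_FOUND, "unknown method"))
```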
Error Propagation:
Errors are communicated via:
Error response: returned in response to a problematic request.
Error Events: Triggered on transports to notify of errors.
Protocol-level error handler: manages errors at the MCP level.
This structured lifecycle ensures reliable and efficient communication between client and server while gracefully handling errors when they occur.
06 Code Implementation
Let's take a look at a representative case that demonstrates a complex multi-server setup that handles different types of work through different transport protocols.
Here is a visual representation of the system:
1. Install required dependencies
pip install mcp httpx langchain langchain-core langchain-community langchain-groq langchain-ollama langchain_mcp_adapters
2. Set the groq api key in the .env file
import os
from dotenv import load_dotenv
load_dotenv()
3. Create a server
3.1 Math server (math_server.py): Use the FastMCP class from the mcp.server.fastmcp module to create a server named "Math", and define two tool functions, add and multiply, for adding and multiplying two integers. Finally, run the server with mcp.run(transport="stdio").
# math_server.py
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("Math")

@mcp.tool()
def add(a: int, b: int) -> int:
    """Add two numbers"""
    return a + b

@mcp.tool()
def multiply(a: int, b: int) -> int:
    """Multiply two numbers"""
    return a * b

if __name__ == "__main__":
    mcp.run(transport="stdio")
3.2 Weather server (weather.py): Create a server named "weather" with the FastMCP class, and set the base URL and user agent for the National Weather Service (NWS) API. Define the asynchronous function make_nws_request to send requests to the NWS API with error handling, and the function format_alert to format alert features into a readable string. Also define two tool functions, get_alerts and get_forecast, which fetch weather alerts for a specified US state and the weather forecast for a specified latitude/longitude, respectively. Finally, run the server with mcp.run(transport='sse').
from typing import Any
import httpx
from mcp.server.fastmcp import FastMCP

# Initialize FastMCP server
mcp = FastMCP("weather")

# Constants
NWS_API_BASE = "https://api.weather.gov"
USER_AGENT = "weather-app/1.0"

async def make_nws_request(url: str) -> dict[str, Any] | None:
    """Make a request to the NWS API with proper error handling."""
    headers = {
        "User-Agent": USER_AGENT,
        "Accept": "application/geo+json"
    }
    async with httpx.AsyncClient() as client:
        try:
            response = await client.get(url, headers=headers, timeout=30.0)
            response.raise_for_status()
            return response.json()
        except Exception:
            return None

def format_alert(feature: dict) -> str:
    """Format an alert feature into a readable string."""
    props = feature["properties"]
    return f"""Event: {props.get('event', 'Unknown')}
Area: {props.get('areaDesc', 'Unknown')}
Severity: {props.get('severity', 'Unknown')}
Description: {props.get('description', 'No description available')}
Instructions: {props.get('instruction', 'No specific instructions provided')}"""

@mcp.tool()
async def get_alerts(state: str) -> str:
    """Get weather alerts for a US state.

    Args:
        state: Two-letter US state code (e.g. CA, NY)
    """
    url = f"{NWS_API_BASE}/alerts/active/area/{state}"
    data = await make_nws_request(url)
    if not data or "features" not in data:
        return "Unable to fetch alerts or no alerts found."
    if not data["features"]:
        return "No active alerts for this state."
    alerts = [format_alert(feature) for feature in data["features"]]
    return "\n---\n".join(alerts)

@mcp.tool()
async def get_forecast(latitude: float, longitude: float) -> str:
    """Get weather forecast for a location.

    Args:
        latitude: Latitude of the location
        longitude: Longitude of the location
    """
    # First get the forecast grid endpoint
    points_url = f"{NWS_API_BASE}/points/{latitude},{longitude}"
    points_data = await make_nws_request(points_url)
    if not points_data:
        return "Unable to fetch forecast data for this location."

    # Get the forecast URL from the points response
    forecast_url = points_data["properties"]["forecast"]
    forecast_data = await make_nws_request(forecast_url)
    if not forecast_data:
        return "Unable to fetch detailed forecast."

    # Format the periods into a readable forecast
    periods = forecast_data["properties"]["periods"]
    forecasts = []
    for period in periods[:5]:  # Only show next 5 periods
        forecast = f"""{period['name']}:
Temperature: {period['temperature']}°{period['temperatureUnit']}
Wind: {period['windSpeed']} {period['windDirection']}
Forecast: {period['detailedForecast']}"""
        forecasts.append(forecast)
    return "\n---\n".join(forecasts)

if __name__ == "__main__":
    # Initialize and run the server
    mcp.run(transport='sse')
4. Create the client (langchain_mcp_multiserver.py):
Load environment variables and initialize the ChatOllama model. Define server parameters server_params and create a MultiServerMCPClient instance with connection information for the "weather" and "math" servers. Define an asynchronous function run_app, use create_react_agent to create an agent, call agent.ainvoke to handle user questions and get responses, and finally print the response content. Set the user question in the if __name__ == "__main__" code block and run the run_app function.
import asyncio
from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client
from langchain_core.messages import HumanMessage, ToolMessage, AIMessage
from langchain_mcp_adapters.tools import load_mcp_tools
from langchain_mcp_adapters.client import MultiServerMCPClient
from langgraph.prebuilt import create_react_agent
#from langchain_azure_ai.chat_models import AzureAIChatCompletionsModel
from langchain_groq import ChatGroq
from langchain_ollama import ChatOllama
from langchain_core.prompts import PromptTemplate
from dotenv import load_dotenv

load_dotenv()

#model = ChatGroq(model= , temperature=0.5)
model = ChatOllama(model="llama3.2:1b", temperature=0.0, max_new_tokens=500)

server_params = StdioServerParameters(
    command="python",
    # Make sure to update to the full absolute path to your math_server.py file
    args=["weather.py"],
)

async def run_app(user_question):
    async with MultiServerMCPClient(
        {
            "weather": {
                "url": "http://localhost:8000/sse",
                "transport": "sse",
            },
            "math": {
                "command": "python",
                # Make sure to update to the full absolute path to your math_server.py file
                "args": ["math_server.py"],
                "transport": "stdio",
            },
        }
    ) as client:
        agent = create_react_agent(model, client.get_tools())
        agent_response = await agent.ainvoke({"messages": user_question})
        print(agent_response['messages'][-1].content)
        return agent_response['messages'][-1].content

if __name__ == "__main__":
    #user_question = "what is the weather in california?"
    #user_question = "what's (3 + 5) x 12?"
    #user_question = "what's the weather in seattle?"
    user_question = "what's the weather in NYC?"
    response = asyncio.run(run_app(user_question=user_question))
    print(response)
Before calling the client, make sure the weather server is up and running.
python weather.py
Calling the client
python langchain_mcp_multiserver.py
Response to the question "what's (3 + 5) x 12?":
The result of (3 + 5) is 8, and 8 x 12 is 96.
Response to the question "what's the weather like in New York City?":
It appears you've provided a list of weather alerts from the National Weather Service (NWS) for various regions in New York State, Vermont, and parts of Massachusetts.
Here's a breakdown of what each alert is saying:
Flooding Alerts
* The NWS has issued several flood watches across New York State, including:
+ Northern St. Lawrence; Northern Franklin; Eastern Clinton; Southeastern St. Lawrence; Southern Franklin; Western Clinton; Western Essex; Southwestern St. Lawrence; Grand Isle; Western Franklin; Orleans; Essex; Western Chittenden; Lamoille; Caledonia; Washington; Western Addison; Orange;
+ Northern Herkimer; Hamilton; Southern Herkimer; Southern Fulton; Montgomery; Northern Saratoga; Northern Warren; Northern Washington; Northern Fulton; Southeast Warren; Southern Washington; Bennington; Western Windham; Eastern Windham
* The NWS has also issued a flood watch for parts of Vermont, including:
+ Northern New York and northern and central Vermont
Ice Jam Alerts
* The NWS has warned about the possibility of ice jams in several areas, including:
+ Bennington; Western Windham; Eastern Windham
+ Southern Vermont, Bennington and Windham Counties
+ Central New York, Herkimer County
+ Northern New York, Hamilton, Montgomery, Fulton, Herkimer, Warren, Washington Counties
**Other Alerts**
* The NWS has issued several warnings about heavy rainfall and snowmelt leading to minor river flooding.
* There are also alerts for isolated ice jams that could further increase the flood risk.
It's essential to stay informed about weather conditions in your area and follow the instructions of local authorities. If you're planning outdoor activities, be prepared for changing weather conditions and take necessary precautions to stay safe.
The MCP client is thus able to connect to the appropriate server based on the question.
If there are any errors in this analysis, your feedback is welcome. Readers interested in MCP are also welcome to contact me privately to join the MCP & Agent discussion group (note "MCP").
Last but not least
The MCP implementation above demonstrates a powerful, scalable, and flexible architecture for building complex AI applications. Its modular design and support for multiple transport protocols give AI agents better options, especially for multimodal AI applications, complex workflow orchestration, distributed AI systems, and real-time data processing.
For intelligent agents, what matters most is acting proactively, not merely reacting.