Deep analysis of MCP: When AI protocols meet bad engineering practices

An in-depth look at the MCP protocol reveals the shortcomings of the AI giants in engineering practice.
Core content:
1. What MCP is and why standardizing how applications talk to LLMs matters
2. MCP's sudden rise to popularity and its impact on the AI field
3. An analysis of the complexity and problems of MCP's three transports
Yesterday, while building an LLM-based coding assistant for my team, I ran into a key question: how do we let the LLM interact with our code base, database, and development tools? Digging into MCP (Model Context Protocol), the protocol led by Anthropic, I couldn't shake a nagging confusion: why are AI giants that can pour billions of dollars into model training so sloppy in their engineering practice?
What is MCP? Why is it suddenly so popular?
MCP is essentially an open protocol that standardizes how applications provide context to LLMs. In Anthropic's words, it is like a "USB-C port" for AI applications: a standardized way to connect AI models to different data sources and tools.
Over the past month, MCP has taken off, becoming the key technology that lets an LLM act as an "agent" and interact with the world. Meanwhile, IBM launched the Agent Communication Protocol (ACP), and Google followed closely with Agent2Agent (A2A). Major vendors are racing to build on MCP, and new MCP server and client implementations appear every day, catalogued on sites such as mcp.so and pulsemcp.com.
MCP Core: Simple JSON-RPC Protocol
Essentially, MCP is a JSON-RPC protocol with predefined methods/endpoints designed to work with LLMs. That part of the design is simple and clear; the real problems lie in the transport layer.
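To make that concrete, here is what a single exchange looks like on the wire. This is a minimal sketch based on the public spec's tools/call method; the get_weather tool and its arguments are made up for illustration:
// A client request on the wire: an ordinary JSON-RPC call to a tool
{ "jsonrpc": "2.0", "id": 1, "method": "tools/call",
  "params": { "name": "get_weather", "arguments": { "city": "Berlin" } } }
// The server's answer: an ordinary JSON-RPC response
{ "jsonrpc": "2.0", "id": 1,
  "result": { "content": [{ "type": "text", "text": "18°C, cloudy" }] } }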
Three transports: each more complex than the last
MCP supports three main transports:
stdio: simple but limited
The stdio transport is straightforward: start a local MCP server, connect to its stdout and stdin pipes, start exchanging JSON messages, and reserve stderr for logging. Although this approach breaks the Unix/Linux pipe paradigm, it is simple, easy to understand, and works directly on every operating system.
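A minimal sketch of driving a stdio server from Node.js (the my-mcp-server command is a placeholder, and the protocol's initialize handshake is skipped for brevity):
// Sketch: talking to a local MCP server over stdio from Node.js
const { spawn } = require('node:child_process');

const server = spawn('my-mcp-server', [], { stdio: ['pipe', 'pipe', 'inherit'] });

// stdout carries the server's JSON-RPC messages, one per line
// (for brevity this assumes each chunk contains whole lines)
server.stdout.setEncoding('utf8');
server.stdout.on('data', (chunk) => {
  for (const line of chunk.split('\n').filter(Boolean)) {
    console.log('from server:', JSON.parse(line));
  }
});

// stdin carries our requests; stderr (inherited above) stays free for logging
server.stdin.write(JSON.stringify({ jsonrpc: '2.0', id: 1, method: 'tools/list' }) + '\n');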
HTTP+SSE and "Streamable HTTP": A Case of Over-Design
HTTP transport is where the headaches begin. In the HTTP+SSE mode, to achieve full-duplex communication the client first opens an SSE session for reading (e.g. GET /sse). The first event on that stream provides a URL (e.g. POST /a-endpoint?session-id=1234) to which the client then sends all of its write requests.
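From the client's point of view the dance looks roughly like this (a sketch; the endpoint event name comes from the original HTTP+SSE transport spec, and the URLs are placeholders):
// Sketch: the HTTP+SSE transport from the client's point of view
const origin = 'http://mcp-server.example.com';
const events = new EventSource(origin + '/sse');

let postUrl = null;
// The first event names the URL the client must write to
events.addEventListener('endpoint', (e) => {
  postUrl = new URL(e.data, origin).href; // e.g. /a-endpoint?session-id=1234
});

// All server-to-client traffic arrives as SSE events...
events.onmessage = (e) => console.log('from server:', JSON.parse(e.data));

// ...while every client-to-server message is a separate POST
function send(rpc) {
  return fetch(postUrl, {
    method: 'POST',
    headers: { 'content-type': 'application/json' },
    body: JSON.stringify(rpc),
  });
}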
Even more convoluted is the so-called "Streamable HTTP" mode, which tries to improve on HTTP+SSE but in practice adds even more complexity:
There are three ways to create a new session:
• an empty GET request
• an empty POST request
• a POST request containing an RPC call
There are four ways to open an SSE connection:
• a GET request that initializes a session
• a GET request that joins an earlier session
• a POST request that initializes a session
• a POST request whose contained requests are answered over the SSE it opens
Requests may be answered in three different ways:
• as the HTTP response to the POST carrying the RPC call
• as an event on the SSE stream opened by that POST
• as an event on any SSE stream opened earlier
This combinatorial complexity creates serious problems. Just look at what a server has to do to tell the session-creation variants apart:
// Example: handling the different kinds of session-creation requests
function handleRequest(req) {
  if (req.method === 'GET' && !req.headers['mcp-session-id']) {
    // Empty GET: create a new session and return an SSE stream
    return createNewSession(req);
  } else if (req.method === 'POST' && !req.headers['mcp-session-id']) {
    if (req.body && Object.keys(req.body).length > 0) {
      // POST with a body: create a session and handle the RPC call it carries
      return createSessionAndHandleRPC(req);
    } else {
      // Empty POST: create an empty session
      return createNewSession(req);
    }
  } else if (req.headers['mcp-session-id']) {
    // A session id is present: route to the existing session
    return handleExistingSession(req);
  }
}
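And that is only the server's side of the problem. A client must in turn be prepared for its answer to arrive over any of the three channels. Here is a sketch of the branching a fetch-based client needs; readRpcResponseFromSse is a hypothetical helper that would parse the SSE stream:
// Sketch: a client must branch on how the server chose to answer
async function sendRpc(url, rpc, sessionId) {
  const res = await fetch(url, {
    method: 'POST',
    headers: {
      'content-type': 'application/json',
      'accept': 'application/json, text/event-stream',
      ...(sessionId ? { 'mcp-session-id': sessionId } : {}),
    },
    body: JSON.stringify(rpc),
  });
  const type = res.headers.get('content-type') || '';
  if (type.includes('application/json')) {
    // Way 1: the answer is the HTTP response to this POST
    return res.json();
  }
  if (type.includes('text/event-stream')) {
    // Way 2: the answer arrives as an event on the SSE this POST opened
    return readRpcResponseFromSse(res.body); // hypothetical helper
  }
  // Way 3: the server merely accepted the request; the answer will arrive
  // on a previously opened SSE stream, so a separate listener must
  // correlate it with this request by its JSON-RPC id
  return null;
}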
Why not just use WebSockets?
Faced with this overly complex design, an obvious question is: why not just use WebSockets[1]?
Compare the transports and WebSockets stand out as the natural choice for implementing stdio-like behavior over HTTP (see the server sketch after this list):
1. stdio has environment variables; HTTP has HTTP headers
2. stdio has socket-like input and output streams; HTTP has WebSockets
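Here is roughly what the server side could look like, using the popular ws npm package. This is a sketch of the suggestion above, not anything in the official spec: one connection is one session, and everything flows over a single full-duplex stream.
// Sketch: an MCP-style server over WebSockets, using the 'ws' npm package
const { WebSocketServer } = require('ws');

const wss = new WebSocketServer({ port: 8080 });
wss.on('connection', (socket) => {
  // One connection = one session: no session ids, no resumption matrix
  socket.on('message', (data) => {
    const rpc = JSON.parse(data.toString());
    // Dispatch the JSON-RPC call and reply on the very same stream
    socket.send(JSON.stringify({
      jsonrpc: '2.0',
      id: rpc.id,
      result: { ok: true }, // placeholder result for illustration
    }));
  });
});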
Security risks of MCP
The design of "Streamable HTTP" introduces several security issues:
1. State management vulnerabilities: juggling session state across different connection types is complex and can lead to session hijacking, replay attacks, or DoS attacks
2. Increased attack surface: multiple entry points for session creation and SSE connections expand the attack surface
3. Obfuscation: the many ways to create sessions and deliver responses can be used to hide malicious activity
Practical advice: How to use MCP correctly in your projects
If you decide to use MCP in your project, I recommend:
1. Use the stdio transport whenever possible: for local applications it is the simplest and most reliable option
2. Consider a custom transport layer: the MCP specification explicitly allows clients and servers to implement additional custom transport mechanisms for specific needs
3. Prefer WebSockets: if HTTP transport is required, consider WebSockets instead of the official SSE or Streamable HTTP designs
4. Simplify authorization: pick the authorization method your situation actually needs instead of binding yourself to the full OAuth2 specification
Here is a simple sketch of an MCP client speaking JSON-RPC over WebSockets:
// Simplified example of an MCP client over WebSockets
const ws = new WebSocket('ws://mcp-server.example.com/mcp');

ws.onopen = () => {
  // Send an MCP request (resources/read is the spec's method for
  // reading a resource by URI)
  ws.send(JSON.stringify({
    jsonrpc: '2.0',
    id: '1',
    method: 'resources/read',
    params: {
      uri: 'file:///path/to/file.txt'
    }
  }));
};

ws.onmessage = (event) => {
  const response = JSON.parse(event.data);
  console.log('MCP response received:', response);
};
My Conclusion
As a full-stack architect, I see MCP as a valuable attempt, but its implementation details expose the AI industry's weaknesses in software engineering practice[1]. Rather than wrestling with an overly complex transport protocol, we should focus on MCP's core value: a standardized way for AI models to interact with applications.
In real projects, we should optimize for the common case rather than the edge case. That means choosing a simpler, more mature technology stack instead of blindly following the latest complicated design.
I hope future revisions of MCP will improve the transport layer and adopt designs more in line with engineering best practice. Until then, think critically and choose the technology path that best fits your project.