On the MCP protocol: how to ease our anxiety about large-model "data security"

How does the MCP protocol help ensure data security for large models and reduce corporate anxiety?
Core content:
1. Introduction to the MCP protocol and its role in data security
2. Transport layer security enhancement and session ID encryption management
3. Dynamic permission control, data desensitization and user authorization audit
A question I have been hearing often lately:
"Our customers want to use large models to analyze sales data, but they are afraid of data leakage. What should we do?" This may be the most common soul-searching question every pre-sales engineer faces.
Large models must be "fed" data to work, yet what enterprises fear most is losing control once the data leaves their systems. How do we resolve this "data security" anxiety around large models?
The MCP protocol (Model Context Protocol) is like installing a "smart safety socket" for the large model: data can be called safely and controlled precisely. On March 26, Anthropic released the latest revision of the MCP protocol; the previous one dated from November 2024.
Of course, it is only a protocol; true data security is achieved only when everyone abides by it. OpenAI announced on March 27 that its Agents SDK officially supports MCP. (Altman: if you can't beat them, join them)
Official SDKs in several languages support implementing MCP, including Python, TypeScript, Java, Kotlin, and C#.
Transport layer security enhancement
The core update of version 0326 is the Streamable HTTP transport, which secures data in transit through the following design.
Session ID encryption management
Each communication session generates a unique Mcp-Session-Id, produced with a cryptographic algorithm (for example, the Chinese national standard cipher SM4), to keep cross-device, cross-network communication contextually consistent. Even when the participants sit on different devices and networks, this Mcp-Session-Id ensures that the content and state of the communication are correctly associated, with no confusion or loss of information.
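To make the idea concrete, here is a minimal sketch of session-ID issuance and per-session state tracking. It is hypothetical, not the official SDK: it uses Python's stdlib secrets module as a stand-in for the SM4-based generation mentioned above (SM4 would require a third-party library), and the SessionStore class and its method names are illustrative.

```python
import secrets


def new_session_id() -> str:
    """Generate a cryptographically random Mcp-Session-Id.

    Stand-in for the SM4-based generation described in the article;
    secrets.token_hex gives 128 bits of CSPRNG-backed randomness.
    """
    return secrets.token_hex(16)  # 32 hex characters


class SessionStore:
    """Associate each session ID with its own conversation state,
    so cross-device messages can never be mixed up (illustrative)."""

    def __init__(self):
        self._sessions = {}

    def create(self) -> str:
        sid = new_session_id()
        self._sessions[sid] = {"events": []}
        return sid

    def append(self, sid: str, event) -> None:
        # Unknown IDs are rejected, so state never crosses sessions.
        if sid not in self._sessions:
            raise KeyError("unknown Mcp-Session-Id")
        self._sessions[sid]["events"].append(event)
```

The point of the sketch is the invariant, not the crypto: every event is keyed to exactly one session ID, which is what keeps context consistent across devices and networks.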
End-to-end data integrity assurance
A Last-Event-ID disconnection-retransmission strategy was added: when network fluctuation drops the connection, the client reconnects carrying the breakpoint ID, and the server automatically resends any undelivered data. Experimental data shows the loss rate in a 5G edge-computing scenario fell from 15% in the old version to 0.3%. All communication defaults to HTTPS or a customized encryption protocol (for example, one combined with the Chinese national cryptographic algorithms) to prevent man-in-the-middle attacks.
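The retransmission strategy can be sketched as a server-side replay buffer. This is a simplified illustration, not the protocol's actual data structures: the server numbers every event, and on reconnect the client presents the last ID it received so only the missed events are resent.

```python
class EventReplayBuffer:
    """Server-side buffer supporting Last-Event-ID resumption
    (illustrative sketch)."""

    def __init__(self):
        self._events = []   # list of (event_id, data)
        self._next_id = 1

    def publish(self, data) -> int:
        """Assign the next monotonically increasing ID to an event."""
        eid = self._next_id
        self._events.append((eid, data))
        self._next_id += 1
        return eid

    def resume(self, last_event_id: int):
        """Return only the events the client has not yet seen,
        i.e. everything published after last_event_id."""
        return [(i, d) for i, d in self._events if i > last_event_id]
```

Because IDs are monotonic, a reconnecting client that reports ID 1 receives exactly the events it missed and nothing twice, which is where the drop in data-loss rate comes from.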
Streaming batch optimization
A single HTTP POST request can trigger multiple batches of SSE responses, with packets encrypted using SM4; this suits high-frequency financial trading scenarios, compressing latency below 50 ms while avoiding plaintext transmission. The old HTTP+SSE interface is still retained, so developers can migrate gradually and enterprises can keep transformation costs down.
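To show what "one POST, many SSE batches" looks like on the wire, here is a minimal sketch of the SSE framing. The helper and message shapes are hypothetical, not the official SDK, and encryption (e.g. SM4) would wrap the payload in a real deployment.

```python
import json


def sse_frames(messages):
    """Format a batch of JSON-RPC-style messages as SSE frames,
    as one HTTP POST response body might carry several batched
    responses (illustrative only)."""
    for i, msg in enumerate(messages, start=1):
        # Each frame carries an id line (which also enables
        # Last-Event-ID resumption) and a data line with the
        # serialized message, terminated by a blank line.
        yield f"id: {i}\ndata: {json.dumps(msg)}\n\n"
```

A client reads frames off the stream as they arrive, so several responses flow back over the single connection opened by the POST.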
Dynamic permission control and data desensitization
Version 0326 introduces field-level access control and sandbox isolation to minimize data exposure.
Field-level dynamic desensitization
A new data_mask field supports dynamic desensitization of sensitive data (such as ID numbers in medical records). For example, a hospital's MCP Server can be configured to "only allow the AI model to access the diagnosis-result field", while the raw data stays isolated in an encrypted sandbox at all times.
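A field-level mask might behave like the following sketch. The function and field names are illustrative (the actual data_mask semantics are defined by the server's configuration): allowed fields pass through, masked fields are redacted, and everything else is dropped entirely.

```python
def apply_data_mask(record: dict, allowed_fields: set, masked_fields: set) -> dict:
    """Return a copy of `record` exposing only allowed fields and
    redacting masked ones (hypothetical sketch of field-level
    desensitization; field names are illustrative)."""
    out = {}
    for key, value in record.items():
        if key in allowed_fields:
            out[key] = value          # passed through unchanged
        elif key in masked_fields:
            out[key] = "***"          # redacted placeholder
        # any other field is dropped entirely, never exposed
    return out
```

The model only ever sees the returned dict; the original record never leaves the server-side sandbox.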
Fine-grained resource isolation
Sensitive data is processed inside an encrypted sandbox; the large model receives only desensitized results, just as a chef receives prepared ingredients rather than entering the warehouse to rummage for raw materials.
Dynamic authorization
A temporary access token is generated for each data call and is valid only for a single session. For example, when an enterprise calls ERP data, the token expires automatically once the operation completes, preventing abuse of permissions.
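One way to picture single-session tokens is a tiny issuer with a TTL and explicit revocation on completion. This is a hypothetical sketch (class and method names are mine, not the spec's); real deployments would use the OAuth 2.1 machinery discussed next.

```python
import secrets
import time


class TokenIssuer:
    """Issue per-call tokens that expire after a short TTL and are
    revoked when the operation completes (illustrative sketch)."""

    def __init__(self, ttl_seconds: int = 300):
        self.ttl = ttl_seconds
        self._tokens = {}  # token -> expiry timestamp

    def issue(self) -> str:
        token = secrets.token_urlsafe(24)
        self._tokens[token] = time.time() + self.ttl
        return token

    def validate(self, token: str) -> bool:
        expiry = self._tokens.get(token)
        return expiry is not None and time.time() < expiry

    def revoke(self, token: str) -> None:
        # Called when the session / ERP operation finishes,
        # so a leaked token cannot be replayed afterwards.
        self._tokens.pop(token, None)
```

Short TTLs bound the damage of a leaked token in time; revocation on completion bounds it to a single operation.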
Authorization is the cornerstone of data security. The new version bases its authorization and authentication on the OAuth 2.1 flow.
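One concrete piece of OAuth 2.1 worth seeing is PKCE, which OAuth 2.1 requires for authorization-code flows: the client generates a random code_verifier and sends only its S256 hash (the code_challenge) in the authorization request, proving possession of the verifier later at token exchange.

```python
import base64
import hashlib
import secrets


def make_pkce_pair():
    """Generate an RFC 7636 PKCE code_verifier / code_challenge
    pair using the S256 method, as mandated by OAuth 2.1."""
    # 32 random bytes -> 43-char base64url verifier (padding stripped)
    verifier = base64.urlsafe_b64encode(secrets.token_bytes(32)).rstrip(b"=").decode()
    # challenge = BASE64URL(SHA-256(verifier)), also unpadded
    digest = hashlib.sha256(verifier.encode()).digest()
    challenge = base64.urlsafe_b64encode(digest).rstrip(b"=").decode()
    return verifier, challenge
```

The challenge travels in the (observable) authorization redirect, while the verifier is revealed only over the direct token-endpoint call, so an intercepted authorization code alone is useless to an attacker.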
User authorization and audit trail
Explicit User Authorization
As the core component, an MCP server can partition permissions for local and remote resources. For example, when connecting to a MySQL database, the MCP client must obtain the user's authorization (such as a click on "Allow") before it may run queries, ensuring the legitimacy of data access.
Operation log and audit trail
An MCP Server can record tool-call history (such as SQL query logs) to support later audits and abnormal-behavior analysis. For example, users can inspect Claude Desktop's query results and the original SQL statements to verify that operations were legitimate.
All data-call records must be committed on-chain, enabling regulators to quickly trace abnormal behavior: an illegal access, for example, can be pinpointed to the specific MCP Server and caller IP through the on-chain record.
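The tamper-evidence property that on-chain records provide can be sketched locally with a hash-chained append-only log: each entry embeds the hash of the previous one, so any later modification breaks the chain. This is a simplified stand-in for a blockchain, with illustrative names.

```python
import hashlib
import json


class AuditLog:
    """Append-only, hash-chained call log (simplified stand-in for
    on-chain audit records; tampering with any entry is detectable)."""

    GENESIS = "0" * 64

    def __init__(self):
        self.entries = []
        self._prev_hash = self.GENESIS

    def record(self, caller_ip: str, tool: str, detail: str) -> str:
        entry = {"caller_ip": caller_ip, "tool": tool,
                 "detail": detail, "prev": self._prev_hash}
        # Hash the canonical JSON of the entry body, chaining on prev.
        h = hashlib.sha256(json.dumps(entry, sort_keys=True).encode()).hexdigest()
        entry["hash"] = h
        self._prev_hash = h
        self.entries.append(entry)
        return h

    def verify(self) -> bool:
        """Recompute every hash; any edited entry breaks the chain."""
        prev = self.GENESIS
        for e in self.entries:
            body = {k: v for k, v in e.items() if k != "hash"}
            if body["prev"] != prev:
                return False
            if hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest() != e["hash"]:
                return False
            prev = e["hash"]
        return True
```

An auditor who trusts only the latest hash can detect any retroactive edit to caller IPs, tool names, or SQL details.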
Decentralized ecological architecture design
The most revolutionary design of MCP is to use a decentralized ecosystem to solve the contradiction between "sharing and security".
MCP supports distributed deployment, allowing anyone to host their own MCP Server and register it with an open network (such as OpenMCP.Network). This design prevents any single vendor from monopolizing the data or tool ecosystem and reduces systemic risk.
Developer Economy
Enterprises can build their own MCP Server to encapsulate core data capabilities. For example, a retail giant wrapped its "inventory forecasting algorithm" as a service: external models that call it receive only the forecast results and can never touch the raw sales data.
Smart Contract Incentives
Combined with blockchain technology, data-call counts are recorded and revenue is settled automatically. For example, a hospital sharing medical statistics can receive its share without uploading patients' original medical records.
Compliance adaptation and localization transformation
Although MCP is gaining popularity globally, it faces unique challenges in China.
Mandatory data-residency policy
All cross-border communication must be routed through domestic servers. Alibaba Cloud's "MCP China Node" already supports local storage of financial and government data, avoiding compliance risks under the Data Security Law.
Domestic interface transformation
The protocol is compatible with domestic technical standards. For example, the Baidu Map MCP service is adapted to the GB/T 35648-2017 geographic-information standard and deeply integrated with the BeiDou positioning system.
Dynamic compliance review
Developers must declare the legitimacy of their data sources and scan interfaces with automated tools to meet the requirements of the Personal Information Protection Law. For example, a privacy impact assessment (PIA) must be completed before user-behavior data may be called.
Safety Ecosystem Governance
It is foreseeable that more and more developers will flock to the MCP track, so security certification for the MCP marketplace becomes an essential part of the ecosystem.
The author believes the most important thing is to strictly control the MCP listing process.
Third-party MCP Servers must pass "Trusted Service" certification (code security scanning, penetration testing, and compliance review) before they can be listed.
Summary
Only through the five-layer protection system of "transport encryption - permission control - compliance adaptation - audit tracking - ecosystem governance" can the security risks in delivering large models be resolved.
The essence of MCP is to release the value of data inside a "safe fence". It is neither an iron cage that stifles innovation nor a loophole that lets things run wild; like the USB-C interface, it uses a standardized protocol to strike a balance between security and efficiency.