What risk control issues should enterprises pay attention to when accessing large models such as DeepSeek?
Updated on: July 15, 2025
Recommendation
When accessing large models such as DeepSeek, enterprises must pay close attention to risk prevention and control.
Core content:
1. The motivations and methods behind enterprises' rush to access DeepSeek
2. Analysis of risk control points under different access methods
3. Key considerations for enterprise data confidentiality and business security
Yang Fangxian
Founder of 53AI / Tencent Cloud Most Valuable Expert (TVP)
The DeepSeek craze has continued from the Spring Festival until now, wave after wave. At first it was tech media and self-media discussing DeepSeek; now many companies have joined the trend and announced that they have connected to it. Even WeChat, the "national app", along with competitors Wenxin Yiyan and Tencent Yuanbao, has announced DeepSeek integration in its own products.

The development of AI is unstoppable and the AI revolution is in full swing. Many companies want and need to adopt large-model products to improve internal efficiency or product experience. But enterprise use of large models differs from individual use: enterprises must weigh issues such as trade secrets and business security.

In this article, we will sort out the risk control issues enterprises should pay attention to when connecting to large models. This is not a technical article; it takes a business and operations perspective and uses plain language to identify the AI risks enterprises cannot ignore when actually putting AI into practice.

What potential risks will you face when accessing a large model?

Before studying large-model risk control, we must first clarify how enterprises access large models. I covered this in the earlier article "What are the methods for accessing big models in business?". Whether for DeepSeek or any other large model, there are generally five access methods: using a public platform's features directly, building agents on an agent platform, calling APIs, private local deployment, and indirect deployment through a cloud service provider. That article compares the advantages and disadvantages of each method.

Because the deployment and usage patterns differ across access methods, the risk control points differ as well.

Using platform features directly and building agents on a public agent platform offer the lowest level of data confidentiality, since everything runs on a public platform. As a copilot (assistant) for personal work this causes few problems, but feeding the company's confidential information into it can easily lead to leaks.

API calls, private local deployment, and indirect deployment through a cloud service provider involve, on the one hand, tuning and training the model and, on the other, exposing the model's output, so they carry more risks, such as training data compliance, permission control, and output content review. The specific risk types and countermeasures are elaborated below.

How to conduct enterprise risk control for large AI models

Based on when users interact with the model, we divide the risks of enterprise large-model adoption into three stages: risk control at model input, risk control during model processing, and risk control at model output.

Risk control at model input

This stage is the entry point for information into the large model. Everyday AI Q&A poses no problem; the biggest risks are leakage of sensitive business information and data compliance.

Risk point 1: Data security risk (confidential data, private data)

In day-to-day work, real business data or important trade secrets get sent to the AI out of business necessity, and along the way the company's core secrets can easily leak out. For example, the provider of the AI capability (not necessarily the model developer; it may be a third party offering AI services) may be able to view users' conversations with the AI; the model may internalize received information into its own knowledge and disclose it in a conversation with another user; or, in a "pseudo-localized" deployment, imperfect interface encryption may let attackers steal business data in transit through reverse engineering.

The remedy is data security, chiefly two measures:

1. For confidential and private data, enforce permission control at the underlying table level. The basic rule is the principle of least privilege: limit the scope of data access according to confidentiality level. For example, only people with specific permissions may create, read, update, or delete certain sensitive data, while everyone else is denied access. Control must start at the source where data is obtained.

2. Strengthen employees' confidentiality awareness and desensitize sensitive data before it reaches the AI. If certain data must be used while the AI runs, desensitize it first: for example, when the model reads a user table, blur the users' ID numbers and mobile phone numbers in advance (a minimal masking sketch follows this list).
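To make the desensitization step concrete, here is a minimal sketch, assuming regex-detectable fields such as mainland-style mobile numbers and ID numbers; the patterns and sample record are illustrative assumptions, not a complete PII solution.

```python
import re

# Illustrative patterns: 11-digit CN mobile numbers and 18-character CN ID
# numbers. A real system should use a vetted, maintained PII library.
PATTERNS = {
    "mobile": re.compile(r"\b1[3-9]\d{9}\b"),
    "id_card": re.compile(r"\b\d{17}[\dXx]\b"),
}

def desensitize(text: str) -> str:
    """Mask sensitive values, keeping a few characters for readability."""
    def mask(match: re.Match) -> str:
        value = match.group(0)
        return value[:3] + "*" * (len(value) - 7) + value[-4:]
    for pattern in PATTERNS.values():
        text = pattern.sub(mask, text)
    return text

record = "User Zhang San, phone 13812345678, ID 110101199001011234"
safe_record = desensitize(record)
print(safe_record)  # ... phone 138****5678, ID 110***********1234
# Only safe_record, never the raw record, should go into the model prompt.
```

The same idea extends to names, addresses, and account numbers; the point is that masking happens before the text ever leaves the company's boundary.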
Risk point 2: Training data compliance risk

Beyond data confidentiality, enterprises must also pay close attention to the compliance of training data. What does this mean? The data used to train large models must be legal and compliant and must not cross legal red lines, especially regarding privacy protection and intellectual property, where one careless step can put the company on the wrong side of the law. If some data is used only for training and will never serve external requests, manage its life cycle during training and destroy it promptly once it has been used.

At the same time, data quality must be strictly controlled, for the sake of the model itself. This is easy to understand: if the training data is poor, the resulting model will naturally perform poorly.

Risk control during model processing

Model processing raises many risk control issues. To some extent they are not only technical matters but also internal-control requirements of the enterprise. At this stage there are five main risk points to watch.

Risk point 1: Model ethics

Because AI training is largely a black box, we often do not know how the model processes information, so large models frequently raise issues of science-and-technology ethics. The core of model ethics is preventing the AI from "learning bad habits". For example, in a recruitment scenario, an AI may develop a bias against women because the historical data skews toward men; or, because some group is heavily disparaged on the internet, the AI may further reinforce that discrimination.

Therefore, in model training, enterprises should use diverse data to correct bias and add ethical review steps to the product development process, to keep the model from violating moral and technological ethics. After all, no one wants their own AI to become a poster child for "moral decline"... (a simple bias audit is sketched below).
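As one concrete shape such an ethical review could take, here is a hedged sketch for the recruitment example: it compares the model's pass rates across groups in an audit log and flags the model if the ratio violates the common "four-fifths" rule of thumb. The groups, data, and threshold are illustrative assumptions.

```python
from collections import defaultdict

def selection_rates(decisions: list[tuple[str, bool]]) -> dict[str, float]:
    """Compute the pass rate per group from (group, passed) records."""
    totals: dict[str, int] = defaultdict(int)
    passes: dict[str, int] = defaultdict(int)
    for group, passed in decisions:
        totals[group] += 1
        passes[group] += int(passed)
    return {g: passes[g] / totals[g] for g in totals}

def fails_four_fifths_rule(rates: dict[str, float]) -> bool:
    """Flag if any group's rate is below 80% of the highest group's rate."""
    highest = max(rates.values())
    return any(rate < 0.8 * highest for rate in rates.values())

# Illustrative audit log of the model's screening decisions: (group, passed).
decisions = ([("A", True)] * 60 + [("A", False)] * 40
             + [("B", True)] * 35 + [("B", False)] * 65)
rates = selection_rates(decisions)
print(rates)                          # {'A': 0.6, 'B': 0.35}
print(fails_four_fifths_rule(rates))  # True -> send the model back for review
```

A check like this can run against a fixed audit set each time the model or its prompts change, so bias regressions are caught before release.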
Risk point 2: Model interpretability

Model interpretability, simply put, means figuring out why the model made the decision it made.

We are all familiar with how large models are used: describe the problem and requirements to the AI, and it processes them and returns a result. That result may be reasonable or completely unexpected. For most scenarios this uncontrollability is harmless, and the model's flights of imagination can even surface unexpected inspiration.

In some scenarios, however, the uncontrollability of AI becomes an obstacle. If the model is allowed to run as a black box, nobody knows what it is "thinking"; when something goes wrong there is no way to explain it to users, and in serious cases regulators will come knocking.

Moreover, chatting with AI is only a preliminary application. For AI to unlock productivity at scale, it must be integrated into workflows. In financial planning or medical recommendations, for instance, the AI generates an answer: Why? What sources does it rely on? What is the logic, the reasoning, the chain of inference? These questions must be answerable with reasons and evidence; in other words, the model's inference results must be "explainable".

To ensure interpretability, we must not only tune the model's parameters for rigor but also intervene in the product flow: for example, force the AI's processing to be divided into explicit steps, require each step to cite all referenced materials, or attach monitoring at each step so that the root cause can be traced when problems occur (a minimal sketch of such a pipeline follows).
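To illustrate that product-flow intervention, here is a hedged sketch of a step-wise pipeline in which every step records its output and the sources it cited, so any final answer can be traced back step by step. The step functions and source names are assumptions standing in for a real retrieval stack and model calls.

```python
from dataclasses import dataclass, field

@dataclass
class StepRecord:
    """Audit record for one pipeline step: what ran, its output, its sources."""
    name: str
    output: str
    sources: list[str] = field(default_factory=list)

class TraceablePipeline:
    """Runs steps in order while keeping a per-step audit trail with citations."""
    def __init__(self) -> None:
        self.trace: list[StepRecord] = []

    def run_step(self, name: str, fn, text: str, sources: list[str]) -> str:
        output = fn(text)
        self.trace.append(StepRecord(name, output, sources))
        return output

    def explain(self) -> None:
        for i, step in enumerate(self.trace, 1):
            print(f"Step {i} [{step.name}] cited {step.sources}: {step.output}")

# Illustrative steps; in practice these would call retrieval and the model API.
pipeline = TraceablePipeline()
facts = pipeline.run_step("retrieve", lambda q: f"facts for: {q}",
                          "loan eligibility?", ["policy_doc_v3"])
answer = pipeline.run_step("answer", lambda f: f"decision based on ({f})",
                           facts, ["policy_doc_v3", "credit_model_card"])
pipeline.explain()  # prints the full reasoning trail behind `answer`
```

When something goes wrong, the trace shows which step and which source to examine, which supports the kind of explanation users and regulators ask for.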
Risk point 3: Industry compliance requirements

These are compliance requirements for specific industries and regions, such as clinical requirements in the medical industry, or the GDPR (the EU's data protection regulation) when operating in EU countries.

Risk point 4: Internal control of computing power, cost, and operations

Especially in private deployment solutions, there are many places inside an enterprise where "cost traps" can form, so enterprises must make sound internal-control and technical-solution choices. The computing power required to train large models is very expensive, and even a public-cloud deployment is not cheap. Which technical approach is more cost-effective? Which cloud package offers better value? Is computing power sitting idle? Is there fraud in the compute procurement process? All of these count as internal-control concerns for large-model deployment. Don't burn money without thinking.

Risk point 5: Model failure and performance degradation

Yes, models can degrade too. Deploying a large model is not a one-time fix. Even setting aside whether newer models surpass the old one, the same model on its own can suffer performance degradation and "age" over time.

This aging is generally caused by data distribution drift, data feedback pollution, and similar factors. Data distribution drift means the real world changes over time while the data baked into the model does not, producing deviations (for example, some 20th-century policies cannot be applied in the 21st century). Data feedback pollution means that after a large model interacts with huge numbers of users, degraded data flows back in, deepening the model's information cocoon and amplifying its deviations.

For model failure and degradation, therefore, measures such as monitoring and early-warning mechanisms, data update and retraining strategies, and data disaster recovery and rollback mechanisms are all important. Organizational and process guarantees must also be considered, such as a dedicated team and SOPs for quantitative and qualitative monitoring of the model's real-world performance (a small drift-monitoring sketch follows).
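As one hedged example of such a monitoring mechanism, the sketch below computes the Population Stability Index (PSI), a common drift statistic, between a reference sample of model inputs and a recent live sample. The bin count, threshold, and synthetic data are assumptions for illustration.

```python
import math
import random

def psi(expected: list[float], actual: list[float], bins: int = 10) -> float:
    """Population Stability Index between two numeric samples."""
    lo = min(min(expected), min(actual))
    hi = max(max(expected), max(actual))
    width = (hi - lo) / bins or 1.0
    def proportions(sample: list[float]) -> list[float]:
        counts = [0] * bins
        for x in sample:
            counts[min(int((x - lo) / width), bins - 1)] += 1
        # A small floor avoids log(0) for empty bins.
        return [max(c / len(sample), 1e-4) for c in counts]
    e, a = proportions(expected), proportions(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))

random.seed(0)
reference = [random.gauss(0.0, 1.0) for _ in range(5000)]  # training-era inputs
live = [random.gauss(0.6, 1.2) for _ in range(5000)]       # drifted live inputs
score = psi(reference, live)
print(f"PSI = {score:.3f}")
# A common rule of thumb treats PSI > 0.25 as significant drift.
if score > 0.25:
    print("Alert: input distribution drift detected, review retraining plan")
```

Run on a schedule against each important input feature, a check like this turns "the model feels worse lately" into a number that can trigger the retraining SOP.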
Risk control at model output

This stage is relatively easy to understand: the content the model outputs must also be screened and filtered to meet content compliance requirements. This part of content risk control is essentially similar to moderating information before it is published on social media. I previously wrote a long series on the content risk control strategy of the Xiaohongshu platform, which can serve as a reference here ("TK refugees flock to Xiaohongshu, in-depth analysis of how to conduct content risk control", official account "Product Variables", author: Hengheng, WeChat: xiaozidaheng).

At the output stage, traditional information compliance should attend to:

1. Content must be compliant at the policy level and meet regulatory requirements.

2. A positive community atmosphere: screen out verbal abuse, privacy violations, and the like.
3. Safeguarding the safety of community users.
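The machine layer of such output moderation can start as a simple first-pass screen that routes anything suspicious to manual review. The blocklist categories and patterns below are illustrative assumptions, not a production policy.

```python
import re

# Illustrative category -> pattern blocklist; real systems pair trained
# classifiers with much larger, continuously maintained policy lexicons.
BLOCKLIST = {
    "abuse": re.compile(r"\b(idiot|moron)\b", re.IGNORECASE),
    "privacy": re.compile(r"\b1[3-9]\d{9}\b"),  # an unmasked CN mobile number
}

def screen_output(text: str) -> tuple[str, list[str]]:
    """Return (verdict, hit categories): pass, or hold for manual review."""
    hits = [cat for cat, pat in BLOCKLIST.items() if pat.search(text)]
    return ("manual_review" if hits else "pass", hits)

print(screen_output("Here is the summary you asked for."))
# ('pass', [])
print(screen_output("Contact him at 13812345678, the idiot."))
# ('manual_review', ['abuse', 'privacy'])
```

Anything the screen holds goes to a human queue and anything it passes is released, which is the "machine review + manual review" pairing described below.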
The existing solutions for this part are very mature; the combination of machine screening plus manual review will basically not cause problems.

In AI scenarios, we also need to pay extra attention to "AI hallucination". Current large models all suffer from this to some degree: they like to "talk nonsense", such as fabricating non-existent facts or inventing a data source. In serious business scenarios, therefore, enterprises need to do extra verification when inspecting output.

Summary: Risk control serves better innovation

Connecting to large models can bring enterprises efficiency and innovation, but it also carries many risks. From data confidentiality and compliance at model input, to explainability, ethics, industry compliance, cost control, and performance maintenance during processing, to content review and hallucination reduction at output, enterprises must consider every aspect and proceed step by step.

How should these risks be managed? A complete AI governance framework is needed: manage data strictly, ensure compliance, choose models well, deploy them sensibly, and keep optimizing. Only by getting these risk control points under control can enterprises use AI with confidence, safely and efficiently.