Enterprise Knowledge Brain Driven by Knowledge Graphs in the Era of LLMs

How can large language models and knowledge graph technology reshape enterprise knowledge management? This article offers an in-depth look at the construction and application of the enterprise knowledge brain.
Core Content:
- Challenges and Opportunities of Knowledge Management
- Architecture Design of Enterprise Knowledge Brain
- Key Technologies and Application Scenarios of Enterprise Knowledge Brain
Introduction

The topic of this talk is the enterprise knowledge brain driven by knowledge graphs in the era of large language models. It covers four parts:
1. Challenges and opportunities of knowledge management
2. Enterprise knowledge brain architecture
3. Key Technologies of Enterprise Knowledge Brain
4. Typical applications of Enterprise Knowledge Brain
Speaker|Baby Jin, Big Data R&D Engineer, China Telecom Artificial Intelligence Technology (Beijing) Co., Ltd.
Editor|Tianxing Li
Proofreader|Li Yao
Community|DataFun
01
Challenges and Opportunities of Knowledge Management
1. Enterprise Knowledge Management Challenges
According to analysis by IDC, global data volume will continue to grow through 2025, and the proportion of unstructured (i.e., unexplored) data will rise further. Current data exhibits four major characteristics: large volume, a high proportion of unstructured data, a low mining rate, and loose organization caused by multimodal and other heterogeneous data forms.
The above data characteristics bring many challenges to enterprise data management, storage, and mining:
- Management: massive data from disparate sources raises the problems of unified representation and data integration, along with the challenge of permission management.
- Storage: how to store massive unstructured data effectively and how to reduce data redundancy across different systems.
- Application: how to extract useful information from massive data, mine the deep connections within it, and present complex data in visual form.
2. Enterprise Knowledge Management Opportunities
The development of large language models and knowledge graph technology brings new opportunities for enterprise knowledge innovation and intelligent management.
From the perspective of large language models, foundation models have made significant progress in language understanding, generation, multimodal capability, higher-order reasoning, and processing capacity. In particular, the DeepSeek R1 model approaches the capability of the o1 model while, as an open-source model, cutting invocation costs by 90%-95%. This combination of high performance and low cost enables rapid adoption of large language models across industries, and makes the integration of knowledge graphs with large language models far more feasible in practice.
From the perspective of knowledge graphs, a graph is a graph-structured technology for organizing, managing, querying, and computing over data, providing an effective way to represent data from different sources, structures, and modalities. Large language model technology, in turn, makes knowledge construction, knowledge fusion, knowledge completion, and knowledge application for graphs more intelligent and automated.
The combination of knowledge graphs and large language models provides a new path for enterprise knowledge management while reducing deployment costs.
The Star Ocean series products launched by China Telecom Artificial Intelligence Company provide a complete capability system for building an enterprise knowledge brain. Next, its functional architecture and key technologies are introduced in detail.
02
Enterprise Knowledge Brain Architecture
1. Enterprise Knowledge Brain: Functional Architecture
The functional architecture of the Enterprise Knowledge Brain product consists of the basic model, graph platform, application platform, and business scenario layers.
(1) The model platform is supported by the Star Ocean AI center, which provides full-lifecycle management of both large and small models and serves as the model engine of the knowledge brain. The large models managed on the platform include DeepSeek and the Star Semantic and Star Multimodal large models developed by China Telecom; the small models include document parsing, machine translation, OCR, and a series of other models related to knowledge management applications.
(2) The graph platform is supported by the Star Ocean Knowledge Graph Platform, which provides high-quality knowledge graph construction capabilities, including knowledge construction, knowledge completion, knowledge management, and knowledge quality control, as well as graph analysis capabilities, including knowledge computing, graph Q&A, multimodal search, and visual analysis. The graph platform is the knowledge center and intelligent analysis engine of the knowledge brain, and its most core component.
(3) The application platform includes Xinghai Zhiwen Knowledge Base, Xinghai Intelligent Data Analysis, Intelligent Dialogue, Seat Assistant, and other products, which provide support for policy interpretation, public security research and judgment, medical assistance, and other scenarios.
2. Enterprise Knowledge Brain: Core Data Flow
The core data flow of the Star Ocean knowledge graph platform is shown in the figure above. Multimodal data such as forms, text, and voice go through knowledge extraction to build a basic graph, which is then refined through knowledge processing to enrich the graph and improve its quality. On top of this high-quality graph data, functions such as multimodal search, visual exploration, and intelligent Q&A are provided. Intelligent Q&A, built on large model capabilities, supports multi-hop reasoning, content summarization, data statistics, comparative analysis, and other Q&A scenarios. The module can run independently or connect seamlessly, via an API interface, to industry applications such as policy interpretation and public security research and judgment, realizing deep value transformation of the knowledge graph.
03
Key Technologies of Enterprise Knowledge Brain
1. Graph Platform Functions - Intelligent Modeling Driven by Large Models
Before large language models were widely applied, graph modeling was mainly done manually in a top-down way. Modeling for a specific scenario often required multiple steps, such as business research, data analysis, Schema drafting, and Schema confirmation, each requiring professional involvement, and the modeling process was often revised repeatedly because the modelers' understanding of the scenario deviated from the business side's. Such manual modeling demands high expertise and long modeling cycles, significantly driving up the overall labor cost of a project. Visual modeling tools can reduce labor costs to some extent, but humans must still be deeply involved in sorting out the business and the data.
The excellent language understanding, generation, and generalization abilities of large language models offer new ideas for graph modeling. Automated graph modeling supported by a large language model can greatly reduce labor costs and alleviate the slow delivery and high cost of graph-related projects.
The intelligent modeling process driven by a large language model consists of five stages: preprocessing, document slicing, Schema extraction, Schema fusion, and Schema completion. Preprocessing mainly handles modal conversion and calibration of raw data. Document slicing splits the text; in practice this may be by line, by separator, or by semantic block. Schema extraction then extracts a Schema from each slice; since these preliminary Schemas are per-slice, they must subsequently be fused. Schema fusion, too, can be done with the help of a large model, which also completes the fused Schema's missing information. Each of these steps can be done with rules, traditional small models, or large models; LLMs provide better results and stronger generalization, and can effectively reduce the cost of migrating to new scenarios.
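The five stages above can be sketched as a simple pipeline. This is a minimal illustration, not the platform's implementation: `extract_schema_from_slice` is a hypothetical stand-in for the LLM call (it just treats capitalized words as entity types), and the slicing and completion rules are placeholder assumptions.

```python
# Sketch of the five-stage intelligent modeling pipeline:
# preprocessing -> document slicing -> Schema extraction -> fusion -> completion.

def preprocess(raw: str) -> str:
    """Normalize whitespace; a real system also does modal conversion/calibration."""
    return " ".join(raw.split())

def slice_document(text: str, sep: str = ".") -> list[str]:
    """Split text into slices; here by sentence separator for illustration."""
    return [s.strip() for s in text.split(sep) if s.strip()]

def extract_schema_from_slice(chunk: str) -> dict:
    """Hypothetical stand-in for an LLM Schema-extraction call:
    treats capitalized words as entity types, for demonstration only."""
    entities = sorted({w for w in chunk.split() if w[0].isupper()})
    return {"entities": entities, "attributes": {}}

def fuse_schemas(schemas: list[dict]) -> dict:
    """Merge per-slice Schemas by unioning their entity types."""
    merged: set[str] = set()
    for s in schemas:
        merged.update(s["entities"])
    return {"entities": sorted(merged), "attributes": {}}

def complete_schema(schema: dict) -> dict:
    """Completion step: assign a default attribute type (an LLM would predict it)."""
    schema["attributes"] = {e: "string" for e in schema["entities"]}
    return schema

doc = "Alice leads Platform. Bob maintains Graph."
schemas = [extract_schema_from_slice(c) for c in slice_document(preprocess(doc))]
schema = complete_schema(fuse_schemas(schemas))
```

In the real platform each stage can be backed by rules, a small model, or an LLM; only the stage boundaries are taken from the text.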
Compared with rules and traditional small models, large models generalize strongly, but the modeling process requires many large model invocations and is therefore less efficient, especially when an enterprise holds a large amount of initial data.
There are two main ways to improve the modeling efficiency of LLMs:
(1) Improve throughput through model acceleration and multi-instance deployment. This solution has high hardware requirements and is better suited to enterprises with a sufficient hardware budget for large models.
(2) A large model Agent solution. The large language model acts as the orchestrator of the Schema construction task: rather than performing Schema extraction, fusion, and completion itself, it decides which tools to call, and the specific work is done by existing small models or business rules. Its limitation is that callable rules or small-model tools must be developed for each specific scenario, so its generalization is relatively poor.
Take extracting the introductory material of the Star Ocean Knowledge Graph platform as an example of the intelligent modeling process. First, the text is sliced; here we use paragraph-level slicing while retaining slice order and hierarchy, so slices can be combined by hierarchy before extraction with the large model. Each slice is extracted together with its preceding context to avoid losing key subject information, and a corresponding Schema is extracted from each slice.
The Schema extracted from a single paragraph may contain semantic duplicates and missing information. For example, the entities extracted from two text fragments might be "Knowledge Graph Platform" and "Knowledge Graph": the two expressions refer to the same concept even though the underlying text differs, so the per-slice Schemas must be fused to merge such duplicates.
When fusing, note that if there are many fragments, they may not all fit into the large language model at once because of its token limit, so different fusion strategies are needed depending on the fragment count: with many fragments, the per-fragment Schemas should first be clustered textually or by vector before fusion. After fusion, missing information must be supplemented. In the example, fusion yields a Schema whose attribute types have not yet been generated; attribute types are very important for subsequent data extraction and ingestion, so the large language model is used to predict the type each attribute represents.
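The cluster-before-fuse strategy can be sketched as follows. This is an illustrative assumption, not the platform's algorithm: character-bigram Jaccard overlap stands in for real text or vector similarity, and the 0.4 threshold is arbitrary.

```python
# Sketch of the fragment-count-based fusion strategy: when too many per-slice
# Schemas exist to fuse in one LLM call, cluster similar entity names first
# and fuse cluster by cluster.

def similarity(a: str, b: str) -> float:
    """Jaccard overlap on character bigrams, a cheap stand-in for embeddings."""
    A = {a[i:i + 2] for i in range(len(a) - 1)}
    B = {b[i:i + 2] for i in range(len(b) - 1)}
    return len(A & B) / max(len(A | B), 1)

def cluster_entities(names: list[str], threshold: float = 0.4) -> list[list[str]]:
    """Greedy clustering: attach each name to the first cluster it resembles."""
    clusters: list[list[str]] = []
    for name in names:
        for c in clusters:
            if similarity(name, c[0]) >= threshold:
                c.append(name)
                break
        else:
            clusters.append([name])
    return clusters

names = ["Knowledge Graph Platform", "Knowledge Graph", "Machine Translation"]
clusters = cluster_entities(names)
```

Each resulting cluster is small enough to hand to the large model for fusion, and the example's "Knowledge Graph Platform"/"Knowledge Graph" pair lands in one cluster.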
2. Graph Platform Functions - LLM Driven Intelligent Knowledge Extraction
LLM-driven intelligent knowledge extraction consists of four steps: preprocessing, document slicing, knowledge extraction, and knowledge fusion. LLMs can be used at each stage, and extraction efficiency can likewise be improved with either a pure large language model or a large language model agent. As with modeling, the pure large language model solution demands significant hardware resources, so a trade-off must be made between hardware cost and extraction flexibility when designing a specific solution.
Chunking works the same way as in intelligent modeling: the extracted chunks are sent to the large language model for knowledge extraction.
As the figure above shows, each chunk yields the description, advantages, or functions of the Star Ocean Knowledge Graph platform. Although every segment produces content, each is individually incomplete; through knowledge fusion, the same entity appearing in different chunks is merged into a single entity, yielding concise and complete graph data. Beyond extracting from raw data, the rich internal knowledge of the large language model can also be used to complete the knowledge graph.
3. Graph Platform Functions-Graph Q&A and Reasoning Driven by Big Models
Graph Q&A currently attracts the most attention among graph features in the industry. Traditional graph Q&A builds a pipeline tailored to the scenario or data characteristics, including question parsing, intent recognition, customized query statement generation, result querying, and result combination. The traditional approach is highly customized: a mature solution is hard to migrate quickly to other scenarios, construction costs are high, and reusability is poor.
Large-model-based graph Q&A mainly involves the following links: building a high-quality knowledge base through Schema extraction, knowledge extraction, knowledge fusion, and knowledge mining, then constructing a graph Q&A Agent on top of that knowledge base. The Agent accepts the user's question, uses its question-parsing capability to plan the operations needed to answer it, and then executes them. These operations include looking up entities, looking up relations, and so on. After each operation, the Agent inspects the results and decides whether the subsequent plan needs adjusting, iterating until the final answer is obtained.
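The plan-act-inspect loop above can be sketched as follows. This is a toy illustration under stated assumptions: the planner is a fixed two-step rule standing in for the LLM, and the graph, entity, and relation names are invented for the example.

```python
# Sketch of the graph Q&A Agent loop: plan an operation (entity lookup or
# relation lookup), execute it against the graph, inspect the result, repeat.

TOY_GRAPH = {
    "entities": {"DeepSeek": {"type": "model"}},
    "relations": {("DeepSeek", "developed_by"): "DeepSeek-AI"},
}

def lookup_entity(name: str):
    return TOY_GRAPH["entities"].get(name)

def lookup_relation(head: str, rel: str):
    return TOY_GRAPH["relations"].get((head, rel))

def answer(question: str, entity: str, relation: str, max_steps: int = 3):
    """Iterative plan-act loop: verify the entity exists, then query the relation.
    A real Agent would let the LLM replan after inspecting each result."""
    plan = ["check_entity", "check_relation"]
    for _ in range(max_steps):
        if not plan:
            break
        op = plan.pop(0)
        if op == "check_entity" and lookup_entity(entity) is None:
            return None  # a real Agent would replan or abstain here
        if op == "check_relation":
            return lookup_relation(entity, relation)
    return None

result = answer("Who developed DeepSeek?", "DeepSeek", "developed_by")
```

Grounding each step in graph lookups, rather than free generation, is what lets the Agent expose a verifiable reasoning path.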
The Agent supports a variety of Q&A types, and by grounding answers in graph knowledge it significantly reduces the hallucination problem of pure large language model solutions. It can also present the complete reasoning path graphically, helping users judge the reliability of the results.
4. Graph Platform Functions-Graph Query Statement Generation Driven by Large Models
The syntax of mainstream graph query languages is already very close to SQL and easy to learn, but it is still difficult for non-technical users to master, so converting natural language into a graph query language remains very necessary. This conversion can be realized in two simple steps: first, the large language model combines the question with the Schema to generate a preliminary graph query statement; then the preliminary statement is calibrated.
The calibration process includes statement parsing, multi-route recall, and reranking.
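A minimal sketch of the generate-then-calibrate flow, under stated assumptions: the LLM's first pass is stubbed as a template fill, calibration only checks the node label against the Schema, recall is approximated with `difflib.get_close_matches`, and the Cypher-like syntax, labels, and entity name are all illustrative.

```python
# Sketch of NL-to-graph-query generation with calibration: draft a query,
# parse out the node label, and if it is not a valid Schema label, recall
# the closest valid label and substitute it.

import difflib

SCHEMA_LABELS = ["Person", "Company", "Product"]

def draft_query(entity_label: str, name: str) -> str:
    """Stand-in for the LLM's first-pass query generation."""
    return f"MATCH (n:{entity_label} {{name: '{name}'}}) RETURN n"

def calibrate(query: str) -> str:
    """Statement parsing + recall: fix a label the draft got wrong."""
    label = query.split(":", 1)[1].split(" ", 1)[0]
    if label not in SCHEMA_LABELS:
        match = difflib.get_close_matches(label, SCHEMA_LABELS, n=1)
        if match:
            query = query.replace(f":{label}", f":{match[0]}", 1)
    return query

# The draft misspells the label; calibration recalls "Company" from the Schema.
q = calibrate(draft_query("Compny", "China Telecom"))
```

A production calibrator would parse the full statement, recall candidates along multiple routes (labels, properties, relation types), and rerank them, as the text describes.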
5. Graph Platform Functions - GraphRAG
GraphRAG is one of the most intuitive and effective ways to combine knowledge graphs with a large language model. Besides sending raw graph data to the model, communities in the graph are mined with community detection algorithms such as label propagation, descriptive reports of these communities are generated by the large language model, and those reports are supplied alongside the raw graph data as a data source. Combining raw graph data with community reports for RAG allows both local and global questions to be answered.
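The community-mining step can be sketched with a plain label propagation pass; the report generation that would follow is the LLM's job and is omitted here. The tiny graph and the fixed iteration count are illustrative assumptions.

```python
# Sketch of GraphRAG's community step: label propagation groups nodes into
# communities; each community would then be summarized by the LLM into a
# descriptive report used as a RAG data source.

def label_propagation(adj: dict, rounds: int = 5) -> dict:
    """Each node repeatedly adopts the most common label among its neighbors.
    Ties break deterministically via sorted order for reproducibility."""
    labels = {n: n for n in adj}
    for _ in range(rounds):
        for node in sorted(adj):
            counts: dict = {}
            for nb in adj[node]:
                counts[labels[nb]] = counts.get(labels[nb], 0) + 1
            if counts:
                labels[node] = max(sorted(counts), key=counts.get)
    return labels

# Toy graph: one triangle community and one disconnected pair.
adj = {
    "A": ["B", "C"], "B": ["A", "C"], "C": ["A", "B"],
    "X": ["Y"], "Y": ["X"],
}
labels = label_propagation(adj)
communities: dict = {}
for node, lab in labels.items():
    communities.setdefault(lab, []).append(node)
```

Each community's member entities and their relations would then be serialized into a prompt for the large language model to write the community report.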
6. Application Platform Functions - Visual Business Process Orchestration
Business process orchestration can automate the execution of tedious and repetitive tasks, improve work efficiency, and reduce the cost and error rate of manual intervention. Through visual process editing tools, enterprises can freely design workflows according to actual needs and adjust and optimize them at any time.
7. Application platform function-dynamic configuration to create personalized robots
The application platform should support dynamic configuration: adapting to different large models, modifying prompts, tuning parameters, mounting knowledge bases, and so on, so as to efficiently meet the needs of diverse scenarios and create a more personalized enterprise knowledge brain.
04
Application Cases
Finally, we would like to share an application case of Knowledge Brain.
A traditional knowledge base can only provide simple queries over discrete knowledge points; it lacks a global view of knowledge and a deep understanding of the knowledge network, limiting the effective mining and utilization of knowledge.
To make up for these shortcomings of traditional knowledge organization, the enterprise knowledge brain is introduced, with the Star Ocean Knowledge Graph as its key support. The knowledge graph serves as the knowledge reserve, on top of which the knowledge brain gains powerful query analysis and fusion reasoning capabilities. Ultimately it realizes high-quality knowledge organization and global knowledge acquisition, strengthening knowledge traceability, lineage analysis, and inferential completion, and providing solid intelligent infrastructure for enterprises to build knowledge-driven decision systems and improve core competitiveness in the digitalization process.