Three Data Formats for Agent Knowledge Bases

A close look at the data formats used in agent knowledge bases, and at how AI agents can use that knowledge efficiently.
Core content:
1. Structured data in the knowledge base, with an example
2. Storage forms and typical application scenarios of semi-structured data
3. Processing techniques and a practical case for unstructured data
“A knowledge base is a systematic data store that holds, organizes, and retrieves knowledge, supporting AI agents as they complete tasks in specific scenarios. It stores knowledge in several data formats: structured data, semi-structured data, and unstructured data.”
The core goal of a knowledge base is to convert external knowledge into a form the model can call on, so that the agent can retrieve, match, and reason over it, and so that it understands complex questions better and answers them more accurately.
01
—
Structured Data
Definition: Structured data is stored as tables in relational databases (such as MySQL and PostgreSQL). It has a clear row-and-column layout and explicit field definitions, and it suits scenarios such as FAQ question answering and rule matching.
Application scenarios:
Traditional FAQ question-answering system
Product parameter matching and query
Slot filling in multi-turn conversations
Example: Structured data for a college admissions question-and-answer system
Question | Answer | Key fields |
What is the admission score? | The admission score for science majors in 2023 is 580 points | Score, year |
Are there any scholarships available? | Provide various scholarships, up to 5,000 yuan/year | Scholarship Type |
What are the majors? | Including 30 majors such as computer science, economics and management, and medicine | |
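To make the lookup concrete, the FAQ table above can be loaded into a relational table and queried by keyword. The following is a minimal sketch, assuming an in-memory SQLite database in place of MySQL or PostgreSQL. The table name `faq`, the column names, and the helper `lookup_answer` are hypothetical names chosen for illustration, and the third row's key field is left empty because the source table does not give one.

```python
import sqlite3

# Rows adapted from the admissions example; the last key field is empty
# because the source table does not provide one.
FAQ_ROWS = [
    ("What is the admission score?",
     "The admission score for science majors in 2023 is 580 points",
     "Score, year"),
    ("Are there any scholarships available?",
     "Provide various scholarships, up to 5,000 yuan/year",
     "Scholarship Type"),
    ("What are the majors?",
     "Including 30 majors such as computer science, economics and management, and medicine",
     ""),
]


def build_faq_db():
    """Create an in-memory table with one fixed column per field."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE faq ("
        "question TEXT PRIMARY KEY, answer TEXT NOT NULL, key_fields TEXT)"
    )
    conn.executemany("INSERT INTO faq VALUES (?, ?, ?)", FAQ_ROWS)
    conn.commit()
    return conn


def lookup_answer(conn, keyword):
    """Return the first answer whose question or key fields contain the keyword."""
    pattern = f"%{keyword}%"
    row = conn.execute(
        "SELECT answer FROM faq "
        "WHERE question LIKE ? OR key_fields LIKE ? LIMIT 1",
        (pattern, pattern),
    ).fetchone()
    return row[0] if row else None


conn = build_faq_db()
print(lookup_answer(conn, "scholarship"))
```

Because every row shares the same columns, a plain SQL filter is enough here. A production FAQ system would more likely match on normalized intents or slots than on a raw substring.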
02
—
Semi-structured Data
Definition: Semi-structured data lies between structured and unstructured data. It is usually stored in JSON, XML, or YAML. Its fields are not fixed, which makes it suitable for scenarios such as dynamic knowledge retrieval and multimodal data analysis.
Application scenarios:
Knowledge graph construction
API response data parsing
Multi-dimensional data retrieval
Example: Semi-structured data for an intelligent customer service knowledge base
{
  "Question": "How to return goods?",
  "Answer": {
    "Return process": ["Apply for return", "Send back the goods", "Confirm refund"],
    "Return period": "7 days no-reason return"
  },
  "Category": "After-sales service"
}
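Since the fields of such a record are not fixed, an agent usually cannot rely on a known set of columns and has to walk the structure instead. The sketch below parses the record above with Python's standard `json` module and flattens it into path/value pairs that can be indexed or searched. The `flatten` helper and its `/` and `[i]` path notation are assumptions made for illustration, not part of any particular knowledge base product.

```python
import json

# The customer service record from the example above.
RECORD = """
{
  "Question": "How to return goods?",
  "Answer": {
    "Return process": ["Apply for return", "Send back the goods", "Confirm refund"],
    "Return period": "7 days no-reason return"
  },
  "Category": "After-sales service"
}
"""


def flatten(node, prefix=""):
    """Yield (path, value) pairs for every leaf of a nested JSON value."""
    if isinstance(node, dict):
        for key, value in node.items():
            yield from flatten(value, f"{prefix}/{key}" if prefix else key)
    elif isinstance(node, list):
        for index, value in enumerate(node):
            yield from flatten(value, f"{prefix}[{index}]")
    else:
        yield prefix, node


entry = json.loads(RECORD)
leaves = dict(flatten(entry))

for path, value in leaves.items():
    print(f"{path}: {value}")
```

Flattening keeps the nesting information in the path, so a record with extra or missing fields still produces a usable set of pairs without any schema change.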
03
—
Unstructured Data
Definition: Unstructured data covers text, audio, video, images, and other content without a fixed format. It has to be parsed and retrieved with the help of technologies such as NLP and OCR.
Application scenarios:
Document analysis and Q&A
Video content summarization and knowledge extraction
Image OCR analysis and content annotation
Example: Unstructured data from an internal company policy document
"Company holiday arrangements for 2024: Spring Festival holiday is from January 21 to January 27, and other statutory holidays are implemented in accordance with national regulations."