Three Data Formats of an Agent Knowledge Base

Written by
Iris Vance
Updated on: June 27, 2025
Recommendation

A close look at the data formats used in agent knowledge bases, and at how AI agents can use stored knowledge efficiently.

Core content:
1. Structured data in the knowledge base, with an example
2. How semi-structured data is stored and where it is typically used
3. Processing techniques and a practical example for unstructured data

Yang Fangxian
Founder of 53A/Most Valuable Expert of Tencent Cloud (TVP)

A knowledge base is a systematic data store that holds, organizes, and retrieves knowledge so that AI agents can complete tasks in specific scenarios. It stores knowledge in several data formats: structured, semi-structured, and unstructured.


The core goal of a knowledge base is to convert external knowledge into a form the model can call on, so that the agent can retrieve, match, and reason over it. This improves both its understanding of complex questions and the accuracy of its answers.


01

Structured Data Knowledge Base


  • Definition: Structured data is stored in tables, typically in relational databases such as MySQL or PostgreSQL. It has a clear row-and-column format with defined fields, which suits scenarios such as FAQ question-answering systems and rule matching.

  • Application scenarios:

    • Traditional FAQ question-answering system

    • Product parameter matching and query

    • Slot filling in multi-turn conversations


Example: Structured data for a college admissions question-and-answer system


| Question | Answer | Key fields |
|---|---|---|
| What is the admission score? | The admission score for science majors in 2023 is 580 points | Score, year |
| Are there any scholarships available? | Various scholarships are available, up to 5,000 yuan per year | Scholarship type |
| What are the majors? | Including 30 majors such as computer science, economics and management, and medicine | |
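
The table above maps naturally onto a relational schema. The following is a minimal sketch using Python's built-in sqlite3 module in place of MySQL or PostgreSQL; the table name `faq`, the column names, and the `lookup` helper are assumptions made for illustration, not part of the original system.

```python
import sqlite3

# Rows taken from the example table. The third row has no key fields
# in the source, so it is stored as NULL rather than guessed.
FAQ_ROWS = [
    ("What is the admission score?",
     "The admission score for science majors in 2023 is 580 points",
     "Score, year"),
    ("Are there any scholarships available?",
     "Various scholarships are available, up to 5,000 yuan per year",
     "Scholarship type"),
    ("What are the majors?",
     "Including 30 majors such as computer science, economics and "
     "management, and medicine",
     None),
]


def build_faq_db() -> sqlite3.Connection:
    """Create an in-memory database with a hypothetical `faq` table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE faq ("
        "question TEXT PRIMARY KEY, "
        "answer TEXT NOT NULL, "
        "key_fields TEXT)"
    )
    conn.executemany("INSERT INTO faq VALUES (?, ?, ?)", FAQ_ROWS)
    conn.commit()
    return conn


def lookup(conn: sqlite3.Connection, keyword: str) -> list[str]:
    """Return answers whose question contains the keyword.

    SQLite's LIKE is case-insensitive for ASCII text by default.
    """
    rows = conn.execute(
        "SELECT answer FROM faq WHERE question LIKE ? ORDER BY question",
        (f"%{keyword}%",),
    ).fetchall()
    return [row[0] for row in rows]


conn = build_faq_db()
answers = lookup(conn, "scholarship")
```

Because every row shares the same fixed columns, a single parameterized query is enough for this kind of rule matching. A production FAQ system would likely add full-text or semantic matching on top of it.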



02


Semi-structured Data


  • Definition: Semi-structured data lies between structured and unstructured data. It is usually stored in JSON, XML, or YAML. Its fields are not fixed, which suits scenarios such as dynamic knowledge retrieval and multimodal data analysis.

  • Application scenarios:

    • Knowledge graph construction

    • API response data parsing

    • Multi-dimensional data retrieval

Example: Semi-structured data in an intelligent customer service knowledge base

{
  "Question": "How to return goods?",
  "Answer": {
    "Return process": ["Apply for return", "Send back the goods", "Confirm refund"],
    "Return period": "7 days no-reason return"
  },
  "Category": "After-sales service"
}
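
Because the fields of such an entry are not fixed, code that reads it has to tolerate nesting and missing keys. The sketch below parses the entry with Python's standard json module and flattens it into path/value pairs that could be indexed for retrieval. The `flatten` helper, the `/` path separator, and the `"Return fee"` field are hypothetical and used only for illustration.

```python
import json

RAW_ENTRY = """
{
  "Question": "How to return goods?",
  "Answer": {
    "Return process": ["Apply for return", "Send back the goods", "Confirm refund"],
    "Return period": "7 days no-reason return"
  },
  "Category": "After-sales service"
}
"""


def flatten(value, prefix=""):
    """Yield (path, leaf_value) pairs from nested dicts and lists."""
    if isinstance(value, dict):
        for key, child in value.items():
            yield from flatten(child, f"{prefix}{key}/")
    elif isinstance(value, list):
        for index, child in enumerate(value):
            yield from flatten(child, f"{prefix}{index}/")
    else:
        yield prefix.rstrip("/"), value


entry = json.loads(RAW_ENTRY)

# Every leaf becomes addressable by its path, whatever the nesting depth.
paths = dict(flatten(entry))

# Optional fields are read with .get() so a missing key does not raise.
steps = entry["Answer"].get("Return process", [])
fee = entry["Answer"].get("Return fee")  # hypothetical field, absent here
```

Flattening is one possible design choice: it lets entries with different shapes share a single index without declaring a schema up front.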



03


Unstructured Data



  • Definition: Unstructured data covers text, audio, video, images, and other data without a fixed format. It must be parsed and retrieved with the help of technologies such as NLP and OCR.

  • Application scenarios:

    • Document analysis and Q&A

    • Video content summarization and knowledge extraction

    • Image OCR analysis and content annotation

Example: Unstructured data from an internal corporate policy document

"Company holiday arrangements for 2024: Spring Festival holiday is from January 21 to January 27, and other statutory holidays are implemented in accordance with national regulations."
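
To answer a question from free text like this, the text first has to be split into pieces and searched. The following is a toy sketch of that idea using only the standard library: it splits the policy text into chunks, picks the chunk that shares the most words with the query, and extracts the date range with a regular expression. The chunking rule, the scoring function, and the date pattern are simplifying assumptions; real systems would typically rely on NLP models or embedding-based retrieval instead.

```python
import re

POLICY_TEXT = (
    "Company holiday arrangements for 2024: Spring Festival holiday is from "
    "January 21 to January 27, and other statutory holidays are implemented "
    "in accordance with national regulations."
)


def tokenize(text: str) -> set[str]:
    """Lowercase the text and return its set of alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def split_chunks(text: str) -> list[str]:
    """Crude chunker: split on colons, commas, and periods."""
    parts = re.split(r"[:,.]\s*", text)
    return [part.strip() for part in parts if part.strip()]


def best_chunk(query: str, chunks: list[str]) -> str:
    """Return the chunk sharing the most tokens with the query."""
    query_tokens = tokenize(query)
    return max(chunks, key=lambda chunk: len(query_tokens & tokenize(chunk)))


chunks = split_chunks(POLICY_TEXT)
hit = best_chunk("When is the Spring Festival holiday?", chunks)

# Pull the date range out of the matching chunk (assumed "from X to Y" form).
match = re.search(r"from (\w+ \d{1,2}) to (\w+ \d{1,2})", hit)
start, end = match.groups() if match else (None, None)
```

Unlike the structured and semi-structured cases, nothing here can be looked up by field name, so the structure has to be recovered at query time.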