Chroma: an open-source, AI-native vector database essential for RAG implementation

Written by
Iris Vance
Updated on: June 29, 2025
Recommendation

Explore the AI-native vector database Chroma to help implement RAG technology and multimodal retrieval.

Core content:
1. The core concepts of Chroma database and its application advantages in RAG technology
2. Chroma installation and basic configuration methods
3. Chroma create, read, update, and delete (CRUD) operations

Yang Fangxian
Founder of 53AI/Most Valuable Expert of Tencent Cloud (TVP)

1. Chroma Core Concepts and Advantages

1. What is Chroma?

Chroma is an open-source vector database designed for efficient storage and retrieval of high-dimensional vector data. Its core capability is semantic similarity search: it quickly matches embedding vectors of content such as text and images. It is widely used for retrieval-augmented generation (RAG) with large language models, recommendation systems, and multimodal retrieval. Unlike traditional databases, Chroma measures data relevance by vector distance (such as cosine similarity or Euclidean distance) rather than by keyword matching.
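To make the two distance measures mentioned above concrete, here is a minimal pure-Python sketch. The vectors and function names are made up for illustration, and this is not how Chroma computes distances internally.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def euclidean_distance(a, b):
    """Straight-line distance between two vectors (0.0 means identical)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# Hypothetical 2-dimensional "embeddings"
query = [1.0, 0.0]
doc_close = [0.9, 0.1]
doc_far = [0.0, 1.0]

print(cosine_similarity(query, doc_close), cosine_similarity(query, doc_far))
print(euclidean_distance(query, doc_close), euclidean_distance(query, doc_far))
```

With either measure, doc_close ranks ahead of doc_far for this query, even though no keywords are involved.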

GitHub address:

https://github.com/chroma-core/chroma

Official documentation:

https://docs.trychroma.com/

2. Core advantages

  • Lightweight and easy to use: Chroma is embedded in your application as a Python/JS package and needs no separate deployment, which makes it suitable for rapid prototyping.

  • Flexible integration: supports custom embedding models (such as OpenAI, HuggingFace) and is compatible with frameworks such as LangChain.

  • High-performance retrieval: uses the HNSW algorithm for indexing, supporting millisecond-level responses over millions of vectors.

  • Multiple storage modes: in-memory mode is used for development and debugging, and persistent mode stores data on disk for production environments.

2. Installation and basic configuration

 1. Install Chroma

Chroma supports Windows and Ubuntu and requires Python >= 3.9.

Create a virtual environment and install Chroma:

    # Create a virtual environment
    conda create -n chromadb python==3.10
    # Activate it
    conda activate chromadb
    # Install chromadb
    pip install chromadb

Note: By default Chroma runs as a local embedded database, so unlike a traditional client-server database such as PostgreSQL it cannot be accessed remotely out of the box.

Chroma also provides an official client-server mode. Start the server as follows:

    # Start the server; the default port is 8000
    chroma run --path /db_path

2. Initialize the client

In-memory mode (debugging and experiments):

    import chromadb

    client = chromadb.Client()

Persistent mode (production environment):

When creating the client, you can configure the local storage path.

    import chromadb

    # Save data to a local directory; pass an absolute path
    client = chromadb.PersistentClient(path="/path/to/save")

Client-server mode:

The first two are local modes: the database runs in the same process and on the same machine as the client. In client-server mode the server can be deployed independently and is accessed through HttpClient.

    import chromadb

    chroma_client = chromadb.HttpClient(host='localhost', port=8000)

3. Add, delete, modify and query operations

1. Create a Collection

A collection is the basic unit for managing data in Chroma, similar to a table in a traditional database. Collection names have the following constraints:

  • The name must be between 3 and 63 characters long.

  • The name must start and end with a lowercase letter or number and can contain dots, dashes, and underscores.

  • The name must not contain two consecutive dots.

  • The name cannot be a valid IP address.
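The naming rules above can be checked before calling Chroma. The sketch below is a hypothetical helper, not part of the Chroma API, and it assumes that only lowercase letters, digits, dots, dashes, and underscores are allowed between the first and last character.

```python
import ipaddress
import re

# First and last character: lowercase letter or digit; total length 3 to 63.
_NAME_RE = re.compile(r"[a-z0-9][a-z0-9._-]{1,61}[a-z0-9]")


def is_valid_collection_name(name: str) -> bool:
    """Return True if name satisfies the four constraints listed above."""
    if not _NAME_RE.fullmatch(name):
        return False
    if ".." in name:  # no two consecutive dots
        return False
    try:
        ipaddress.ip_address(name)  # must not parse as an IP address
    except ValueError:
        return True
    return False


for candidate in ["my_collection", "ab", "a..b", "192.168.1.1"]:
    print(candidate, is_valid_collection_name(candidate))
```

Validating names up front gives a clearer error message than waiting for the database call to fail.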


Chroma collections are created with a name and an optional embedding function. If you provide an embedding function at creation, you must provide the same one each time you fetch the collection.

    # Create
    collection = client.create_collection(name="my_collection", embedding_function=emb_fn)

    # Get
    collection = client.get_collection(name="my_collection", embedding_function=emb_fn)

    # Get the collection if it exists, otherwise create it
    collection = client.get_or_create_collection(name="my_collection2")

If no embedding function is provided, Chroma uses its default embedding function, which is based on Sentence Transformers and the small all-MiniLM-L6-v2 model and is aimed mainly at English text. For other languages you will generally want to supply a custom embedding function:

    import chromadb
    from sentence_transformers import SentenceTransformer


    class SentenceTransformerEmbeddingFunction:
        def __init__(self, model_path: str, device: str = "cuda"):
            self.model = SentenceTransformer(model_path, device=device)

        def __call__(self, input: list[str]) -> list[list[float]]:
            # Accept a single string as well as a list of strings
            if isinstance(input, str):
                input = [input]
            return self.model.encode(input, convert_to_numpy=True).tolist()


    # Create/load the collection (including the custom embedding function)
    embed_model = SentenceTransformerEmbeddingFunction(
        model_path=r"D:\Test\LLMTrain\testllm\llm\BAAI\bge-m3",
        device="cuda",  # If there is no GPU, change to "cpu"
    )

    # Create a client and collection
    client = chromadb.Client()
    collection = client.create_collection(
        "my_knowledge_base",
        metadata={"hnsw:space": "cosine"},
        embedding_function=embed_model,
    )
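The essential contract of the class above is its call shape: a list of strings goes in, and one list of floats per string comes out. The toy sketch below shows that shape without downloading a model. ToyHashEmbeddingFunction is a hypothetical name, its vectors carry no semantic meaning, and whether a given Chroma version accepts a plain callable like this is not something the sketch guarantees.

```python
import hashlib
import math


class ToyHashEmbeddingFunction:
    """Maps each text to a fixed-size vector by hashing its tokens into buckets."""

    def __init__(self, dim: int = 8):
        self.dim = dim

    def __call__(self, input: list[str]) -> list[list[float]]:
        # Accept a single string as well as a list of strings
        if isinstance(input, str):
            input = [input]
        return [self._embed(text) for text in input]

    def _embed(self, text: str) -> list[float]:
        vec = [0.0] * self.dim
        for token in text.lower().split():
            digest = hashlib.md5(token.encode("utf-8")).digest()
            vec[digest[0] % self.dim] += 1.0  # count tokens per bucket
        norm = math.sqrt(sum(v * v for v in vec))
        # Normalize to unit length; an empty text stays the zero vector
        return [v / norm for v in vec] if norm else vec


toy_embed = ToyHashEmbeddingFunction(dim=8)
vectors = toy_embed(["vector database", "retrieval augmented generation"])
print(len(vectors), len(vectors[0]))
```

A deterministic stand-in like this can be handy in unit tests, where loading bge-m3 would be slow.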

When creating a collection, you can configure the following parameters:

  • name: the name of the collection. Required.

  • embedding_function: the embedding function to use. If omitted, the default embedding model is used.

  • metadata: optional settings and descriptive data, such as the index distance metric.


    from datetime import datetime

    collection = client.create_collection(
        name="my_collection",
        embedding_function=emb_fn,
        metadata={
            "description": "my first Chroma collection",
            "created": str(datetime.now()),
        },
    )

There are some common methods for collections:

  • peek() - Returns a list of the first 10 items in a collection.

  • count() - Returns the number of items in the collection.

  • modify() - Renames the collection.


    collection.peek()
    collection.count()
    collection.modify(name="new_name")

2. Write data

When writing data, configure the following parameters:

  • documents: the original text blocks.

  • metadatas: metadata describing each text block, as key-value pairs.

  • ids: the unique identifier of each text block. Every document must have its own unique id. Adding the same id twice will result in only the initial value being stored.

  • embeddings: for text blocks that have already been vectorized, you can write the vectors directly. If omitted, the specified or default embedding function is used to vectorize the documents when writing.


    collection.add(
        documents=["lorem ipsum...", "doc2", "doc3", ...],
        metadatas=[{"chapter": "3", "verse": "16"}, {"chapter": "3", "verse": "5"}, {"chapter": "29", "verse": "11"}, ...],
        ids=["id1", "id2", "id3", ...],
    )

or

    collection.add(
        embeddings=[[1.1, 2.3, 3.2], [4.5, 6.9, 4.4], [1.1, 2.3, 3.2], ...],
        metadatas=[{"chapter": "3", "verse": "16"}, {"chapter": "3", "verse": "5"}, {"chapter": "29", "verse": "11"}, ...],
        ids=["id1", "id2", "id3", ...],
    )
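Since ids, documents, embeddings, and metadatas are parallel lists, a quick consistency check before calling add can catch mistakes early. The helper below is a hypothetical sketch, not a Chroma function, and it only encodes the rules described above.

```python
def check_batch(ids, documents=None, embeddings=None, metadatas=None):
    """Validate a batch before writing it; returns the number of items."""
    if documents is None and embeddings is None:
        raise ValueError("provide documents, embeddings, or both")
    if len(set(ids)) != len(ids):
        raise ValueError("ids must be unique within a batch")
    columns = (
        ("documents", documents),
        ("embeddings", embeddings),
        ("metadatas", metadatas),
    )
    for name, values in columns:
        # Every supplied list must line up one-to-one with ids
        if values is not None and len(values) != len(ids):
            raise ValueError(f"{name} has {len(values)} entries, expected {len(ids)}")
    return len(ids)


count = check_batch(
    ids=["id1", "id2"],
    documents=["lorem ipsum", "doc2"],
    metadatas=[{"chapter": "3"}, {"chapter": "29"}],
)
print(count)
```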

3. Modify data

To modify an item, provide its id (the unique identifier of the text block) together with the new content.

    collection.update(
        ids=["doc1"],  # Use an existing id
        documents=["RAG is a retrieval enhancement generation technology 222"],
    )

4. Upsert data

Chroma also supports upsert operations, which update existing items or add them if they do not yet exist.

    collection.upsert(
        ids=["id1", "id2", "id3", ...],
        embeddings=[[1.1, 2.3, 3.2], [4.5, 6.9, 4.4], [1.1, 2.3, 3.2], ...],
        metadatas=[{"chapter": "3", "verse": "16"}, {"chapter": "3", "verse": "5"}, {"chapter": "29", "verse": "11"}, ...],
        documents=["doc1", "doc2", "doc3", ...],
    )
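The difference between add and upsert, as described in this article, can be modeled with a plain dictionary. ToyCollection below is a hypothetical stand-in used only to illustrate the semantics; it is not how Chroma stores data.

```python
class ToyCollection:
    """Dictionary-backed model of the add vs. upsert behavior described above."""

    def __init__(self):
        self.docs = {}

    def add(self, ids, documents):
        for item_id, doc in zip(ids, documents):
            # An id that already exists keeps its initial value
            self.docs.setdefault(item_id, doc)

    def upsert(self, ids, documents):
        for item_id, doc in zip(ids, documents):
            # Existing ids are overwritten, new ids are inserted
            self.docs[item_id] = doc


toy = ToyCollection()
toy.add(["id1"], ["first version"])
toy.add(["id1"], ["second version"])  # ignored: id1 already exists
after_add = dict(toy.docs)

toy.upsert(["id1", "id2"], ["third version", "brand new"])
after_upsert = dict(toy.docs)
print(after_add, after_upsert)
```

When re-ingesting documents that may have changed, upsert is usually the safer choice because stale content is replaced rather than silently kept.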

5. Delete data

Chroma supports deleting items from a collection by id using delete. The embeddings, documents, and metadata associated with each item are deleted as well.

delete also supports a where filter. If no ids are provided, it deletes every item in the collection that matches the where filter.

    # Delete by ids
    collection.delete(ids=["doc1"])

    # Delete with a where condition
    collection.delete(
        ids=["id1", "id2", "id3", ...],
        where={"chapter": "20"},
    )

6. Query data

(1) Query all data

    all_docs = collection.get()
    print("All documents in the collection:", all_docs)

(2) Query by ids

Items can be retrieved from a collection by id using get, optionally combined with a where filter:

    collection.get(
        ids=["id1", "id2", "id3", ...],
        where={"style": "style1"},
    )

(3) Query Embedding

Chroma collections can be queried in a variety of ways using the query method, for example with query_embeddings.

    collection.query(
        query_embeddings=[[11.1, 12.1, 13.1], [1.1, 2.3, 3.2], ...],
        n_results=10,
        where={"metadata_field": "is_equal_to_this"},
        where_document={"$contains": "search_string"},
    )

  • For each query embedding, the query returns the n_results closest matches, in order.

  • An optional where filter dictionary restricts results by the metadata associated with each document.

  • In addition, an optional where_document filter dictionary restricts results by document content.
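The three points above can be illustrated with a brute-force sketch: filter by metadata and document content, then rank what remains by distance. toy_query and its records are hypothetical, squared Euclidean distance is used as a stand-in metric, and a real collection answers this with its HNSW index rather than a linear scan.

```python
def squared_l2(a, b):
    """Squared Euclidean distance, used here as a stand-in metric."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def toy_query(records, query_embedding, n_results=10, where=None, contains=None):
    """Return (distance, id) pairs for the n_results closest matching records."""
    candidates = []
    for rec in records:
        # where: every key must equal the record's metadata value
        if where and any(rec["metadata"].get(k) != v for k, v in where.items()):
            continue
        # contains: plays the role of where_document={"$contains": ...}
        if contains is not None and contains not in rec["document"]:
            continue
        candidates.append((squared_l2(query_embedding, rec["embedding"]), rec["id"]))
    candidates.sort()  # smallest distance first
    return candidates[:n_results]


records = [
    {"id": "a", "embedding": [1.0, 0.0], "metadata": {"lang": "en"}, "document": "vector search"},
    {"id": "b", "embedding": [0.0, 1.0], "metadata": {"lang": "en"}, "document": "keyword search"},
    {"id": "c", "embedding": [0.9, 0.1], "metadata": {"lang": "zh"}, "document": "vector index"},
]

english_hits = toy_query(records, [1.0, 0.0], n_results=2, where={"lang": "en"})
vector_hits = toy_query(records, [1.0, 0.0], contains="vector")
print(english_hits, vector_hits)
```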


(4) Query similar documents

You can also pass a set of query texts query_texts. Chroma will first embed each query text with the collection's embedding function and then execute the query using the resulting embeddings.

    # Query similar documents
    results = collection.query(
        query_texts=["What is RAG technology?"],
        n_results=3,
    )
    print("Query results", results)

Query result configuration

  • When using get or query, you can use the include parameter to specify the data you want returned: embeddings, documents, metadatas. include is a list and accepts multiple values.

  • For query, distances are returned by default.

  • For performance reasons, embeddings are not returned by default and the field is None. To return them, add embeddings to include.

  • ids are always returned.

  • The return value contains an included field, which lists the types of data returned by this call.

  • Embeddings are returned as a 2D NumPy array.

    # Only get documents and ids
    collection.get(include=["documents"])

    collection.query(
        query_embeddings=[[11.1, 12.1, 13.1], [1.1, 2.3, 3.2], ...],
        include=["documents"],
    )

Example of query results

    {
        'ids': [['doc1', 'doc3', 'doc2']],
        'embeddings': None,
        'documents': [['RAG is a retrieval-enhanced generation technology', 'Three Heroes Fighting Lü Bu', 'Vector database stores embedded representations of documents']],
        'uris': None,
        'included': ['metadatas', 'documents', 'distances'],
        'data': None,
        'metadatas': [[{'source': 'tech_doc'}, {'source': 'tutorial1'}, {'source': 'tutorial'}]],
        'distances': [[0.2373753786087036, 0.7460092902183533, 0.7651787400245667]]
    }
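Each field in a query result is a list with one inner list per query, which is awkward to read directly. The sketch below flattens the result for one query into rows. flatten_query_result is a hypothetical helper, and the sample dictionary reuses values from the example above with shortened document text.

```python
def _column(result, key, query_index, size):
    """Return the inner list for one query, or placeholders if the field is None."""
    values = result.get(key)
    return values[query_index] if values is not None else [None] * size


def flatten_query_result(result, query_index=0):
    """Turn the parallel lists of a query result into one dict per hit."""
    ids = result["ids"][query_index]
    size = len(ids)
    documents = _column(result, "documents", query_index, size)
    metadatas = _column(result, "metadatas", query_index, size)
    distances = _column(result, "distances", query_index, size)
    return [
        {"id": i, "document": doc, "metadata": meta, "distance": dist}
        for i, doc, meta, dist in zip(ids, documents, metadatas, distances)
    ]


sample = {
    "ids": [["doc1", "doc3", "doc2"]],
    "embeddings": None,
    "documents": [["RAG ...", "Three Heroes ...", "Vector database ..."]],
    "metadatas": [[{"source": "tech_doc"}, {"source": "tutorial1"}, {"source": "tutorial"}]],
    "distances": [[0.2373753786087036, 0.7460092902183533, 0.7651787400245667]],
}

rows = flatten_query_result(sample)
for row in rows:
    print(row["id"], row["distance"], row["metadata"])
```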

4. Practical Operation

Insert a batch of data into the vector database, and then find similar data from the vector database based on a question.

1. Install packages

    pip install sentence_transformers
    pip install modelscope

2. Download the Embedding model to your local computer

    # Model download
    from modelscope import snapshot_download

    model_dir = snapshot_download('BAAI/bge-m3', cache_dir=r"D:\Test\LLMTrain\testllm\llm")

3. Core logic: writing data and querying similarity

    import chromadb
    from sentence_transformers import SentenceTransformer


    class SentenceTransformerEmbeddingFunction:
        def __init__(self, model_path: str, device: str = "cuda"):
            self.model = SentenceTransformer(model_path, device=device)

        def __call__(self, input: list[str]) -> list[list[float]]:
            if isinstance(input, str):
                input = [input]
            return self.model.encode(input, convert_to_numpy=True).tolist()


    # Create/load the collection (including the custom embedding function)
    embed_model = SentenceTransformerEmbeddingFunction(
        model_path=r"D:\Test\LLMTrain\testllm\llm\BAAI\bge-m3",
        device="cpu",  # Use "cuda" if a GPU is available
    )

    # Create the client and collection
    client = chromadb.PersistentClient(path=r"D:\Test\LLMTrain\chromadb_test\chroma_data")
    collection = client.get_or_create_collection(
        "my_knowledge_base",
        metadata={"hnsw:space": "cosine"},
        embedding_function=embed_model,
    )

    # Add documents
    collection.add(
        documents=[
            "Embedding representation of documents stored in vector databases",
            "Three Heroes Fighting Lu Bu",
            "RAG is a retrieval enhancement generation technology",
        ],
        metadatas=[{"source": "tech_doc"}, {"source": "tutorial"}, {"source": "tutorial1"}],
        ids=["doc1", "doc2", "doc3"],
    )

    # Query similar documents
    results = collection.query(
        query_texts=["What is RAG technology?"],
        n_results=3,
    )
    print("Query results", results)

Execution returns the result:

    Query results {
        'ids': [['doc3', 'doc2', 'doc1']],
        'embeddings': None,
        'documents': [['RAG is a retrieval enhancement generation technology', 'Three Heroes Fighting Lu Bu', 'Embedding representation of documents stored in vector databases']],
        'uris': None,
        'included': ['metadatas', 'documents', 'distances'],
        'data': None,
        'metadatas': [[{'source': 'tutorial1'}, {'source': 'tutorial'}, {'source': 'tech_doc'}]],
        'distances': [[0.2373753786087036, 0.7460092902183533, 0.7651787400245667]]
    }

When reading the results, focus on distances. The values are sorted in ascending order, and the smaller the distance, the more similar the document is to the question "What is RAG technology?". The first result (doc3) is therefore the most similar to the question.
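Because this collection was created with hnsw:space set to cosine, each distance should correspond to 1 minus the cosine similarity, so it can be converted back into a similarity score and used to drop weak matches before passing context to a model. The sketch below assumes that relationship; MAX_DISTANCE is a hypothetical cutoff that would need tuning for a real dataset and embedding model.

```python
# Values copied from the query result above
ids = ["doc3", "doc2", "doc1"]
distances = [0.2373753786087036, 0.7460092902183533, 0.7651787400245667]

MAX_DISTANCE = 0.5  # hypothetical cutoff, not a recommended value


def to_similarity(distance: float) -> float:
    """Convert a cosine distance into a cosine similarity."""
    return 1.0 - distance


# Keep only hits that are close enough to the question
relevant = [
    (doc_id, to_similarity(dist))
    for doc_id, dist in zip(ids, distances)
    if dist <= MAX_DISTANCE
]
print(relevant)
```

With this cutoff only doc3 survives, which matches the intuition that the other two documents are not really about RAG.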