Want to make FastGPT even more powerful? Try the OceanBase vector database!

Explore a new way to boost FastGPT performance, and see how the OceanBase vector database can become its best partner.
Core content:
1. The key role of vector database in RAG and its impact on FastGPT performance
2. The limitations of PostgreSQL in high-dimensional vector models and complex retrieval logic
3. The advantages and application scenarios of OceanBase as a FastGPT vector database
One of the core pieces of magic behind RAG is the vector database.
It is RAG's "memory brain": it converts massive amounts of knowledge into vectors for storage, and efficiently retrieves the most relevant knowledge fragments when users ask questions.
These retrieval results are what drive the large model to generate high-quality answers, and they directly determine the overall quality of a RAG application.
FastGPT has always recommended PostgreSQL (with the pgvector extension) as the default "memory brain". To be fair, PostgreSQL is an excellent, robust open-source database, and it performs well in most scenarios.
However, as your application grows and the data volume increases, or once you start using higher-dimensional vector models (embeddings of a few thousand dimensions are common) and need more complex retrieval logic (for example, filtering by category first and then finding similar content), you will find that PostgreSQL starts to struggle with vector workloads.
We have also been thinking about how to make FastGPT stronger! So, today we bring you good news:
Now, in addition to PostgreSQL, you can also choose OceanBase as the vector database for FastGPT!
For users who pursue extreme performance, distributed scalability, and excellent ease of operation and maintenance, OceanBase is your ideal partner for using FastGPT in large-scale or complex application scenarios.
Limitations of PostgreSQL
Using PostgreSQL (pgvector) in depth, we (and many developers) have run into some rough edges. It's not that PG is bad; it's that faced with the "new species" of vector data, this old veteran can sometimes seem a little overwhelmed.
Specifically:
Vector dimension limit: high-dimensional models won't fit

Have you ever run into this? You excitedly adopt an embedding model with great results, only to find that its dimension is 2048 or higher. When you try to build a PG HNSW index on it, you hit a wall: pgvector's HNSW index supports at most 2000 dimensions at full precision. Beyond that, you either reduce the dimensionality (possibly losing precision) or give up on indexing it.

The pitfall of hybrid search: "I want precise, filtered results. Why is it so hard?"
This may be one of the most troublesome problems for RAG developers. Often we don't just want "the most similar results"; we want the most similar results under specific conditions. For example: "find only the paragraphs most relevant to 'database optimization' within the 'technical documents' category". This filter-first, search-second requirement is hybrid search.
However, PG's HNSW index (at least as of pgvector 0.8) does not natively support this kind of mixed filtering at the index level. You may have to recall a large batch of vectors via HNSW first and then run a second filtering pass at the application layer or in the database. That is not only inefficient; worse, if your data is deleted and updated frequently (quite common for vector workloads), PG's dead tuples can interfere with HNSW recall, and after filtering you may discover that the rows you actually wanted are simply missing!
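As a sketch of the pattern (the schema and values are hypothetical, with 3-dimensional vectors for brevity; real embeddings are far larger):

```sql
-- Hypothetical pgvector table with a scalar column and an HNSW index.
CREATE TABLE documents (
  id        bigserial PRIMARY KEY,
  category  text,
  embedding vector(3)
);
CREATE INDEX ON documents USING hnsw (embedding vector_cosine_ops);

-- "Filter, then search": the WHERE clause is applied to candidates that
-- the HNSW scan recalls by similarity alone. If 'tech-docs' rows are
-- rare, most candidates are discarded and you can get fewer than 5 hits.
SELECT id
FROM documents
WHERE category = 'tech-docs'
ORDER BY embedding <=> '[0.1, 0.2, 0.3]'
LIMIT 5;
```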
pgvector 0.8+ does introduce iterative index scans to alleviate this, but in our testing performance can actually get slower, and a query that used to use the index can suddenly stop using it, forcing you to rework the SQL. In terms of developer experience, it is a genuine pitfall.

The pain of VACUUM: space reclamation can't keep up
PG relies on the VACUUM mechanism to reclaim space that is no longer in use (the "holes" left behind after you delete or update data). This works well for traditional text and numeric data, but vectors are heavyweight: a single vector can easily take several KB or more.

When your RAG application holds a lot of data and deletes and updates are frequent, you will watch the database files balloon. PG's VACUUM can suffer a bit of "indigestion" here: reclamation cannot keep up with the rate of data churn. You end up either tolerating wasted disk space, manually running VACUUM FULL over and over (which also locks the table), or granting autovacuum more system resources. Either way, operations get tiring.
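For reference, the knobs in question look roughly like this (the table name is hypothetical, and the values are illustrative rather than recommendations):

```sql
-- Make autovacuum trigger earlier and work harder on a churn-heavy table.
ALTER TABLE documents SET (
  autovacuum_vacuum_scale_factor = 0.02,  -- vacuum after ~2% of rows are dead
  autovacuum_vacuum_cost_delay   = 2      -- throttle autovacuum less
);

-- Last resort: reclaims all space but rewrites the whole table under an
-- ACCESS EXCLUSIVE lock, blocking reads and writes while it runs.
VACUUM FULL documents;
```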
Why choose OceanBase?
OceanBase addresses each of these pain points:
Easily handle 4096 dimensions, and even higher

OceanBase's vector index supports vectors of up to 4096 dimensions by default, which already covers most mainstream embedding models on the market. Better still, this upper limit is configurable and can be raised. You can safely choose a higher-dimensional model for better results, without sacrificing model accuracy to dimensionality reduction because of a database limit.

Native hybrid search: accurate, efficient, and done in one step
This is arguably OceanBase's killer feature: its vector index natively supports hybrid search. In other words, at query time you can tell it directly: "find the vectors most similar to this description, but only under this category and that label". OceanBase performs precise scalar filtering at the index level while running an efficient vector similarity search.
The benefits of doing so are obvious:
Precise: the scope is narrowed before searching, so you find what you actually want, with no results silently going missing after filtering.
Efficient: the index layer handles everything directly, avoiding a second filtering pass at the application layer, so queries stay fast. No more worrying about the index being bypassed or writing contorted SQL to dodge pitfalls.
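Here is a rough sketch of what that looks like in SQL. The syntax below follows our reading of OceanBase 4.3's MySQL-mode vector support (a VECTOR column type, a vector index, and an approximate ORDER BY ... LIMIT query); treat it as an assumption and verify against the official documentation for your exact version:

```sql
-- Hypothetical schema; 3-dimensional vectors for brevity.
CREATE TABLE documents (
  id        BIGINT PRIMARY KEY,
  category  VARCHAR(64),
  embedding VECTOR(3),
  VECTOR INDEX idx_emb (embedding) WITH (distance=cosine, type=hnsw)
);

-- Scalar filter and approximate vector search in one statement: the
-- category predicate is evaluated inside the index scan rather than
-- as a second pass at the application layer.
SELECT id
FROM documents
WHERE category = 'tech-docs'
ORDER BY cosine_distance(embedding, '[0.1, 0.2, 0.3]')
APPROXIMATE LIMIT 5;
```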
Space recovery is more "intelligent": automatic management, worry-free and labor-saving!
At the storage layer, OceanBase uses a different architecture from PG (based on an LSM-Tree), which takes a distinctive approach to inserts, deletes, and space reclamation. Its space-reclamation mechanism is more complete and more automated, and therefore friendlier to data like vectors, which are large and may be updated frequently.
In short, you no longer have to babysit PG's VACUUM the way you used to. OB handles space reclamation more smoothly and efficiently in the background, curbing database bloat and greatly reducing your operations burden. You can focus on business logic instead of fighting the database over disk space every day.

Single-table multi-column index support: if your scenario needs indexes on several different vector columns in the same table (such as a title vector and a content vector), OceanBase supports that well too.
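Under the same syntax caveat as above (an assumed sketch of OceanBase's vector DDL, not a verified reference), multiple indexed vector columns on one table would look roughly like:

```sql
-- Hypothetical: independent indexes for title and content embeddings,
-- so either column can serve approximate nearest-neighbor queries.
CREATE TABLE articles (
  id          BIGINT PRIMARY KEY,
  title_emb   VECTOR(3),
  content_emb VECTOR(3),
  VECTOR INDEX idx_title   (title_emb)   WITH (distance=cosine, type=hnsw),
  VECTOR INDEX idx_content (content_emb) WITH (distance=cosine, type=hnsw)
);
```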
Distributed genes: OceanBase is a distributed database by design, with natural advantages in horizontal scalability and high availability under high concurrency and large data volumes. (You may not need FastGPT at that scale yet, but it is good to know the headroom is there.)
A good fit for localization (Xinchuang) projects: for projects with domestic-technology and Xinchuang compliance requirements, OceanBase is an excellent choice.
Embrace higher-dimensional, more powerful models.
Achieve more accurate and efficient hybrid retrieval.
Be freed from cumbersome VACUUM maintenance.

An active Sealos Cloud account.
Understand the basics of OceanBase deployment and the OceanBase Docker image version you plan to use. Refer to the official OceanBase documentation (https://www.oceanbase.com/docs/common-oceanbase-database-cn-1000000002013494) for instructions on containerized or standalone deployment.
Create an application in the Sealos Cloud console: Log in to Sealos Cloud (https://sealos.run), enter the Sealos Cloud console, navigate to the "Application Management" or "Application Launchpad" function module, click the "New Application" or "Create Stateless Service" button to start configuring a new application deployment.
Configure basic application information and network: refer to the interface shown in the figure for configuration
Image Name: Enter the name of the OceanBase Docker image you selected, e.g. oceanbase/oceanbase-ce. Make sure the image comes from an official or trusted source, and pin an explicit version tag.
App Name: Set an easily recognizable name for your OceanBase deployment.
Computing resources (CPU/memory): Configure appropriate CPU and memory resources based on OceanBase's resource requirements and your usage scenarios.
Network Configuration: Enable TCP port exposure. Click "Add Port" and set the container port to OceanBase's default service port, usually 2881. Set the corresponding service port and choose a suitable exposure method (for example, public network exposure so that external clients can connect to your database).

The image is typically configured through environment variables such as:
OB_SERVER_IP: the service IP address of the OceanBase node.
OB_TENANT_NAME: the name of the tenant to create or connect to.
OB_TENANT_PASSWORD: the tenant's password.
OB_SYS_PASSWORD: the password for the sys tenant (used for administration).

Core files such as database data and logs need persistent storage, to prevent data loss when the container is restarted or deleted.
In the Local Storage or Persistent Storage area, refer to the mount configuration shown in the figure below to mount a persistent storage volume for the OceanBase container.
Click "Add Storage Volume" , select or create a storage volume, and mount it to the key path inside the OceanBase container that stores data, logs, and configuration files. Be sure to consult the official documentation of the OceanBase image you are using to confirm which specific paths need to be mounted.
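Pulling the port, environment-variable, and storage settings together, the Sealos configuration corresponds roughly to this docker invocation (the image tag, credentials, and the container-side data path are placeholders; confirm the real mount paths against the image's documentation, as noted above):

```shell
docker run -d --name oceanbase \
  -p 2881:2881 \
  -e OB_SERVER_IP=127.0.0.1 \
  -e OB_TENANT_NAME=fastgpt \
  -e OB_TENANT_PASSWORD='change-me' \
  -e OB_SYS_PASSWORD='change-me-too' \
  -v /data/oceanbase:/root/ob \
  oceanbase/oceanbase-ce
```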
Click the Deploy or Launch button at the bottom of the page.
Sealos will start pulling the OceanBase image and create and start containers according to your configuration. The database startup and initialization process may take some time.
After successful deployment, check the status of your OceanBase application instance on the Application Details page of the Sealos console.
Check the log output to confirm that the OceanBase database has been successfully started and initialized.
Get the external access address (URL) and port assigned by Sealos.
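Once you have that address, a connection attempt from a terminal looks roughly like this (the host, tenant name, and password are placeholders; obclient speaks the MySQL protocol, so the flags mirror the mysql client's):

```shell
obclient -h <sealos-assigned-host> -P 2881 -u root@fastgpt -p'change-me'
```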
You can now connect with an OceanBase client tool, such as ODC (OceanBase Developer Center) or the command-line client obclient, to that address and port, using the username and password you set in the environment variables, to access and manage your OceanBase database.

Easily manage ultra-high-dimensional vectors (4096 dimensions by default, configurable higher), so your model selection is no longer limited.
It natively supports efficient and accurate hybrid retrieval; complex queries complete in one step, fast and precise.
A smarter, automated space-reclamation mechanism says goodbye to VACUUM worries and makes operations easier.

Consider OceanBase when:
The dimension of your embedding model is relatively high (for example, more than 2000 dimensions).
You need to frequently perform mixed searches (for example, first filter by user, department, time, etc., and then find similar content).
Your knowledge base has a large amount of data, and update and deletion operations are relatively frequent.
You have relatively high requirements for the query performance and scalability of RAG applications.
You want to minimize the hassles of database operation and maintenance (especially space reclamation) as much as possible.
The project has requirements for localization and information technology innovation.
PostgreSQL (pgvector) may still serve you well when:
Your project is relatively small in scale, with low data volume and QPS.
Mainly performs simple vector similarity searches, with few or uncomplicated hybrid retrieval requirements.
The vector dimensions used are not high (less than 2000 dimensions).
The team is very familiar with PostgreSQL, and the current operation and maintenance pressure is not great.
OceanBase also brings a few bonus points:
In summary, choosing OceanBase as the vector database for FastGPT means you can:
Sounds tempting, doesn't it?
Installation and deployment tutorial
Preparation:
Deployment steps:
3. Configure environment variables: In the Advanced Settings or Environment Variables area, add the environment variables needed to start and configure the OceanBase instance, following the official documentation for the Docker image you are using. For example:
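As one possible example, the variables listed in the Sealos configuration section above (names taken from this document; the values are placeholders for you to replace):

```
OB_SERVER_IP=127.0.0.1
OB_TENANT_NAME=fastgpt
OB_TENANT_PASSWORD=change-me
OB_SYS_PASSWORD=change-me-too
```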
4. Configure persistent storage:
5. Carefully check all configuration items, including image name, resources, ports, environment variables, and storage mount paths to ensure that the information is accurate.
6. Verify the deployment and connect to the database:
Connect FastGPT to this OceanBase instance
Important reference documents:
FastGPT official documentation: https://doc.fastgpt.cn/docs/
FastGPT GitHub Docker example (OceanBase): https://github.com/labring/FastGPT/tree/main/deploy/docker/docker-compose-oceanbase
Summary and Outlook
This time, FastGPT and OceanBase have joined hands to offer a better vector-database option for those who want higher performance, stronger functionality, and easier operations. PostgreSQL (pgvector) performs well in many scenarios, but it does face challenges with high-dimensional vectors, complex hybrid retrieval, and space management at large data scales.
OceanBase has demonstrated strong capabilities in these areas:
It can be said that the addition of OceanBase has actually solved some of the pain points that we (and you!) may encounter when using FastGPT, allowing your RAG applications to reach a higher level in terms of performance, functionality, and ease of use.
So, when should you use OceanBase?
Here are some tips for you:
The final choice is, of course, in your hands! We hope to provide more possibilities so that you can choose the most suitable "memory brain" for FastGPT according to your specific needs.


