DeepSeek Open Source Week, Day 5: 3FS, a high-performance distributed file system

A look at where AI storage is heading: 3FS moves past the limits of traditional file systems with notably high performance.
Core content:
1. 3FS's disaggregated architecture and aggressive use of modern hardware
2. 3FS's multi-scenario adaptability throughout the AI life cycle
3. 3FS's hardcore performance data and its deep understanding of AI workloads
"Today, deepseek-ai released 3FS (Fire-Flyer File System), an open-source distributed file system. Its reported figures are striking: 6.6 TiB/s of aggregate read throughput and a 110.5 TiB sort completed in about 30 minutes, aimed squarely at the AI storage bottleneck."
3FS official repository:
https://github.com/deepseek-ai/3fs
01
—
The core secret of 3FS: a technical breakdown
The killer feature of 3FS is its disaggregated architecture and its aggressive use of modern hardware. It combines the throughput of thousands of SSDs with the high bandwidth of RDMA networks to build a shared storage layer that applications can use without regard to data locality. In short, where a piece of data physically lives should matter little to how fast it can be read.
Strong consistency: 3FS implements CRAQ (Chain Replication with Apportioned Queries), so a read returns the latest committed write even though the data is replicated across nodes.
Stateless metadata: The metadata services are stateless and backed by a transactional key-value store such as FoundationDB. On the client side, 3FS exposes the familiar file interface, so developers do not need to learn a new storage API and the cost of adoption is low.
Multi-scenario adaptation: 3FS covers almost the entire AI life cycle, from data preparation and random access to training samples to high-throughput checkpointing and KVCache for inference.
This design makes 3FS not only a file system, but more like a "data accelerator" tailored for AI.
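To make the CRAQ idea above more concrete, here is a toy in-memory model in Python. It is only a sketch under stated assumptions, not 3FS code: the `Node` and `Chain` classes, the `reach` parameter, and the key name `chunk-0` are all hypothetical, and real systems must also handle networking, failures, and chain reconfiguration. The point it illustrates is that any replica may serve a read, and a replica holding an uncommitted ("dirty") version asks the tail which version is committed before answering.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """One replica in the chain (hypothetical toy model)."""
    name: str
    versions: dict = field(default_factory=dict)   # key -> {version: value}
    committed: dict = field(default_factory=dict)  # key -> highest clean version

    def store_dirty(self, key, version, value):
        # A newly received write is kept next to older versions until committed.
        self.versions.setdefault(key, {})[version] = value

    def mark_clean(self, key, version):
        # Once a version is committed, older versions can be dropped.
        self.committed[key] = version
        kept = {v: val for v, val in self.versions[key].items() if v >= version}
        self.versions[key] = kept


class Chain:
    def __init__(self, names):
        self.nodes = [Node(n) for n in names]  # nodes[0] is head, nodes[-1] is tail
        self.last_version = {}

    def start_write(self, key, value, reach):
        """Forward a write from the head to the first `reach` nodes as dirty."""
        version = self.last_version.get(key, 0) + 1
        self.last_version[key] = version
        for node in self.nodes[:reach]:
            node.store_dirty(key, version, value)
        return version

    def finish_write(self, key, version, value):
        """Deliver the write to every node, then commit from tail back to head."""
        for node in self.nodes:
            node.store_dirty(key, version, value)
        for node in reversed(self.nodes):
            node.mark_clean(key, version)

    def write(self, key, value):
        version = self.start_write(key, value, reach=len(self.nodes))
        self.finish_write(key, version, value)

    def read(self, key, node_index):
        """Serve a read from any replica (the 'apportioned queries' part)."""
        node = self.nodes[node_index]
        held = node.versions.get(key, {})
        if not held:
            return None
        latest = max(held)
        if node.committed.get(key, 0) == latest:
            return held[latest]  # clean: answer locally
        # Dirty: ask the tail which version is committed and return that one.
        tail_committed = self.nodes[-1].committed.get(key, 0)
        return held.get(tail_committed)


chain = Chain(["head", "middle", "tail"])
chain.write("chunk-0", "v1")
pending = chain.start_write("chunk-0", "v2", reach=2)  # v2 is still in flight
print([chain.read("chunk-0", i) for i in range(3)])    # every replica answers "v1"
chain.finish_write("chunk-0", pending, "v2")
print([chain.read("chunk-0", i) for i in range(3)])    # every replica answers "v2"
```

While the second write is in flight, the head and middle nodes hold a dirty `v2` but still answer `v1`, which is why reads can be spread over all replicas without giving up strong consistency.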
02
—
3FS hardcore performance
The performance data of 3FS is jaw-dropping:
Peak throughput: In a read stress test with 180 storage nodes and more than 500 clients, aggregate read throughput reached about 6.6 TiB/s, even with background traffic running.
GraySort test: With 25 storage nodes and 50 compute nodes, 3FS sorted 110.5 TiB in 30 minutes and 14 seconds, an average of 3.66 TiB per minute, a very strong result for distributed sorting.
KVCache for inference: Peak read throughput reaches 40 GiB/s. Compared with a traditional DRAM cache it is cheaper and offers much larger capacity, which makes it a good fit for inference workloads.
Behind these numbers is 3FS's deep understanding of AI workloads: it is not only fast, it also handles high concurrency and complex tasks stably.
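As a quick sanity check on the figures above, the short Python sketch below recomputes the GraySort rate and derives a per-storage-node read rate from the published totals. The helper names are hypothetical, and the per-node figure is a plain average that assumes load is spread evenly across the 180 storage nodes, which the source does not state.

```python
GIB_PER_TIB = 1024


def tib_per_minute(total_tib, minutes, seconds):
    """Average sort rate from total data volume and elapsed wall-clock time."""
    return total_tib / (minutes + seconds / 60)


def gib_per_second_per_node(aggregate_tib_per_s, nodes):
    """Average per-node rate, assuming an even spread across nodes."""
    return aggregate_tib_per_s * GIB_PER_TIB / nodes


sort_rate = tib_per_minute(110.5, 30, 14)
read_per_storage_node = gib_per_second_per_node(6.6, 180)

print(f"GraySort: {sort_rate:.2f} TiB/min")
print(f"Read stress test: {read_per_storage_node:.1f} GiB/s per storage node")
```

From the rounded inputs, the sort rate comes out at roughly 3.65 TiB per minute, marginally below the reported 3.66, presumably because the published volume and time are themselves rounded. The read test works out to about 37.5 GiB/s per storage node on average.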
03
—
3FS's ambition and potential
The goal of 3FS is clearly more than being "fast storage." It tries to simplify the development of distributed AI applications through a unified shared storage layer. Imagine: no need to prefetch data, no need to manually shuffle samples, checkpoints completed in seconds, and inference cache costs cut in half. For AI engineers, that is close to a dream scenario.
More importantly, the open source nature of 3FS allows developers to customize it freely. It may become an important piece of the puzzle in the AI ecosystem and even challenge the existing distributed file system landscape.
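The "no need to manually shuffle samples" point can be illustrated with ordinary file calls. The Python sketch below assumes the dataset is stored as fixed-size records in a single file on a path where the storage layer is mounted. The record layout, the function names, and the use of a temporary directory as a stand-in for that mount are all hypothetical and only for illustration. Instead of rewriting the data in shuffled order, the loader shuffles indices and reads each record by offset, which is workable when random reads are cheap.

```python
import os
import random
import struct
import tempfile

# Hypothetical record layout: 4-byte sample id followed by a 60-byte payload.
RECORD = struct.Struct("<I60s")


def write_samples(path, count):
    """Write `count` fixed-size records in sequential order."""
    with open(path, "wb") as f:
        for i in range(count):
            f.write(RECORD.pack(i, f"sample-{i}".encode()))


def read_sample(f, index):
    """Random access: seek straight to the record and decode it."""
    f.seek(index * RECORD.size)
    sample_id, payload = RECORD.unpack(f.read(RECORD.size))
    return sample_id, payload.rstrip(b"\x00").decode()


def random_epoch(path, count, seed):
    """Visit every sample once in a shuffled order without rewriting the file."""
    order = list(range(count))
    random.Random(seed).shuffle(order)  # shuffle the indices, not the data
    with open(path, "rb") as f:
        return [read_sample(f, i) for i in order]


# A temporary directory stands in for the mounted storage path.
with tempfile.TemporaryDirectory() as tmp:
    dataset = os.path.join(tmp, "samples.bin")
    write_samples(dataset, 8)
    for sample_id, payload in random_epoch(dataset, 8, seed=0):
        print(sample_id, payload)
```

Nothing here is specific to 3FS, which is the point: the access pattern uses only the standard file interface, and the storage layer is what decides whether it is fast.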