An introduction to XET, Hugging Face's storage system (part 1)
Summary
This article introduces XET, Hugging Face's content-addressable storage system with chunk-level deduplication. It explains how chunking, xorbs, and Merkle-style integrity checks reduce storage and CDN transfer costs for large binaries, and it outlines potential applications beyond Hugging Face. More broadly, it presents XET as a general approach to storing large objects efficiently in workflows where artifacts are repeatedly cloned and updated.
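To make "content-addressable storage with chunk-level deduplication" concrete, here is a minimal sketch in Python. It is not XET's actual algorithm, chunk sizes, or on-disk format: the rolling hash, the `MASK`, `MIN_CHUNK`, and `MAX_CHUNK` values, and the names `chunk`, `put`, and `get` are all assumptions chosen for illustration. The idea it shows is that chunk boundaries are derived from the content itself, each chunk is stored under its hash, and a file becomes a list of chunk hashes, so an updated file only adds the chunks that actually changed.

```python
import hashlib
import random

# Illustrative parameters only (assumptions, not XET's real values).
MASK = (1 << 10) - 1   # cut when the low 10 bits of the rolling hash are zero
MIN_CHUNK = 256        # never cut before this many bytes
MAX_CHUNK = 8192       # always cut at this many bytes

# One fixed pseudo-random 32-bit value per possible byte value.
_table_rng = random.Random(0)
GEAR = [_table_rng.getrandbits(32) for _ in range(256)]


def chunk(data: bytes) -> list[bytes]:
    """Split data at content-defined boundaries using a simple rolling hash."""
    chunks = []
    start = 0
    h = 0
    for i, b in enumerate(data):
        h = ((h << 1) + GEAR[b]) & 0xFFFFFFFF
        length = i - start + 1
        if (length >= MIN_CHUNK and (h & MASK) == 0) or length >= MAX_CHUNK:
            chunks.append(data[start:i + 1])
            start = i + 1
            h = 0
    if start < len(data):
        chunks.append(data[start:])  # trailing partial chunk
    return chunks


def put(store: dict[str, bytes], data: bytes) -> tuple[list[str], int]:
    """Store data chunk by chunk; return (list of chunk hashes, new chunk count)."""
    recipe = []
    new_chunks = 0
    for piece in chunk(data):
        key = hashlib.sha256(piece).hexdigest()  # the content is the address
        if key not in store:
            store[key] = piece
            new_chunks += 1
        recipe.append(key)
    return recipe, new_chunks


def get(store: dict[str, bytes], recipe: list[str]) -> bytes:
    """Rebuild a file from its list of chunk hashes."""
    return b"".join(store[key] for key in recipe)


# Demo: store a file, then a lightly edited copy of it.
data_rng = random.Random(42)
v1 = bytes(data_rng.getrandbits(8) for _ in range(200_000))
v2 = v1[:100_000] + b"a small edit" + v1[100_000:]

store: dict[str, bytes] = {}
recipe_v1, new_v1 = put(store, v1)
recipe_v2, new_v2 = put(store, v2)
print(f"v1: {len(recipe_v1)} chunks, {new_v1} newly stored")
print(f"v2: {len(recipe_v2)} chunks, {new_v2} newly stored")
```

Because boundaries depend on the bytes rather than on fixed offsets, inserting a few bytes in the middle of the file leaves every chunk before the edit unchanged, and the chunks after it typically realign within a short distance. Fixed-size blocks would instead shift every block after the insertion point.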