cdcgo is a Go library that implements FastCDC, a high-performance content-defined chunking (CDC) algorithm for deduplication and data storage.
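Traditional fixed-size chunking deduplicates poorly when data shifts: inserting or deleting a single byte moves every later chunk boundary, so otherwise identical data no longer matches the chunks already stored.
Content-defined chunking (CDC) derives chunk boundaries from the data itself, so boundaries move with the content and deduplication stays efficient. It is well suited to: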
- Backups & snapshots
- Object storage (S3/MinIO)
- Large-file synchronization
- Idiomatic Go API for FastCDC
- Streaming support (`io.Reader`/`io.Writer`) that scales to GB-size files
- Chunk metadata (offset, size, SHA-256 hash)
- Benchmarks + tests
- Examples: split & reassemble files
- Dedupe helpers (chunk indexing)
- Storage backends: local FS + S3/MinIO
- CLI tool (`cdcbench`) for benchmarking & stats
- Configurable hash functions (SHA-1, SHA-256, BLAKE3)
go get github.com/AumSahayata/cdcgo

- pkg.go.dev documentation
- Examples available in the `examples/` folder (Soon!)
This project is licensed under the MIT License. See LICENSE for details.