Implementing Stores
This guide provides comprehensive instructions for implementing custom metadata stores, local block stores, and remote block stores for DittoFS. Whether you’re building a database-backed metadata store or a custom cloud storage integration, this document will walk you through the process with best practices and practical examples.
Table of Contents
Section titled “Table of Contents”- Overview
- When to Implement Custom Stores
- Understanding the Architecture
- Implementing Metadata Stores
- Implementing a Local Store
- Implementing a Remote Store
- Best Practices
- Testing Your Implementation
- Common Pitfalls
- Integration with DittoFS
Overview
Section titled “Overview”DittoFS uses a two-tier block store architecture with three distinct store types:
- Metadata Stores: Simple CRUD operations for file/directory structure, attributes, permissions
- Local Block Stores: Fast, per-share storage on local disk or in memory (L2 cache tier)
- Remote Block Stores: Durable storage in S3 or compatible object stores (L3 tier, shared across shares via ref counting)
Key Design Principle: Each share gets its own *engine.BlockStore instance that composes a local store, optional remote store, and syncer. The engine orchestrates reads and writes across the tiers. Local stores provide fast access; remote stores provide durability.
This separation enables:
- Independent scaling of metadata and block storage
- Per-share isolation with shared remote backends
- Different storage tiers (hot/cold storage, SSD/HDD)
- Simple store implementations (just implement the interface, the engine handles coordination)
When to Implement Custom Stores
Section titled “When to Implement Custom Stores”Metadata Store Use Cases
Section titled “Metadata Store Use Cases”Implement a custom metadata store when you need:
- Database-backed storage: PostgreSQL, MySQL, MongoDB, Cassandra
- Distributed metadata: Multi-node coordination, consensus protocols
- Advanced features: Full-text search, custom indexing, complex queries
- Compliance: Audit logs, versioning, immutability guarantees
Example: A PostgreSQL-backed metadata store for enterprise environments requiring audit trails and high availability.
Local Store Use Cases
Section titled “Local Store Use Cases”Implement a custom local store when you need:
- Specialized local storage: NVMe-optimized, hardware-accelerated compression
- Custom eviction: Access-pattern-aware eviction beyond simple LRU
- Encryption at rest: Hardware-accelerated encryption for local blocks
Reference implementation: pkg/blockstore/local/fs/ (filesystem-backed local store)
Remote Store Use Cases
Section titled “Remote Store Use Cases”Implement a custom remote store when you need:
- Cloud storage integration: Azure Blob, Google Cloud Storage, custom object stores
- Specialized storage: Tape archives, HSM systems, data lakes
- Tiering: Automatic hot/cold data movement based on access patterns
Reference implementation: pkg/blockstore/remote/s3/ (S3-backed remote store)
Understanding the Architecture
Section titled “Understanding the Architecture”Per-Share BlockStore
Section titled “Per-Share BlockStore”Each share gets its own *engine.BlockStore instance:
┌─────────────────────────────────────┐│ engine.BlockStore (per-share) ││ ││ ┌─────────────┐ ┌─────────────┐ ││ │ LocalStore │ │ RemoteStore │ ││ │ (required) │ │ (optional) │ ││ └──────┬──────┘ └──────┬──────┘ ││ │ │ ││ └───────┬────────┘ ││ │ ││ ┌──────▼──────┐ ││ │ Syncer │ ││ │ (async xfer)│ ││ └─────────────┘ │└─────────────────────────────────────┘- LocalStore is required — all reads and writes go through local storage first
- RemoteStore is optional — when configured, the Syncer asynchronously uploads local blocks to remote storage
- Ref counting: Remote stores are shared across shares; when the last share using a remote store is removed, the connection is closed
File Handle and Block Resolution
Section titled “File Handle and Block Resolution”Protocol handlers resolve the per-share block store via GetBlockStoreForHandle(ctx, handle):
- File handle encodes the share name
- Runtime extracts share name and returns the share’s BlockStore
- Handler calls
ReadAt/WriteAton the BlockStore
Implementing Metadata Stores
Section titled “Implementing Metadata Stores”The metadata store interface and implementation guide remains the same as before. See the pkg/metadata/Store interface and reference implementations:
pkg/metadata/store/memory/: In-memory (fast, ephemeral)pkg/metadata/store/badger/: BadgerDB (persistent, embedded)pkg/metadata/store/postgres/: PostgreSQL (persistent, distributed)
Conformance tests: pkg/metadata/storetest/
MetadataStore.EnumerateFileBlocks (v0.15.0 Phase 11; lifted from FileBlockStore in Phase 12)
Section titled “MetadataStore.EnumerateFileBlocks (v0.15.0 Phase 11; lifted from FileBlockStore in Phase 12)”v0.15.0 (Phase 11 / A2) added a mandatory cursor method that the mark-sweep garbage collector uses to enumerate every live block hash without loading the full file/block set into application memory.
Phase 12 / META-03 / D-08 note:
EnumerateFileBlocksnow lives onMetadataStore, notFileBlockStore. Conceptually it iterates across files for the GC mark phase — a metadata-store-wide concern — so it moved up the stack whenFileBlockStorewas narrowed to its 6-method spec surface (see FileBlockStore narrowing (v0.15.0 Phase 12) below). Backends that already implemented it on the FileBlockStore in Phase 11 need only re-attach the same code to theMetadataStoreinterface.
// EnumerateFileBlocks streams every FileBlock's ContentHash to fn.// Implementations MUST:// - Iterate using a backend-native cursor (Badger prefix iterator,// Postgres server-side cursor with batched fetch, in-memory map// iteration) -- no full-set load.// - Honor ctx.Done(): return ctx.Err() promptly when the context is// cancelled.// - Emit zero-hash FileBlocks (legacy pre-Phase-11 data) the same way// as non-zero-hash blocks; the GC live-set ignores zero hashes.// - Abort iteration and return the fn error verbatim if fn returns// non-nil; do NOT swallow it.// - Be safe under concurrent writes: it is acceptable for the cursor// to miss FileBlocks created mid-iteration; the next mark cycle// will pick them up.EnumerateFileBlocks(ctx context.Context, fn func(ContentHash) error) errorConformance scenarios live in pkg/metadata/storetest/ and every
backend MUST pass them:
- Empty store:
fnis never invoked; returnsnil. - Single file:
fnis invoked once per FileBlock for the file. - Large fanout (
Nfiles ×Mblocks):fnis invoked exactlyN*Mtimes in any order; no duplicates, no omissions. - fn-error mid-iteration: returning a non-nil error from
fnaborts iteration and propagates the error. - Context cancellation: cancelling
ctxmid-iteration causes the call to returnctx.Err()within the polling interval.
Memory-store reference: direct range over the in-memory map.
Badger-store reference: txn.NewIterator over the FileBlock prefix.
Postgres-store reference: server-side cursor (DECLARE + FETCH)
with batches of 1000 rows.
FileBlockStore narrowing (v0.15.0 Phase 12)
Section titled “FileBlockStore narrowing (v0.15.0 Phase 12)”v0.15.0 (Phase 12 / A3) narrows pkg/blockstore.FileBlockStore to
exactly 6 hash-keyed CRUD methods per the META-03 spec (D-09).
Backend implementations are simpler; engine-internal helpers move to a
separate wider interface.
// pkg/blockstore/store.go (Phase 12 spec-literal surface)type FileBlockStore interface { // GetByHash looks up the FileBlock with the given content hash. // Returns ErrFileBlockNotFound when no row matches. GetByHash(ctx context.Context, hash ContentHash) (*FileBlock, error)
// Put inserts or updates a FileBlock keyed by hash. Put(ctx context.Context, fb *FileBlock) error
// Delete removes the FileBlock with the given hash. Idempotent — // deleting a non-existent hash is not an error. Delete(ctx context.Context, hash ContentHash) error
// IncrementRefCount atomically bumps the FileBlock's RefCount by 1. IncrementRefCount(ctx context.Context, hash ContentHash) error
// DecrementRefCount atomically decrements RefCount; returns the new // value. Callers MUST NOT decrement below zero. DecrementRefCount(ctx context.Context, hash ContentHash) (int64, error)
// ListPending streams every Pending FileBlock to fn. Used by syncer. ListPending(ctx context.Context, fn func(*FileBlock) error) error}Engine-internal companion interface: pkg/blockstore.EngineFileBlockStore
extends FileBlockStore with GetFileBlock(ctx, id) and
ListFileBlocks(ctx, fn) for the engine’s hot paths. All three built-in
backends (memory, badger, postgres) satisfy it without changes — the
narrow public surface is a documentation concern, not a runtime
restriction. Custom backends implementing FileBlockStore SHOULD also
implement the engine-internal helpers if they intend to slot into the
*engine.BlockStore.
Internal storage shape is up to the backend. The Phase 11 schema
(id VARCHAR PRIMARY KEY + hash non-unique index) stays in place to
honor the WR-4-01 multiple-rows-per-hash contract for legacy data;
the public GetByHash surface hides that detail. Phase 15 will
collapse to hash-PK once Phase 14 migrates legacy data.
FileAttr.Blocks []BlockRef (v0.15.0 Phase 12)
Section titled “FileAttr.Blocks []BlockRef (v0.15.0 Phase 12)”Phase 12 reintroduces FileAttr.Blocks []blockstore.BlockRef as the
authoritative content list for every file (META-01 / META-04 / D-01..D-05).
BlockRef is the 3-tuple of (Hash, Offset, Size) — see
pkg/blockstore/types.go. The list MUST be sorted by Offset and is
populated on every sync finalization.
Encoding requirements per backend:
- Postgres: a separate
file_block_refsjoin table keyed by(file_id, offset), withINCLUDE (size, hash)for index-only scans on the read hot path. Foreign keyfile_id REFERENCES files(id) ON DELETE CASCADEprovides a safety net — the engine still decrementsfile_blocks.RefCountfor every BlockRef BEFORE deleting the file; cascade catches engine-bug paths that miss the explicit decrement. Hash column isBYTEA(32 bytes), not hexTEXT. Migration:pkg/metadata/store/postgres/migrations/000012_file_block_refs.up.sql. - Badger and Memory: inline-encode
Blocks []BlockRefinside the existingFileAttrblob. Badger goes throughpkg/metadata/store/badger/encoding.go(gob); Memory holds typed structs directly. Useomitemptyso legacy pre-Phase-12 blobs decode cleanly with an emptyBlocksslice (D-06 dual-read trigger).
A new metadata-store method persists the list; in the built-in
backends this is MetadataStore.SetFileBlocks(ctx, handle, []BlockRef, authCtx) error. Custom metadata backends MUST persist atomically
with the same transaction that updates Size/Mtime/Ctime — the
engine relies on caller-side metadata-txn isolation rather than a
per-chunk metadata roundtrip (D-22 caller-snapshot semantics).
Conformance scenarios
Section titled “Conformance scenarios”The pkg/metadata/storetest/ suite extends the Phase 11 tests with:
- BlockRef round-trip:
SetFileBlocksfollowed byGetFileAttrreturns the same offset-sorted slice, byte-for-byte. - Empty / legacy compat:
FileAttrblobs without aBlocksfield decode to an empty slice without errors. - FK cascade (Postgres-only): deleting a file removes all
matching
file_block_refsrows. - INV-02 reconcile:
∑ FileBlock.RefCountover the FileBlockStore equals∑ len(FileAttr.Blocks)over the MetadataStore at every quiescent point. - INV-02 concurrent fuzz (
pkg/metadata/storetest/inv02_fuzz_test.go): 100-iteration property-based fuzzer creating, deleting, and copying files concurrently; asserts the invariant after each operation batch. Runs against all 3 built-in backends and any custom backend wired into the conformance harness.
FileAttr.ObjectID + FindByObjectID (v0.15.0 Phase 13)
Section titled “FileAttr.ObjectID + FindByObjectID (v0.15.0 Phase 13)”Phase 13 adds FileAttr.ObjectID — a BLAKE3 Merkle root over
BlockRef.Hash values sorted by Offset, prefixed by the
domain-separation tag dittofs:objectid:v1\x00. Computed by
blockstore.ComputeObjectID and persisted at every full quiesce in the
same metadata transaction that updates Blocks/Size/Mtime (META-02
/ BSCAS-04 / D-05..D-07).
Lifecycle: cleared (zeroed) on first dirty write that mutates Blocks,
recomputed at next full quiesce (every block transitioned to Remote).
Partial flushes leave ObjectID at zero so the lookup index never
returns a half-quiesced file.
FindByObjectID(ctx, ObjectID) ([]BlockRef, error)
Section titled “FindByObjectID(ctx, ObjectID) ([]BlockRef, error)”The Phase 13 BSCAS-05 short-circuit primitive. Looks up a file by its
Merkle-root ObjectID. Returns (nil, nil) on miss; non-nil result
carries the canonical BlockRef list of the matching file (per-metadata-
store scope, NOT per-share — D-13).
Backends MUST maintain a secondary index from ObjectID to file row:
| Backend | Index |
|---|---|
| Postgres | Partial unique: files_object_id_idx ON files(object_id) WHERE object_id IS NOT NULL (migration 000013_object_id.up.sql) |
| Badger | Secondary key obj:{hex} -> file_id, maintained inside each Put/Delete write batch |
| Memory | map[ContentHash]uuid, guarded by the existing store mutex |
Zero-valued ObjectID (legacy / pre-quiesce) MUST NOT match any row —
implementations short-circuit and return (nil, nil) on zero input.
The unique constraint enforces concurrent-quiesce conflict (D-14
first-committer-wins). On race, the loser surfaces a backend-specific
unique-violation error that the runtime coordinator wraps into the
shared metadata.ErrConflict sentinel.
A test-only optional capability ObjectIDIndexAccessor.CountObjectIDIndexRows
is exercised by the storetest ConcurrentQuiesceRace scenario;
backends implement it inline (e.g., SELECT count(*) for Postgres,
txn.Get(keyObjectID(oid))-shape for Badger, direct map probe for
Memory). Production code MUST NOT call it.
Conformance scenarios live in
pkg/metadata/storetest/objectid_roundtrip.go and
pkg/metadata/storetest/objectid_lookup.go. All built-in backends pass
without per-backend t.Skip (the ObjectIDIndexAccessor capability is
the only legitimate type-assertion-skip; backends without that accessor
are still required to pass the functional scenarios).
Block layout flag (v0.15+)
Section titled “Block layout flag (v0.15+)”Metadata backends MUST persist a block_layout field on the share
record. The field is a metadata.BlockLayout enum (string-shaped on
the wire) with values legacy or cas-only. Plan 14-01 introduced
the field as part of the v0.15.0 milestone (Phase 14 / A5;
#425).
type BlockLayout uint8
const ( BlockLayoutLegacy BlockLayout = iota // dual-read: legacy + CAS coexist BlockLayoutCASOnly // CAS-only: legacy reads fail loud)
type ShareOptions struct { // ... pre-existing fields ... BlockLayout BlockLayout}Forward-compat rule: Empty / missing values MUST coerce to
BlockLayoutLegacy on read so pre-Phase-14 metadata rows decode
cleanly. Use the metadata.ParseBlockLayout("") helper, which
returns BlockLayoutLegacy. Unknown values (anything other than
legacy / cas-only) MUST surface metadata.ErrInvalidBlockLayout
rather than coercing — silent coercion would mask bugs in upstream
backends.
Round-trip invariant: UpdateShareOptions(BlockLayout=cas-only)
followed by GetShare(name) MUST observe cas-only. The migration
tool’s cutover (D-A7) depends on this txn being durable AND visible
to the engine’s next share-open.
The conformance suite scenario RunBlockLayoutSuite exercises
round-trip and update semantics. New backends MUST invoke it from
their per-backend test file:
import "github.com/marmos91/dittofs/pkg/metadata/storetest"
func TestBlockLayoutConformance(t *testing.T) { storetest.RunBlockLayoutSuite(t, factoryFunc)}The suite asserts:
- Default is
legacyfor newly created shares (pre-existing shares upgraded into v0.15+ default tolegacy). - Round-trip of
UpdateShareOptions(BlockLayout=cas-only)→GetShare()returnscas-only. - Empty-string coercion at the parsing boundary
(
ParseBlockLayout("") == BlockLayoutLegacy). - Unknown-value rejection via
ErrInvalidBlockLayout. - Atomic update semantics — concurrent updates of the same
share’s
BlockLayoutserialize through the backend’s existing share-update path (D-A7 piggybacks on this).
Recommended persistence shape per backend:
| Backend | Recommended layout |
|---|---|
| Postgres | Dedicated block_layout TEXT NOT NULL DEFAULT 'legacy' column on shares (migration 000014_block_layout.up.sql is the reference). Authoritative over the legacy options JSON blob. |
| Badger | Inline-encoded inside the existing ShareOptions blob (gob; omitempty on the new field for forward-compat with pre-Phase-14 rows). |
| Memory | Direct field on the in-process struct; no persistence layer. |
Cross-references:
- ARCHITECTURE.md — Migration & Block-Layout Routing for how the engine consumes the flag.
- BLOCKSTORE_MIGRATION.md — Phase 14 runbook for the operator-facing migration story.
Engine API surface (Phase 12 / API-01..04)
Section titled “Engine API surface (Phase 12 / API-01..04)”Custom block-store implementations that compose into *engine.BlockStore
do not see the engine API directly — that surface is consumed by
adapters via internal/adapter/common/. For reference, the Phase 12
signatures are:
ReadAt(ctx, payloadID, blocks []BlockRef, dest []byte, offset uint64) (int, error)WriteAt(ctx, payloadID, currentBlocks []BlockRef, data []byte, offset uint64) ([]BlockRef, error)Truncate(ctx, payloadID, currentBlocks []BlockRef, newSize uint64) ([]BlockRef, error)Delete(ctx, payloadID, blocks []BlockRef) errorCopyPayload(ctx, srcPayloadID, srcBlocks []BlockRef, dstPayloadID) ([]BlockRef, error)nil or empty blocks triggers the Phase 11 dual-read shim (D-20)
for legacy files that have no populated FileAttr.Blocks yet.
Non-empty blocks is the CAS path with end-to-end BLAKE3 verification
(INV-06).
Implementing a Local Store
Section titled “Implementing a Local Store”Local stores provide fast, per-share block storage. Each share gets an isolated local storage directory.
The LocalStore Interface
Section titled “The LocalStore Interface”The pkg/blockstore/local.LocalStore interface defines the contract:
type LocalStore interface { // ReadAt reads block data at the given offset ReadAt(ctx context.Context, blockID string, p []byte, offset int64) (int, error)
// WriteAt writes block data at the given offset WriteAt(ctx context.Context, blockID string, data []byte, offset int64) (int, error)
// Delete removes a block from local storage Delete(ctx context.Context, blockID string) error
// Exists checks if a block exists locally Exists(ctx context.Context, blockID string) (bool, error)
// Flush ensures all pending writes are persisted Flush(ctx context.Context, blockID string) error
// List returns all block IDs in local storage List(ctx context.Context) ([]string, error)
// Close releases resources Close() error}Implementation Pattern
Section titled “Implementation Pattern”package mylocal
import ( "context")
type MyLocalStore struct { basePath string // Your backend (filesystem, NVMe, etc.)}
func New(basePath string) (*MyLocalStore, error) { // Initialize your storage backend return &MyLocalStore{basePath: basePath}, nil}
func (s *MyLocalStore) ReadAt(ctx context.Context, blockID string, p []byte, offset int64) (int, error) { // Read block data from your backend // Return bytes read and any error}
func (s *MyLocalStore) WriteAt(ctx context.Context, blockID string, data []byte, offset int64) (int, error) { // Write block data to your backend // Return bytes written and any error}
// ... implement remaining interface methodsReference Implementation
Section titled “Reference Implementation”See pkg/blockstore/local/fs/ for a complete filesystem-backed local store implementation that handles:
- Per-share isolated directories
- Atomic writes
- Block listing for sync operations
Phase 10 additions (flag-gated, experimental)
Section titled “Phase 10 additions (flag-gated, experimental)”v0.15.0 Phase 10 adds a hybrid append-log + content-addressed (CAS) chunk
tier inside *fs.FSStore, gated by the use_append_log config flag
(defaults to false; see docs/CONFIGURATION.md). New methods on
*fs.FSStore (NOT yet on the LocalStore interface): AppendWrite,
StoreChunk, ReadChunk, HasChunk, DeleteChunk, DeleteAppendLog,
TruncateAppendLog, StartRollup. The LocalStore interface is
deliberately unchanged in v0.15.0 Phase 10 — LSL-07 narrows it in Phase
11 (A2). Existing LocalStore implementations in v0.15.0 Phase 10 do NOT
need to change to stay compatible.
A new per-file metadata surface metadata.RollupStore (two methods:
SetRollupOffset, GetRollupOffset) is required only when
use_append_log=true. The built-in memory, Badger, and Postgres backends
all implement it. New metadata backends targeting the hybrid tier must add
equivalent persistence keyed by payloadID, backed by an atomic upsert
(metadata is source of truth for the log’s rollup_offset; see
docs/ARCHITECTURE.md INV-03). Backends that stay on the legacy write
path need not implement RollupStore.
Experimental: Do not enable use_append_log in production before
v0.15.0 Phase 11 lands — the blocks/ directory grows unbounded without
Phase 11’s mark-sweep GC.
Conformance Tests
Section titled “Conformance Tests”Test your local store with the conformance suite:
package mylocal_test
import ( "testing" "github.com/marmos91/dittofs/pkg/blockstore/local/localtest")
func TestMyLocalStore(t *testing.T) { store, cleanup := createTestStore(t) defer cleanup() localtest.RunLocalStoreTests(t, store)}Implementing a Remote Store
Section titled “Implementing a Remote Store”Remote stores provide durable block storage shared across shares via ref counting.
The RemoteStore Interface
Section titled “The RemoteStore Interface”The pkg/blockstore/remote.RemoteStore interface defines the contract:
type RemoteStore interface { // ReadBlock reads an entire block from remote storage ReadBlock(ctx context.Context, blockID string) ([]byte, error)
// WriteBlock writes an entire block to remote storage WriteBlock(ctx context.Context, blockID string, data []byte) error
// DeleteBlock removes a block from remote storage DeleteBlock(ctx context.Context, blockID string) error
// HealthCheck verifies the remote store is accessible HealthCheck(ctx context.Context) error
// Close releases resources Close() error}Implementation Pattern
Section titled “Implementation Pattern”package myremote
import ( "context")
type MyRemoteStore struct { client *MyCloudClient bucket string}
func New(config Config) (*MyRemoteStore, error) { client, err := connectToCloud(config) if err != nil { return nil, err } return &MyRemoteStore{client: client, bucket: config.Bucket}, nil}
func (s *MyRemoteStore) ReadBlock(ctx context.Context, blockID string) ([]byte, error) { // Fetch block from cloud storage return s.client.GetObject(ctx, s.bucket, blockID)}
func (s *MyRemoteStore) WriteBlock(ctx context.Context, blockID string, data []byte) error { // Upload block to cloud storage return s.client.PutObject(ctx, s.bucket, blockID, data)}
func (s *MyRemoteStore) DeleteBlock(ctx context.Context, blockID string) error { // Remove block from cloud storage (idempotent) err := s.client.DeleteObject(ctx, s.bucket, blockID) if err != nil && !isNotFoundError(err) { return err } return nil}
func (s *MyRemoteStore) HealthCheck(ctx context.Context) error { // Verify connectivity (e.g., HEAD bucket) return s.client.HeadBucket(ctx, s.bucket)}
func (s *MyRemoteStore) Close() error { return s.client.Close()}Ref Counting
Section titled “Ref Counting”Remote stores are shared across shares via ref counting:
- When a share is created referencing a remote store, the ref count increments
- When a share is removed, the ref count decrements
- When the ref count reaches zero,
Close()is called
This means your Close() implementation should release all resources (connections, goroutines, etc.).
Reference Implementation
Section titled “Reference Implementation”See pkg/blockstore/remote/s3/ for a production S3 remote store implementation with:
- Configurable retry with exponential backoff
- Health check via HEAD bucket
- Efficient multipart uploads for large blocks
CAS contracts (v0.15.0 Phase 11)
Section titled “CAS contracts (v0.15.0 Phase 11)”v0.15.0 (Phase 11 / A2) routes all new uploads through a
content-addressable keyspace cas/{hh}/{hh}/{hex} and verifies every
byte downloaded from the remote against the expected BLAKE3 hash. Two
new contract methods are required for any RemoteStore implementation
that wants to participate in the v0.15.0 write path; backends remaining
on the legacy {payloadID}/block-{N} keyspace continue to work via
the dual-read shim until Phase 14 (A5).
RemoteStore.WriteBlockWithHash
Section titled “RemoteStore.WriteBlockWithHash”// WriteBlockWithHash uploads data under the CAS key derived from h// and sets a backend-native object-metadata header carrying the hash.//// Semantics:// - The key MUST be derived from h via FormatCASKey (cas/{hh}/{hh}/{hex}).// - The backend-native object metadata MUST set "content-hash" to// "blake3:" + hex(h). For S3, this becomes the user-metadata header// x-amz-meta-content-hash. For other backends, set the equivalent// custom-metadata field.// - The PUT MUST be atomic: either the object exists at the CAS key// with the correct bytes AND the metadata header, or it does not// exist at all.// - The call MUST be idempotent: re-uploading the same h with the// same bytes is a no-op (or an overwrite that yields identical// state). This is what makes the syncer's restart-recovery janitor// safe.// - Errors are returned as typed values mapped through// internal/adapter/common/.WriteBlockWithHash(ctx context.Context, blockKey string, hash ContentHash, data []byte) errorExternal tooling (e.g. aws s3api head-object) MUST be able to verify
the header without DittoFS metadata — this is the BSCAS-06 external
verifier criterion.
RemoteStore.ReadBlockVerified
Section titled “RemoteStore.ReadBlockVerified”// ReadBlockVerified reads the object at the CAS key derived from h// and verifies its bytes against h end-to-end before returning them// to the caller.//// Semantics:// - HEAD-style pre-check: if the backend exposes the content-hash// header cheaply (S3 GetObject returns it in the same response,// so no extra round-trip is needed), reject early with// ErrCASContentMismatch when the header does not match h.// - Streaming verification: the body is fed to a blake3.Hasher as// the caller reads it. On EOF, hasher.Sum(nil) MUST equal h or// the call returns ErrCASContentMismatch and the buffer is// discarded -- corrupt bytes MUST NOT be surfaced upstream.// - The streaming verifier sees bytes once (zero extra allocation).// - Verification is hard-required (INV-06): there is no opt-out// knob.ReadBlockVerified(ctx context.Context, blockKey string, expected ContentHash) ([]byte, error)Header pre-check + streaming recompute is “fail-closed twice” by design: the header alone is not sufficient (would trust the backend to never silently corrupt); recompute alone wastes a body read on a definitively-wrong object.
Conformance Tests
Section titled “Conformance Tests”Test your remote store with the conformance suite:
package myremote_test
import ( "testing" "github.com/marmos91/dittofs/pkg/blockstore/remote/remotetest")
func TestMyRemoteStore(t *testing.T) { store, cleanup := createTestStore(t) defer cleanup() remotetest.RunRemoteStoreTests(t, store)}The v0.15.0 conformance suite extends remotetest with scenarios for
WriteBlockWithHash (header is set; key shape matches cas/...;
re-PUT is idempotent) and ReadBlockVerified (round-trip succeeds;
header-mismatch returns ErrCASContentMismatch; body-mismatch returns
ErrCASContentMismatch; corrupt bytes never surface upstream).
Best Practices
Section titled “Best Practices”Thread Safety
Section titled “Thread Safety”All store implementations must be thread-safe. Multiple goroutines will access the store concurrently.
Context Handling
Section titled “Context Handling”Always respect context cancellation, especially for remote stores where network calls can be slow:
func (s *MyStore) ReadBlock(ctx context.Context, blockID string) ([]byte, error) { if err := ctx.Err(); err != nil { return nil, err } // Proceed with operation}Error Handling
Section titled “Error Handling”- Local store errors should be wrapped with meaningful context
- Remote store errors should distinguish transient (retry-able) from permanent failures
- Delete operations should be idempotent (deleting a non-existent block is not an error)
Performance
Section titled “Performance”- Local stores: Minimize syscalls, use buffered I/O, consider memory-mapped files
- Remote stores: Use connection pooling, implement retry with backoff, batch operations where possible
Testing Your Implementation
Section titled “Testing Your Implementation”- Conformance tests: Run the provided test suites (
localtest/remotetest) - Concurrency tests: Verify thread safety with parallel reads/writes
- Error handling tests: Test behavior with canceled contexts, network failures
- Integration tests: Test with the full DittoFS stack (create share, mount, read/write)
Common Pitfalls
Section titled “Common Pitfalls”- Not making Delete idempotent: Deleting a non-existent block should succeed
- Ignoring context cancellation: Long operations should check
ctx.Err()periodically - Unsafe concurrent access: Use proper synchronization for shared state
- Resource leaks: Ensure
Close()releases all resources (connections, goroutines, file handles)
Integration with DittoFS
Section titled “Integration with DittoFS”Register Your Store
Section titled “Register Your Store”Add your store type to the configuration system:
func createLocalStore(config LocalStoreConfig) (local.LocalStore, error) { switch config.Type { case "fs": return fs.New(config.Path) case "memory": return memory.New() case "mylocal": return mylocal.New(config.Path) default: return nil, fmt.Errorf("unknown local store type: %s", config.Type) }}CLI Integration
Section titled “CLI Integration”Users can then create your store via CLI:
./dfsctl store block add --kind local --name my-store --type mylocal \ --config '{"path":"/data/blocks"}'Additional Resources
Section titled “Additional Resources”- Interface Definitions:
pkg/blockstore/local/local.go,pkg/blockstore/remote/remote.go - Reference Implementations:
- Local:
pkg/blockstore/local/fs/,pkg/blockstore/local/memory/ - Remote:
pkg/blockstore/remote/s3/,pkg/blockstore/remote/memory/ - Metadata:
pkg/metadata/store/memory/,pkg/metadata/store/badger/,pkg/metadata/store/postgres/
- Local:
- Conformance Tests:
pkg/blockstore/local/localtest/,pkg/blockstore/remote/remotetest/,pkg/metadata/storetest/ - Architecture:
docs/ARCHITECTURE.md - Configuration:
docs/CONFIGURATION.md - Contributing:
docs/CONTRIBUTING.md