Skip to content

Implementing Stores

This guide provides comprehensive instructions for implementing custom metadata stores, local block stores, and remote block stores for DittoFS. Whether you’re building a database-backed metadata store or a custom cloud storage integration, this document will walk you through the process with best practices and practical examples.

  1. Overview
  2. When to Implement Custom Stores
  3. Understanding the Architecture
  4. Implementing Metadata Stores
  5. Implementing a Local Store
  6. Implementing a Remote Store
  7. Best Practices
  8. Testing Your Implementation
  9. Common Pitfalls
  10. Integration with DittoFS

DittoFS uses a two-tier block store architecture with three distinct store types:

  • Metadata Stores: Simple CRUD operations for file/directory structure, attributes, permissions
  • Local Block Stores: Fast, per-share storage on local disk or in memory (L2 cache tier)
  • Remote Block Stores: Durable storage in S3 or compatible object stores (L3 tier, shared across shares via ref counting)

Key Design Principle: Each share gets its own *engine.BlockStore instance that composes a local store, optional remote store, and syncer. The engine orchestrates reads and writes across the tiers. Local stores provide fast access; remote stores provide durability.

This separation enables:

  • Independent scaling of metadata and block storage
  • Per-share isolation with shared remote backends
  • Different storage tiers (hot/cold storage, SSD/HDD)
  • Simple store implementations (just implement the interface, the engine handles coordination)

Implement a custom metadata store when you need:

  • Database-backed storage: PostgreSQL, MySQL, MongoDB, Cassandra
  • Distributed metadata: Multi-node coordination, consensus protocols
  • Advanced features: Full-text search, custom indexing, complex queries
  • Compliance: Audit logs, versioning, immutability guarantees

Example: A PostgreSQL-backed metadata store for enterprise environments requiring audit trails and high availability.

Implement a custom local store when you need:

  • Specialized local storage: NVMe-optimized, hardware-accelerated compression
  • Custom eviction: Access-pattern-aware eviction beyond simple LRU
  • Encryption at rest: Hardware-accelerated encryption for local blocks

Reference implementation: pkg/blockstore/local/fs/ (filesystem-backed local store)

Implement a custom remote store when you need:

  • Cloud storage integration: Azure Blob, Google Cloud Storage, custom object stores
  • Specialized storage: Tape archives, HSM systems, data lakes
  • Tiering: Automatic hot/cold data movement based on access patterns

Reference implementation: pkg/blockstore/remote/s3/ (S3-backed remote store)

Each share gets its own *engine.BlockStore instance:

┌─────────────────────────────────────┐
│ engine.BlockStore (per-share) │
│ │
│ ┌─────────────┐ ┌─────────────┐ │
│ │ LocalStore │ │ RemoteStore │ │
│ │ (required) │ │ (optional) │ │
│ └──────┬──────┘ └──────┬──────┘ │
│ │ │ │
│ └───────┬────────┘ │
│ │ │
│ ┌──────▼──────┐ │
│ │ Syncer │ │
│ │ (async xfer)│ │
│ └─────────────┘ │
└─────────────────────────────────────┘
  • LocalStore is required — all reads and writes go through local storage first
  • RemoteStore is optional — when configured, the Syncer asynchronously uploads local blocks to remote storage
  • Ref counting: Remote stores are shared across shares; when the last share using a remote store is removed, the connection is closed

Protocol handlers resolve the per-share block store via GetBlockStoreForHandle(ctx, handle):

  1. File handle encodes the share name
  2. Runtime extracts share name and returns the share’s BlockStore
  3. Handler calls ReadAt / WriteAt on the BlockStore

The metadata store interface and implementation guide remains the same as before. See the pkg/metadata/Store interface and reference implementations:

  • pkg/metadata/store/memory/: In-memory (fast, ephemeral)
  • pkg/metadata/store/badger/: BadgerDB (persistent, embedded)
  • pkg/metadata/store/postgres/: PostgreSQL (persistent, distributed)

Conformance tests: pkg/metadata/storetest/

MetadataStore.EnumerateFileBlocks (v0.15.0 Phase 11; lifted from FileBlockStore in Phase 12)

Section titled “MetadataStore.EnumerateFileBlocks (v0.15.0 Phase 11; lifted from FileBlockStore in Phase 12)”

v0.15.0 (Phase 11 / A2) added a mandatory cursor method that the mark-sweep garbage collector uses to enumerate every live block hash without loading the full file/block set into application memory.

Phase 12 / META-03 / D-08 note: EnumerateFileBlocks now lives on MetadataStore, not FileBlockStore. Conceptually it iterates across files for the GC mark phase — a metadata-store-wide concern — so it moved up the stack when FileBlockStore was narrowed to its 6-method spec surface (see FileBlockStore narrowing (v0.15.0 Phase 12) below). Backends that already implemented it on the FileBlockStore in Phase 11 need only re-attach the same code to the MetadataStore interface.

// EnumerateFileBlocks streams every FileBlock's ContentHash to fn.
// Implementations MUST:
// - Iterate using a backend-native cursor (Badger prefix iterator,
// Postgres server-side cursor with batched fetch, in-memory map
// iteration) -- no full-set load.
// - Honor ctx.Done(): return ctx.Err() promptly when the context is
// cancelled.
// - Emit zero-hash FileBlocks (legacy pre-Phase-11 data) the same way
// as non-zero-hash blocks; the GC live-set ignores zero hashes.
// - Abort iteration and return the fn error verbatim if fn returns
// non-nil; do NOT swallow it.
// - Be safe under concurrent writes: it is acceptable for the cursor
// to miss FileBlocks created mid-iteration; the next mark cycle
// will pick them up.
EnumerateFileBlocks(ctx context.Context, fn func(ContentHash) error) error

Conformance scenarios live in pkg/metadata/storetest/ and every backend MUST pass them:

  1. Empty store: fn is never invoked; returns nil.
  2. Single file: fn is invoked once per FileBlock for the file.
  3. Large fanout (N files × M blocks): fn is invoked exactly N*M times in any order; no duplicates, no omissions.
  4. fn-error mid-iteration: returning a non-nil error from fn aborts iteration and propagates the error.
  5. Context cancellation: cancelling ctx mid-iteration causes the call to return ctx.Err() within the polling interval.

Memory-store reference: direct range over the in-memory map. Badger-store reference: txn.NewIterator over the FileBlock prefix. Postgres-store reference: server-side cursor (DECLARE + FETCH) with batches of 1000 rows.

FileBlockStore narrowing (v0.15.0 Phase 12)

Section titled “FileBlockStore narrowing (v0.15.0 Phase 12)”

v0.15.0 (Phase 12 / A3) narrows pkg/blockstore.FileBlockStore to exactly 6 hash-keyed CRUD methods per the META-03 spec (D-09). Backend implementations are simpler; engine-internal helpers move to a separate wider interface.

// pkg/blockstore/store.go (Phase 12 spec-literal surface)
type FileBlockStore interface {
// GetByHash looks up the FileBlock with the given content hash.
// Returns ErrFileBlockNotFound when no row matches.
GetByHash(ctx context.Context, hash ContentHash) (*FileBlock, error)
// Put inserts or updates a FileBlock keyed by hash.
Put(ctx context.Context, fb *FileBlock) error
// Delete removes the FileBlock with the given hash. Idempotent —
// deleting a non-existent hash is not an error.
Delete(ctx context.Context, hash ContentHash) error
// IncrementRefCount atomically bumps the FileBlock's RefCount by 1.
IncrementRefCount(ctx context.Context, hash ContentHash) error
// DecrementRefCount atomically decrements RefCount; returns the new
// value. Callers MUST NOT decrement below zero.
DecrementRefCount(ctx context.Context, hash ContentHash) (int64, error)
// ListPending streams every Pending FileBlock to fn. Used by syncer.
ListPending(ctx context.Context, fn func(*FileBlock) error) error
}

Engine-internal companion interface: pkg/blockstore.EngineFileBlockStore extends FileBlockStore with GetFileBlock(ctx, id) and ListFileBlocks(ctx, fn) for the engine’s hot paths. All three built-in backends (memory, badger, postgres) satisfy it without changes — the narrow public surface is a documentation concern, not a runtime restriction. Custom backends implementing FileBlockStore SHOULD also implement the engine-internal helpers if they intend to slot into the *engine.BlockStore.

Internal storage shape is up to the backend. The Phase 11 schema (id VARCHAR PRIMARY KEY + hash non-unique index) stays in place to honor the WR-4-01 multiple-rows-per-hash contract for legacy data; the public GetByHash surface hides that detail. Phase 15 will collapse to hash-PK once Phase 14 migrates legacy data.

FileAttr.Blocks []BlockRef (v0.15.0 Phase 12)

Section titled “FileAttr.Blocks []BlockRef (v0.15.0 Phase 12)”

Phase 12 reintroduces FileAttr.Blocks []blockstore.BlockRef as the authoritative content list for every file (META-01 / META-04 / D-01..D-05). BlockRef is the 3-tuple of (Hash, Offset, Size) — see pkg/blockstore/types.go. The list MUST be sorted by Offset and is populated on every sync finalization.

Encoding requirements per backend:

  • Postgres: a separate file_block_refs join table keyed by (file_id, offset), with INCLUDE (size, hash) for index-only scans on the read hot path. Foreign key file_id REFERENCES files(id) ON DELETE CASCADE provides a safety net — the engine still decrements file_blocks.RefCount for every BlockRef BEFORE deleting the file; cascade catches engine-bug paths that miss the explicit decrement. Hash column is BYTEA (32 bytes), not hex TEXT. Migration: pkg/metadata/store/postgres/migrations/000012_file_block_refs.up.sql.
  • Badger and Memory: inline-encode Blocks []BlockRef inside the existing FileAttr blob. Badger goes through pkg/metadata/store/badger/encoding.go (gob); Memory holds typed structs directly. Use omitempty so legacy pre-Phase-12 blobs decode cleanly with an empty Blocks slice (D-06 dual-read trigger).

A new metadata-store method persists the list; in the built-in backends this is MetadataStore.SetFileBlocks(ctx, handle, []BlockRef, authCtx) error. Custom metadata backends MUST persist atomically with the same transaction that updates Size/Mtime/Ctime — the engine relies on caller-side metadata-txn isolation rather than a per-chunk metadata roundtrip (D-22 caller-snapshot semantics).

The pkg/metadata/storetest/ suite extends the Phase 11 tests with:

  1. BlockRef round-trip: SetFileBlocks followed by GetFileAttr returns the same offset-sorted slice, byte-for-byte.
  2. Empty / legacy compat: FileAttr blobs without a Blocks field decode to an empty slice without errors.
  3. FK cascade (Postgres-only): deleting a file removes all matching file_block_refs rows.
  4. INV-02 reconcile: ∑ FileBlock.RefCount over the FileBlockStore equals ∑ len(FileAttr.Blocks) over the MetadataStore at every quiescent point.
  5. INV-02 concurrent fuzz (pkg/metadata/storetest/inv02_fuzz_test.go): 100-iteration property-based fuzzer creating, deleting, and copying files concurrently; asserts the invariant after each operation batch. Runs against all 3 built-in backends and any custom backend wired into the conformance harness.

FileAttr.ObjectID + FindByObjectID (v0.15.0 Phase 13)

Section titled “FileAttr.ObjectID + FindByObjectID (v0.15.0 Phase 13)”

Phase 13 adds FileAttr.ObjectID — a BLAKE3 Merkle root over BlockRef.Hash values sorted by Offset, prefixed by the domain-separation tag dittofs:objectid:v1\x00. Computed by blockstore.ComputeObjectID and persisted at every full quiesce in the same metadata transaction that updates Blocks/Size/Mtime (META-02 / BSCAS-04 / D-05..D-07).

Lifecycle: cleared (zeroed) on first dirty write that mutates Blocks, recomputed at next full quiesce (every block transitioned to Remote). Partial flushes leave ObjectID at zero so the lookup index never returns a half-quiesced file.

FindByObjectID(ctx, ObjectID) ([]BlockRef, error)

Section titled “FindByObjectID(ctx, ObjectID) ([]BlockRef, error)”

The Phase 13 BSCAS-05 short-circuit primitive. Looks up a file by its Merkle-root ObjectID. Returns (nil, nil) on miss; non-nil result carries the canonical BlockRef list of the matching file (per-metadata- store scope, NOT per-share — D-13).

Backends MUST maintain a secondary index from ObjectID to file row:

BackendIndex
PostgresPartial unique: files_object_id_idx ON files(object_id) WHERE object_id IS NOT NULL (migration 000013_object_id.up.sql)
BadgerSecondary key obj:{hex} -> file_id, maintained inside each Put/Delete write batch
Memorymap[ContentHash]uuid, guarded by the existing store mutex

Zero-valued ObjectID (legacy / pre-quiesce) MUST NOT match any row — implementations short-circuit and return (nil, nil) on zero input.

The unique constraint enforces concurrent-quiesce conflict (D-14 first-committer-wins). On race, the loser surfaces a backend-specific unique-violation error that the runtime coordinator wraps into the shared metadata.ErrConflict sentinel.

A test-only optional capability ObjectIDIndexAccessor.CountObjectIDIndexRows is exercised by the storetest ConcurrentQuiesceRace scenario; backends implement it inline (e.g., SELECT count(*) for Postgres, txn.Get(keyObjectID(oid))-shape for Badger, direct map probe for Memory). Production code MUST NOT call it.

Conformance scenarios live in pkg/metadata/storetest/objectid_roundtrip.go and pkg/metadata/storetest/objectid_lookup.go. All built-in backends pass without per-backend t.Skip (the ObjectIDIndexAccessor capability is the only legitimate type-assertion-skip; backends without that accessor are still required to pass the functional scenarios).

Metadata backends MUST persist a block_layout field on the share record. The field is a metadata.BlockLayout enum (string-shaped on the wire) with values legacy or cas-only. Plan 14-01 introduced the field as part of the v0.15.0 milestone (Phase 14 / A5; #425).

pkg/metadata/types.go
type BlockLayout uint8
const (
BlockLayoutLegacy BlockLayout = iota // dual-read: legacy + CAS coexist
BlockLayoutCASOnly // CAS-only: legacy reads fail loud
)
type ShareOptions struct {
// ... pre-existing fields ...
BlockLayout BlockLayout
}

Forward-compat rule: Empty / missing values MUST coerce to BlockLayoutLegacy on read so pre-Phase-14 metadata rows decode cleanly. Use the metadata.ParseBlockLayout("") helper, which returns BlockLayoutLegacy. Unknown values (anything other than legacy / cas-only) MUST surface metadata.ErrInvalidBlockLayout rather than coercing — silent coercion would mask bugs in upstream backends.

Round-trip invariant: UpdateShareOptions(BlockLayout=cas-only) followed by GetShare(name) MUST observe cas-only. The migration tool’s cutover (D-A7) depends on this txn being durable AND visible to the engine’s next share-open.

The conformance suite scenario RunBlockLayoutSuite exercises round-trip and update semantics. New backends MUST invoke it from their per-backend test file:

import "github.com/marmos91/dittofs/pkg/metadata/storetest"
func TestBlockLayoutConformance(t *testing.T) {
storetest.RunBlockLayoutSuite(t, factoryFunc)
}

The suite asserts:

  1. Default is legacy for newly created shares (pre-existing shares upgraded into v0.15+ default to legacy).
  2. Round-trip of UpdateShareOptions(BlockLayout=cas-only)GetShare() returns cas-only.
  3. Empty-string coercion at the parsing boundary (ParseBlockLayout("") == BlockLayoutLegacy).
  4. Unknown-value rejection via ErrInvalidBlockLayout.
  5. Atomic update semantics — concurrent updates of the same share’s BlockLayout serialize through the backend’s existing share-update path (D-A7 piggybacks on this).

Recommended persistence shape per backend:

BackendRecommended layout
PostgresDedicated block_layout TEXT NOT NULL DEFAULT 'legacy' column on shares (migration 000014_block_layout.up.sql is the reference). Authoritative over the legacy options JSON blob.
BadgerInline-encoded inside the existing ShareOptions blob (gob; omitempty on the new field for forward-compat with pre-Phase-14 rows).
MemoryDirect field on the in-process struct; no persistence layer.

Cross-references:

Engine API surface (Phase 12 / API-01..04)

Section titled “Engine API surface (Phase 12 / API-01..04)”

Custom block-store implementations that compose into *engine.BlockStore do not see the engine API directly — that surface is consumed by adapters via internal/adapter/common/. For reference, the Phase 12 signatures are:

ReadAt(ctx, payloadID, blocks []BlockRef, dest []byte, offset uint64) (int, error)
WriteAt(ctx, payloadID, currentBlocks []BlockRef, data []byte, offset uint64) ([]BlockRef, error)
Truncate(ctx, payloadID, currentBlocks []BlockRef, newSize uint64) ([]BlockRef, error)
Delete(ctx, payloadID, blocks []BlockRef) error
CopyPayload(ctx, srcPayloadID, srcBlocks []BlockRef, dstPayloadID) ([]BlockRef, error)

nil or empty blocks triggers the Phase 11 dual-read shim (D-20) for legacy files that have no populated FileAttr.Blocks yet. Non-empty blocks is the CAS path with end-to-end BLAKE3 verification (INV-06).

Local stores provide fast, per-share block storage. Each share gets an isolated local storage directory.

The pkg/blockstore/local.LocalStore interface defines the contract:

type LocalStore interface {
// ReadAt reads block data at the given offset
ReadAt(ctx context.Context, blockID string, p []byte, offset int64) (int, error)
// WriteAt writes block data at the given offset
WriteAt(ctx context.Context, blockID string, data []byte, offset int64) (int, error)
// Delete removes a block from local storage
Delete(ctx context.Context, blockID string) error
// Exists checks if a block exists locally
Exists(ctx context.Context, blockID string) (bool, error)
// Flush ensures all pending writes are persisted
Flush(ctx context.Context, blockID string) error
// List returns all block IDs in local storage
List(ctx context.Context) ([]string, error)
// Close releases resources
Close() error
}
package mylocal
import (
"context"
)
type MyLocalStore struct {
basePath string
// Your backend (filesystem, NVMe, etc.)
}
func New(basePath string) (*MyLocalStore, error) {
// Initialize your storage backend
return &MyLocalStore{basePath: basePath}, nil
}
func (s *MyLocalStore) ReadAt(ctx context.Context, blockID string, p []byte, offset int64) (int, error) {
// Read block data from your backend
// Return bytes read and any error
}
func (s *MyLocalStore) WriteAt(ctx context.Context, blockID string, data []byte, offset int64) (int, error) {
// Write block data to your backend
// Return bytes written and any error
}
// ... implement remaining interface methods

See pkg/blockstore/local/fs/ for a complete filesystem-backed local store implementation that handles:

  • Per-share isolated directories
  • Atomic writes
  • Block listing for sync operations

Phase 10 additions (flag-gated, experimental)

Section titled “Phase 10 additions (flag-gated, experimental)”

v0.15.0 Phase 10 adds a hybrid append-log + content-addressed (CAS) chunk tier inside *fs.FSStore, gated by the use_append_log config flag (defaults to false; see docs/CONFIGURATION.md). New methods on *fs.FSStore (NOT yet on the LocalStore interface): AppendWrite, StoreChunk, ReadChunk, HasChunk, DeleteChunk, DeleteAppendLog, TruncateAppendLog, StartRollup. The LocalStore interface is deliberately unchanged in v0.15.0 Phase 10 — LSL-07 narrows it in Phase 11 (A2). Existing LocalStore implementations in v0.15.0 Phase 10 do NOT need to change to stay compatible.

A new per-file metadata surface metadata.RollupStore (two methods: SetRollupOffset, GetRollupOffset) is required only when use_append_log=true. The built-in memory, Badger, and Postgres backends all implement it. New metadata backends targeting the hybrid tier must add equivalent persistence keyed by payloadID, backed by an atomic upsert (metadata is source of truth for the log’s rollup_offset; see docs/ARCHITECTURE.md INV-03). Backends that stay on the legacy write path need not implement RollupStore.

Experimental: Do not enable use_append_log in production before v0.15.0 Phase 11 lands — the blocks/ directory grows unbounded without Phase 11’s mark-sweep GC.

Test your local store with the conformance suite:

package mylocal_test
import (
"testing"
"github.com/marmos91/dittofs/pkg/blockstore/local/localtest"
)
func TestMyLocalStore(t *testing.T) {
store, cleanup := createTestStore(t)
defer cleanup()
localtest.RunLocalStoreTests(t, store)
}

Remote stores provide durable block storage shared across shares via ref counting.

The pkg/blockstore/remote.RemoteStore interface defines the contract:

type RemoteStore interface {
// ReadBlock reads an entire block from remote storage
ReadBlock(ctx context.Context, blockID string) ([]byte, error)
// WriteBlock writes an entire block to remote storage
WriteBlock(ctx context.Context, blockID string, data []byte) error
// DeleteBlock removes a block from remote storage
DeleteBlock(ctx context.Context, blockID string) error
// HealthCheck verifies the remote store is accessible
HealthCheck(ctx context.Context) error
// Close releases resources
Close() error
}
package myremote
import (
"context"
)
type MyRemoteStore struct {
client *MyCloudClient
bucket string
}
func New(config Config) (*MyRemoteStore, error) {
client, err := connectToCloud(config)
if err != nil {
return nil, err
}
return &MyRemoteStore{client: client, bucket: config.Bucket}, nil
}
func (s *MyRemoteStore) ReadBlock(ctx context.Context, blockID string) ([]byte, error) {
// Fetch block from cloud storage
return s.client.GetObject(ctx, s.bucket, blockID)
}
func (s *MyRemoteStore) WriteBlock(ctx context.Context, blockID string, data []byte) error {
// Upload block to cloud storage
return s.client.PutObject(ctx, s.bucket, blockID, data)
}
func (s *MyRemoteStore) DeleteBlock(ctx context.Context, blockID string) error {
// Remove block from cloud storage (idempotent)
err := s.client.DeleteObject(ctx, s.bucket, blockID)
if err != nil && !isNotFoundError(err) {
return err
}
return nil
}
func (s *MyRemoteStore) HealthCheck(ctx context.Context) error {
// Verify connectivity (e.g., HEAD bucket)
return s.client.HeadBucket(ctx, s.bucket)
}
func (s *MyRemoteStore) Close() error {
return s.client.Close()
}

Remote stores are shared across shares via ref counting:

  • When a share is created referencing a remote store, the ref count increments
  • When a share is removed, the ref count decrements
  • When the ref count reaches zero, Close() is called

This means your Close() implementation should release all resources (connections, goroutines, etc.).

See pkg/blockstore/remote/s3/ for a production S3 remote store implementation with:

  • Configurable retry with exponential backoff
  • Health check via HEAD bucket
  • Efficient multipart uploads for large blocks

v0.15.0 (Phase 11 / A2) routes all new uploads through a content-addressable keyspace cas/{hh}/{hh}/{hex} and verifies every byte downloaded from the remote against the expected BLAKE3 hash. Two new contract methods are required for any RemoteStore implementation that wants to participate in the v0.15.0 write path; backends remaining on the legacy {payloadID}/block-{N} keyspace continue to work via the dual-read shim until Phase 14 (A5).

// WriteBlockWithHash uploads data under the CAS key derived from h
// and sets a backend-native object-metadata header carrying the hash.
//
// Semantics:
// - The key MUST be derived from h via FormatCASKey (cas/{hh}/{hh}/{hex}).
// - The backend-native object metadata MUST set "content-hash" to
// "blake3:" + hex(h). For S3, this becomes the user-metadata header
// x-amz-meta-content-hash. For other backends, set the equivalent
// custom-metadata field.
// - The PUT MUST be atomic: either the object exists at the CAS key
// with the correct bytes AND the metadata header, or it does not
// exist at all.
// - The call MUST be idempotent: re-uploading the same h with the
// same bytes is a no-op (or an overwrite that yields identical
// state). This is what makes the syncer's restart-recovery janitor
// safe.
// - Errors are returned as typed values mapped through
// internal/adapter/common/.
WriteBlockWithHash(ctx context.Context, blockKey string, hash ContentHash, data []byte) error

External tooling (e.g. aws s3api head-object) MUST be able to verify the header without DittoFS metadata — this is the BSCAS-06 external verifier criterion.

// ReadBlockVerified reads the object at the CAS key derived from h
// and verifies its bytes against h end-to-end before returning them
// to the caller.
//
// Semantics:
// - HEAD-style pre-check: if the backend exposes the content-hash
// header cheaply (S3 GetObject returns it in the same response,
// so no extra round-trip is needed), reject early with
// ErrCASContentMismatch when the header does not match h.
// - Streaming verification: the body is fed to a blake3.Hasher as
// the caller reads it. On EOF, hasher.Sum(nil) MUST equal h or
// the call returns ErrCASContentMismatch and the buffer is
// discarded -- corrupt bytes MUST NOT be surfaced upstream.
// - The streaming verifier sees bytes once (zero extra allocation).
// - Verification is hard-required (INV-06): there is no opt-out
// knob.
ReadBlockVerified(ctx context.Context, blockKey string, expected ContentHash) ([]byte, error)

Header pre-check + streaming recompute is “fail-closed twice” by design: the header alone is not sufficient (would trust the backend to never silently corrupt); recompute alone wastes a body read on a definitively-wrong object.

Test your remote store with the conformance suite:

package myremote_test
import (
"testing"
"github.com/marmos91/dittofs/pkg/blockstore/remote/remotetest"
)
func TestMyRemoteStore(t *testing.T) {
store, cleanup := createTestStore(t)
defer cleanup()
remotetest.RunRemoteStoreTests(t, store)
}

The v0.15.0 conformance suite extends remotetest with scenarios for WriteBlockWithHash (header is set; key shape matches cas/...; re-PUT is idempotent) and ReadBlockVerified (round-trip succeeds; header-mismatch returns ErrCASContentMismatch; body-mismatch returns ErrCASContentMismatch; corrupt bytes never surface upstream).

All store implementations must be thread-safe. Multiple goroutines will access the store concurrently.

Always respect context cancellation, especially for remote stores where network calls can be slow:

func (s *MyStore) ReadBlock(ctx context.Context, blockID string) ([]byte, error) {
if err := ctx.Err(); err != nil {
return nil, err
}
// Proceed with operation
}
  • Local store errors should be wrapped with meaningful context
  • Remote store errors should distinguish transient (retry-able) from permanent failures
  • Delete operations should be idempotent (deleting a non-existent block is not an error)
  • Local stores: Minimize syscalls, use buffered I/O, consider memory-mapped files
  • Remote stores: Use connection pooling, implement retry with backoff, batch operations where possible
  1. Conformance tests: Run the provided test suites (localtest/remotetest)
  2. Concurrency tests: Verify thread safety with parallel reads/writes
  3. Error handling tests: Test behavior with canceled contexts, network failures
  4. Integration tests: Test with the full DittoFS stack (create share, mount, read/write)
  1. Not making Delete idempotent: Deleting a non-existent block should succeed
  2. Ignoring context cancellation: Long operations should check ctx.Err() periodically
  3. Unsafe concurrent access: Use proper synchronization for shared state
  4. Resource leaks: Ensure Close() releases all resources (connections, goroutines, file handles)

Add your store type to the configuration system:

pkg/config/stores.go
func createLocalStore(config LocalStoreConfig) (local.LocalStore, error) {
switch config.Type {
case "fs":
return fs.New(config.Path)
case "memory":
return memory.New()
case "mylocal":
return mylocal.New(config.Path)
default:
return nil, fmt.Errorf("unknown local store type: %s", config.Type)
}
}

Users can then create your store via CLI:

Terminal window
./dfsctl store block add --kind local --name my-store --type mylocal \
--config '{"path":"/data/blocks"}'
  • Interface Definitions: pkg/blockstore/local/local.go, pkg/blockstore/remote/remote.go
  • Reference Implementations:
    • Local: pkg/blockstore/local/fs/, pkg/blockstore/local/memory/
    • Remote: pkg/blockstore/remote/s3/, pkg/blockstore/remote/memory/
    • Metadata: pkg/metadata/store/memory/, pkg/metadata/store/badger/, pkg/metadata/store/postgres/
  • Conformance Tests: pkg/blockstore/local/localtest/, pkg/blockstore/remote/remotetest/, pkg/metadata/storetest/
  • Architecture: docs/ARCHITECTURE.md
  • Configuration: docs/CONFIGURATION.md
  • Contributing: docs/CONTRIBUTING.md