
NFS

This document covers the NFS protocol fundamentals, DittoFS’s implementation of NFSv3, NFSv4.0, and NFSv4.1, and practical usage for clients and developers.


NFS (Network File System) is a distributed file system protocol originally developed by Sun Microsystems in 1984. It allows a client to access files over a network as if they were on local storage.

| Version | Year | Key Features |
|---------|------|--------------|
| NFSv2 | 1989 | Original version, 32-bit file sizes, UDP only |
| NFSv3 | 1995 | 64-bit file sizes, TCP support, async writes, WCC |
| NFSv4 | 2000 | Stateful, ACLs, compound operations, no mount protocol |
| NFSv4.1 | 2010 | Parallel NFS (pNFS), sessions, backchannel |
| NFSv4.2 | 2016 | Server-side copy, sparse files |

DittoFS implements NFSv3, NFSv4.0, and NFSv4.1 — covering the stateless simplicity of v3 through the stateful, session-based model of v4.1 with delegations, ACLs, CB_NOTIFY, and Kerberos via RPCSEC_GSS.

NFS uses a layered architecture with multiple supporting protocols:

+------------------------------------------------------------+
|                      NFS Application                       |
|            (file operations: read, write, etc.)            |
+------------------------------------------------------------+
|              NFSv3 Protocol (Program 100003)               |
|          22 procedures for file system operations          |
+------------------------------------------------------------+
|              Mount Protocol (Program 100005)               |
|       6 procedures for mounting exported directories       |
+------------------------------------------------------------+
|                RPC (Remote Procedure Call)                 |
|              Message framing, authentication               |
+------------------------------------------------------------+
|             XDR (External Data Representation)             |
|                  Binary encoding/decoding                  |
+------------------------------------------------------------+
|                           TCP/IP                           |
|                      Transport layer                       |
+------------------------------------------------------------+

Request/Response Flow:

+--------------+                         +--------------+
|    Client    |                         |    Server    |
+--------------+                         +--------------+
       |                                        |
       |  1. TCP Connection (port 12049)        |
       | -------------------------------------> |
       |                                        |
       |  2. MOUNT /export -------------------> |
       |     [RPC Call: Program 100005, v3]     |
       |                                        |
       | <------------------ Root file handle   |
       |     [RPC Reply: OK + handle + auth]    |
       |                                        |
       |  3. LOOKUP "file.txt" ---------------> |
       |     [NFS Call: handle + name]          |
       |                                        |
       | <------------------ File handle        |
       |     [NFS Reply: OK + handle + attrs]   |
       |                                        |
       |  4. READ (offset=0, count=4096) -----> |
       |     [NFS Call: handle + offset + len]  |
       |                                        |
       | <------------------ Data + EOF flag    |
       |     [NFS Reply: OK + attrs + data]     |
       |                                        |
       |  5. UMOUNT /export ------------------> |
       |                                        |
       +----------------------------------------+

Key Design Principles:

  • Stateless Operations (v3): Each request contains all information needed to process it. The server can restart without affecting client state.
  • Stateful Sessions (v4/v4.1): Sessions track client state, enabling delegations and lock management.
  • Idempotent Procedures: Most operations can be safely retried if a response is lost.
  • Weak Cache Consistency (WCC): Responses include pre-operation and post-operation attributes so clients can detect concurrent modifications.

NFS uses ONC RPC (Open Network Computing Remote Procedure Call), defined in RFC 5531. RPC provides message framing over TCP, procedure identification (program, version, procedure numbers), an authentication framework, and request/reply matching via transaction IDs (XIDs).

RPC Message Structure:

+------------------------------------------------------------+
|                 RPC Record Fragment Header                 |
|                         (4 bytes)                          |
+------------------------------------------------------------+
|                   RPC Call/Reply Header                    |
|                         (variable)                         |
+------------------------------------------------------------+
|                    Procedure Arguments                     |
|                   or Results (variable)                    |
+------------------------------------------------------------+

Fragment Header: TCP connections use record marking to frame RPC messages. Bit 31 is the last-fragment flag (1 = last, 0 = more fragments follow). Bits 0-30 encode the fragment length in bytes. For example, 0x80000064 means last fragment, length 100 bytes.
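The bit layout described above can be sketched in Go. This is a minimal illustration, not DittoFS code; the function names `decodeFragmentHeader` and `encodeFragmentHeader` are made up for the example.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// lastFragmentBit is bit 31 of the record-marking header.
const lastFragmentBit = 0x80000000

// decodeFragmentHeader splits a 4-byte record-marking header into the
// last-fragment flag (bit 31) and the fragment length (bits 0-30).
func decodeFragmentHeader(hdr [4]byte) (last bool, length uint32) {
	v := binary.BigEndian.Uint32(hdr[:])
	return v&lastFragmentBit != 0, v &^ lastFragmentBit
}

// encodeFragmentHeader builds the header for a fragment of the given length.
func encodeFragmentHeader(last bool, length uint32) [4]byte {
	v := length &^ lastFragmentBit
	if last {
		v |= lastFragmentBit
	}
	var hdr [4]byte
	binary.BigEndian.PutUint32(hdr[:], v)
	return hdr
}

func main() {
	last, n := decodeFragmentHeader([4]byte{0x80, 0x00, 0x00, 0x64})
	fmt.Println(last, n) // true 100
}
```

A reader loop would read these 4 bytes, then read exactly `length` bytes, and keep appending fragments until one arrives with the last-fragment flag set.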

RPC Call Header Fields:

| Offset | Field |
|--------|-------|
| 0-3 | XID (Transaction ID, echoed in reply) |
| 4-7 | Message Type (0 = CALL, 1 = REPLY) |
| 8-11 | RPC Version (must be 2) |
| 12-15 | Program Number (100003 = NFS, 100005 = Mount) |
| 16-19 | Program Version (3 for NFSv3/Mount v3) |
| 20-23 | Procedure Number (0-21 for NFSv3) |
| 24+ | Credentials (OpaqueAuth structure) |
| variable | Verifier (OpaqueAuth structure) |
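
As an illustration of the fixed part of this layout, the following sketch decodes bytes 0-23 of a call. The `callHeader` type and both functions are hypothetical names chosen for the example and are not taken from rpc/parser.go.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

// callHeader holds the six fixed 4-byte fields that open every RPC call.
type callHeader struct {
	XID, MsgType, RPCVersion, Program, Version, Procedure uint32
}

// parseCallHeader decodes bytes 0-23 of an RPC call (fragment header already
// stripped) and rejects anything that is not an RPC version 2 CALL.
func parseCallHeader(b []byte) (callHeader, error) {
	if len(b) < 24 {
		return callHeader{}, errors.New("rpc: call header shorter than 24 bytes")
	}
	field := func(i int) uint32 { return binary.BigEndian.Uint32(b[i*4:]) }
	h := callHeader{
		XID: field(0), MsgType: field(1), RPCVersion: field(2),
		Program: field(3), Version: field(4), Procedure: field(5),
	}
	switch {
	case h.MsgType != 0:
		return h, errors.New("rpc: message type is not CALL")
	case h.RPCVersion != 2:
		return h, errors.New("rpc: unsupported RPC version")
	}
	return h, nil
}

// sampleCall builds the fixed header of an NFSv3 GETATTR call for the demo.
func sampleCall() []byte {
	b := make([]byte, 24)
	for i, v := range []uint32{0xCAFE, 0, 2, 100003, 3, 1} {
		binary.BigEndian.PutUint32(b[i*4:], v)
	}
	return b
}

func main() {
	h, err := parseCallHeader(sampleCall())
	fmt.Println(h.Program, h.Version, h.Procedure, err) // 100003 3 1 <nil>
}
```

The credentials and verifier that follow at offset 24 are variable-length, so a real parser continues with XDR decoding from that point.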

RPC Reply Accept States:

| Code | Name |
|------|------|
| 0 | SUCCESS |
| 1 | PROG_UNAVAIL |
| 2 | PROG_MISMATCH |
| 3 | PROC_UNAVAIL |
| 4 | GARBAGE_ARGS |
| 5 | SYSTEM_ERR |

XDR (External Data Representation), defined in RFC 4506, provides a canonical binary encoding for network transmission.

Key rules:

  1. Big-endian byte order for all integers
  2. 4-byte alignment for all data items
  3. Zero-padding to reach 4-byte boundaries

Basic Types:

| Type | Size | Description |
|------|------|-------------|
| int | 4 bytes | Signed 32-bit integer |
| unsigned int | 4 bytes | Unsigned 32-bit integer |
| hyper | 8 bytes | Signed 64-bit integer |
| unsigned hyper | 8 bytes | Unsigned 64-bit integer |
| bool | 4 bytes | Boolean (0 = false, 1 = true) |

Variable-Length Data (Opaque):

Encoded as a 4-byte length prefix, followed by the data bytes, followed by 0-3 padding bytes to reach a 4-byte boundary. Padding formula: (4 - (length % 4)) % 4.

Strings follow the same encoding: 4-byte length prefix, UTF-8 data, zero-padding.
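
The length-prefix and padding rules can be sketched as follows. This is an illustrative encoder/decoder pair, not the helpers in xdr/encode.go or xdr/decode.go; the names `xdrPad`, `encodeOpaque`, and `decodeOpaque` are assumptions for the example.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

// xdrPad returns the number of zero bytes needed to reach a 4-byte boundary.
func xdrPad(n int) int { return (4 - n%4) % 4 }

// encodeOpaque writes a 4-byte big-endian length, the data, then zero padding.
func encodeOpaque(data []byte) []byte {
	out := make([]byte, 4, 4+len(data)+xdrPad(len(data)))
	binary.BigEndian.PutUint32(out, uint32(len(data)))
	out = append(out, data...)
	return append(out, make([]byte, xdrPad(len(data)))...)
}

// decodeOpaque returns the data bytes and whatever follows the padding.
func decodeOpaque(b []byte) (data, rest []byte, err error) {
	if len(b) < 4 {
		return nil, nil, errors.New("xdr: missing length prefix")
	}
	n := uint64(binary.BigEndian.Uint32(b))
	padded := n + uint64(xdrPad(int(n%4)))
	if uint64(len(b)-4) < padded {
		return nil, nil, errors.New("xdr: truncated opaque data")
	}
	return b[4 : 4+n], b[4+padded:], nil
}

func main() {
	// "/export" is 7 bytes, so one padding byte is appended.
	fmt.Printf("% x\n", encodeOpaque([]byte("/export")))
	// 00 00 00 07 2f 65 78 70 6f 72 74 00
}
```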

Optional values use a boolean discriminator (4 bytes): if 1, the value follows; if 0, no value is present.

The Mount protocol (Program 100005, Version 3) is a companion protocol to NFSv3 used to obtain the initial file handle for an exported directory, list available exports, and track active mounts. NFSv4 does not use the mount protocol; it uses PUTROOTFH and LOOKUP compound operations instead.

Mount Procedures:

| Proc | Name | Purpose |
|------|------|---------|
| 0 | NULL | Connectivity test (no-op) |
| 1 | MNT | Mount an export, returns root file handle |
| 2 | DUMP | List active mounts |
| 3 | UMNT | Unmount an export |
| 4 | UMNTALL | Unmount all exports for this client |
| 5 | EXPORT | List available exports |

MNT Request: Contains a single string field (dirpath), e.g., "/export".

MNT Response (on success): Status (0 = OK), root file handle (opaque, up to 64 bytes), and a list of supported auth flavors.
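
To make the response layout concrete, here is a sketch of a decoder for the three parts listed above. The `mountReply` and `xdrReader` types are hypothetical and written for this example only; they do not reflect the handler code in mount/handlers/.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

var errTruncated = errors.New("mount: truncated or oversized reply")

// xdrReader consumes big-endian XDR items and remembers the first error.
type xdrReader struct {
	buf []byte
	err error
}

func (r *xdrReader) u32() uint32 {
	if r.err != nil || len(r.buf) < 4 {
		r.err = errTruncated
		return 0
	}
	v := binary.BigEndian.Uint32(r.buf)
	r.buf = r.buf[4:]
	return v
}

// opaque reads a length-prefixed, 4-byte-padded byte string of at most max bytes.
func (r *xdrReader) opaque(max uint32) []byte {
	n := r.u32()
	if r.err != nil || n > max {
		r.err = errTruncated
		return nil
	}
	padded := int(n) + (4-int(n)%4)%4
	if len(r.buf) < padded {
		r.err = errTruncated
		return nil
	}
	data := r.buf[:n]
	r.buf = r.buf[padded:]
	return data
}

type mountReply struct {
	Status      uint32
	Handle      []byte
	AuthFlavors []uint32
}

// decodeMountReply parses status, then (only on MNT_OK) handle and flavors.
func decodeMountReply(b []byte) (mountReply, error) {
	r := &xdrReader{buf: b}
	rep := mountReply{Status: r.u32()}
	if r.err != nil || rep.Status != 0 {
		return rep, r.err
	}
	rep.Handle = r.opaque(64)
	count := r.u32()
	for i := uint32(0); i < count && r.err == nil; i++ {
		rep.AuthFlavors = append(rep.AuthFlavors, r.u32())
	}
	return rep, r.err
}

// sampleMountReply builds MNT_OK + an 8-byte handle + flavors {1, 6}.
func sampleMountReply() []byte {
	words := []uint32{0, 8, 0x01020304, 0x05060708, 2, 1, 6}
	b := make([]byte, 4*len(words))
	for i, w := range words {
		binary.BigEndian.PutUint32(b[4*i:], w)
	}
	return b
}

func main() {
	rep, err := decodeMountReply(sampleMountReply())
	fmt.Println(rep.Status, len(rep.Handle), rep.AuthFlavors, err) // 0 8 [1 6] <nil>
}
```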

Mount Status Codes:

| Code | Name | Description |
|------|------|-------------|
| 0 | MNT_OK | Success |
| 1 | MNT_EPERM | Permission denied |
| 2 | MNT_ENOENT | Export path not found |
| 5 | MNT_EIO | I/O error |
| 13 | MNT_EACCES | Access denied |
| 20 | MNT_ENOTDIR | Not a directory |
| 22 | MNT_EINVAL | Invalid argument |
| 63 | MNT_ENAMETOOLONG | Path too long |
| 10004 | MNT_ENOTSUPP | Not supported |
| 10006 | MNT_ESERVERFAULT | Server error |

NFSv3 defines 22 procedures (0-21):

| Proc | Name | Description |
|------|------|-------------|
| 0 | NULL | No-op, connectivity test |
| 1 | GETATTR | Get file attributes |
| 2 | SETATTR | Set file attributes |
| 3 | LOOKUP | Look up file name in directory |
| 4 | ACCESS | Check access permissions |
| 5 | READLINK | Read symbolic link target |
| 6 | READ | Read file data |
| 7 | WRITE | Write file data |
| 8 | CREATE | Create regular file |
| 9 | MKDIR | Create directory |
| 10 | SYMLINK | Create symbolic link |
| 11 | MKNOD | Create special device |
| 12 | REMOVE | Delete file |
| 13 | RMDIR | Delete directory |
| 14 | RENAME | Rename file/directory |
| 15 | LINK | Create hard link |
| 16 | READDIR | Read directory entries |
| 17 | READDIRPLUS | Read directory entries with attributes |
| 18 | FSSTAT | Get file system statistics |
| 19 | FSINFO | Get file system info (max sizes, etc.) |
| 20 | PATHCONF | Get POSIX path configuration |
| 21 | COMMIT | Commit cached data to stable storage |

Write Stability Levels (for WRITE procedure):

| Level | Name | Description |
|-------|------|-------------|
| 0 | UNSTABLE | Data may be cached; requires COMMIT |
| 1 | DATA_SYNC | Data committed, metadata may be cached |
| 2 | FILE_SYNC | Both data and metadata committed |

WCC (Weak Cache Consistency): Mutating operations return pre-operation attributes (size, mtime, ctime) and post-operation attributes (full fattr3). Clients use WCC to detect stale caches, update attributes after operations, and detect concurrent modifications by other clients.
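
The client-side check can be sketched as below. This is a simplified model of how a client might use WCC data, with hypothetical types (`wccAttr`, `cacheEntry`); real clients track more state, and the post-operation side is a full fattr3 rather than the three fields shown here.

```go
package main

import "fmt"

// nfsTime mirrors the seconds + nanoseconds pair used by NFSv3.
type nfsTime struct{ Sec, Nsec uint32 }

// wccAttr holds the three pre-operation fields: size, mtime, ctime.
type wccAttr struct {
	Size         uint64
	Mtime, Ctime nfsTime
}

// cacheEntry is what a client remembers about one file.
type cacheEntry struct {
	attr      wccAttr
	dataValid bool // whether cached file data may still be trusted
}

// applyWCC compares the cached attributes with the pre-operation attributes
// from the reply. A mismatch means someone else changed the file between our
// last fetch and this operation, so cached data is dropped. In both cases the
// post-operation attributes become the new cached attributes.
func (c *cacheEntry) applyWCC(pre, post wccAttr) (concurrentChange bool) {
	if c.attr != pre {
		c.dataValid = false
		concurrentChange = true
	}
	c.attr = post
	return concurrentChange
}

func main() {
	cached := wccAttr{Size: 100, Mtime: nfsTime{Sec: 10}, Ctime: nfsTime{Sec: 10}}
	entry := cacheEntry{attr: cached, dataValid: true}

	// Our own WRITE: pre matches what we cached, so the cache stays valid.
	post := wccAttr{Size: 200, Mtime: nfsTime{Sec: 20}, Ctime: nfsTime{Sec: 20}}
	fmt.Println(entry.applyWCC(cached, post), entry.dataValid) // false true

	// Another client wrote in between: pre no longer matches.
	other := wccAttr{Size: 300, Mtime: nfsTime{Sec: 25}, Ctime: nfsTime{Sec: 25}}
	fmt.Println(entry.applyWCC(other, other), entry.dataValid) // true false
}
```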

File handles are opaque identifiers that uniquely identify files and directories:

  • Generated by the server
  • Opaque to clients (clients must not interpret them)
  • Persistent across server restarts for production stores
  • Maximum 64 bytes per RFC 1813

DittoFS encodes share and file information in handles. The format varies by metadata store:

  • Memory store: In-memory IDs (ephemeral)
  • BadgerDB: Path-based handles (persistent)
  • PostgreSQL: Share name + UUID (distributed)

When a handle becomes invalid (file deleted, server restarted with ephemeral storage), the server returns NFS3ERR_STALE. Clients should discard cached information and re-lookup the file.

NFS uses RPC authentication flavors:

| Flavor | Value | Description |
|--------|-------|-------------|
| AUTH_NULL | 0 | No authentication |
| AUTH_UNIX | 1 | Unix UID/GID credentials |
| AUTH_SHORT | 2 | Short-hand credential |
| RPCSEC_GSS | 6 | Kerberos/GSS-API (NFSv4) |

AUTH_UNIX format: Stamp (4 bytes), machine name (string), UID (4 bytes), GID (4 bytes), supplementary GIDs (array, max 16).
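
A sketch of how that credential body is laid out on the wire, following the field order above. The `authUnix` type and its `marshal` method are illustrative names, not the parser in rpc/auth.go; the 255-byte machine-name limit comes from the RPC specification.

```go
package main

import (
	"errors"
	"fmt"
)

// authUnix mirrors the AUTH_UNIX credential body fields.
type authUnix struct {
	Stamp   uint32
	Machine string
	UID     uint32
	GID     uint32
	GIDs    []uint32 // supplementary groups, at most 16
}

// appendU32 appends v in big-endian (XDR) byte order.
func appendU32(b []byte, v uint32) []byte {
	return append(b, byte(v>>24), byte(v>>16), byte(v>>8), byte(v))
}

// marshal encodes the credential body in XDR form.
func (a authUnix) marshal() ([]byte, error) {
	if len(a.GIDs) > 16 {
		return nil, errors.New("auth_unix: more than 16 supplementary GIDs")
	}
	if len(a.Machine) > 255 {
		return nil, errors.New("auth_unix: machine name longer than 255 bytes")
	}
	out := appendU32(nil, a.Stamp)
	out = appendU32(out, uint32(len(a.Machine)))
	out = append(out, a.Machine...)
	out = append(out, make([]byte, (4-len(a.Machine)%4)%4)...) // XDR padding
	out = appendU32(out, a.UID)
	out = appendU32(out, a.GID)
	out = appendU32(out, uint32(len(a.GIDs)))
	for _, g := range a.GIDs {
		out = appendU32(out, g)
	}
	return out, nil
}

func main() {
	cred := authUnix{Stamp: 1, Machine: "host", UID: 1000, GID: 1000, GIDs: []uint32{10, 20}}
	body, err := cred.marshal()
	fmt.Println(len(body), err) // 32 <nil>
}
```

Nothing in this structure is signed or encrypted, which is why a client can claim any UID it likes.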

Security note: AUTH_UNIX credentials are not cryptographically secured and can be spoofed. NFSv4 adds RPCSEC_GSS for Kerberos-based authentication. For production deployments, consider running on trusted networks, enabling Kerberos (NFSv4/v4.1), or using VPN/network-level encryption.

NFS Status Codes:

| Code | Name | Description |
|------|------|-------------|
| 0 | NFS3_OK | Success |
| 1 | NFS3ERR_PERM | Not owner |
| 2 | NFS3ERR_NOENT | No such file/directory |
| 5 | NFS3ERR_IO | I/O error |
| 13 | NFS3ERR_ACCES | Permission denied |
| 17 | NFS3ERR_EXIST | File exists |
| 20 | NFS3ERR_NOTDIR | Not a directory |
| 21 | NFS3ERR_ISDIR | Is a directory |
| 22 | NFS3ERR_INVAL | Invalid argument |
| 27 | NFS3ERR_FBIG | File too large |
| 28 | NFS3ERR_NOSPC | No space on device |
| 30 | NFS3ERR_ROFS | Read-only file system |
| 63 | NFS3ERR_NAMETOOLONG | Name too long |
| 66 | NFS3ERR_NOTEMPTY | Directory not empty |
| 70 | NFS3ERR_STALE | Stale file handle |
| 10001 | NFS3ERR_BADHANDLE | Invalid file handle |
| 10002 | NFS3ERR_NOT_SYNC | Update sync mismatch |
| 10004 | NFS3ERR_NOTSUPP | Operation not supported |

Internal errors are mapped to NFS status codes in pkg/metadata/errors.go.

As of v0.15.0 (Phase 09 ADAPT-03), every metadata.ErrorCode value is translated to an NFSv3 or NFSv4 status code by a single shared table in internal/adapter/common/errmap.go. The accessors are:

  • common.MapToNFS3(err) uint32 — NFSv3 status (e.g., NFS3ERR_NOENT)
  • common.MapToNFS4(err) uint32 — NFSv4 status (e.g., NFS4ERR_NOENT)

Both NFSv3 and NFSv4 handlers consume the same table — adding a new error code requires exactly one struct-literal row edit that populates all three protocol columns (NFSv3, NFSv4, SMB) at once. The Go type system enforces this: you cannot add a row without filling every column.

Unwrapping uses errors.As, so wrapped StoreError values (fmt.Errorf("...: %w", storeErr)) map correctly in every handler path.
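
The single-table idea combined with errors.As unwrapping can be sketched as follows. Every type and value here is a stand-in written for illustration: the real `StoreError`, `ErrorCode`, and table in internal/adapter/common/errmap.go may be shaped differently, and the fallback status is an assumption.

```go
package main

import (
	"errors"
	"fmt"
)

// ErrorCode and StoreError are hypothetical stand-ins for the metadata types.
type ErrorCode int

const (
	ErrNotFound ErrorCode = iota + 1
	ErrAccessDenied
)

type StoreError struct {
	Code ErrorCode
	Msg  string
}

func (e *StoreError) Error() string { return e.Msg }

// statusRow carries one status per protocol, so a new code is added in one place.
type statusRow struct{ NFS3, NFS4, SMB uint32 }

var errorMap = map[ErrorCode]statusRow{
	ErrNotFound:     {NFS3: 2, NFS4: 2, SMB: 0xC0000034},   // NOENT / STATUS_OBJECT_NAME_NOT_FOUND
	ErrAccessDenied: {NFS3: 13, NFS4: 13, SMB: 0xC0000022}, // ACCES / STATUS_ACCESS_DENIED
}

// mapToNFS3 unwraps err with errors.As and looks up the NFSv3 column.
func mapToNFS3(err error) uint32 {
	if err == nil {
		return 0 // NFS3_OK
	}
	var se *StoreError
	if errors.As(err, &se) {
		if row, ok := errorMap[se.Code]; ok {
			return row.NFS3
		}
	}
	return 10006 // NFS3ERR_SERVERFAULT as a fallback (assumption)
}

// wrapLookup simulates a handler adding context with %w.
func wrapLookup(err error) error { return fmt.Errorf("lookup failed: %w", err) }

func main() {
	err := wrapLookup(&StoreError{Code: ErrNotFound, Msg: "no such file"})
	fmt.Println(mapToNFS3(err)) // 2
}
```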

The NFSv3 audit wrapper at internal/adapter/nfs/xdr/errors.go (MapStoreErrorToNFSStatus) is preserved as a thin logging layer: its body calls common.MapToNFS3(err) and adds a severity-based log dispatch (Warn for client-side faults, Error for server-side I/O/space exhaustion) with structured fields (operation, code, message, path, client). Callers that want raw mapping call common.MapToNFS3 directly; callers that want audit output call xdr.MapStoreErrorToNFSStatus.

metadata.ErrLocked, ErrDeadlock, ErrGracePeriod, and other lock-operation codes can have different NFS status codes in lock context (NLM_LOCK / NFSv4 LOCK) versus general I/O context (READ/WRITE). The dedicated common.MapLockToNFS3 / common.MapLockToNFS4 accessors consult the parallel lockErrorMap table first and fall through to errorMap for non-lock codes. See internal/adapter/common/lock_errmap.go for the exact divergences. For example, ErrDeadlock maps to NFS4ERR_DEADLOCK in lock context and also to NFS4ERR_DEADLOCK in general context, because NFSv4 converged on a single code, whereas SMB diverges.
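
The two-table lookup order can be sketched like this. The tables and names are simplified stand-ins, and the one divergence shown (a locked range reported as NFS4ERR_DENIED to LOCK but NFS4ERR_LOCKED to READ/WRITE) is an assumption based on the NFSv4 specification, not a copy of lock_errmap.go.

```go
package main

import "fmt"

type errorCode int

const (
	errNotFound errorCode = iota + 1
	errLocked
	errDeadlock
)

// errorMap is the general-context table; values are NFSv4 status codes.
var errorMap = map[errorCode]uint32{
	errNotFound: 2,     // NFS4ERR_NOENT
	errLocked:   10012, // NFS4ERR_LOCKED
	errDeadlock: 10045, // NFS4ERR_DEADLOCK
}

// lockErrorMap holds only the codes that differ in lock context (assumed).
var lockErrorMap = map[errorCode]uint32{
	errLocked: 10010, // NFS4ERR_DENIED
}

// mapToNFS4 is the general-context accessor.
func mapToNFS4(code errorCode) uint32 {
	if st, ok := errorMap[code]; ok {
		return st
	}
	return 10006 // NFS4ERR_SERVERFAULT
}

// mapLockToNFS4 consults the lock table first, then falls through.
func mapLockToNFS4(code errorCode) uint32 {
	if st, ok := lockErrorMap[code]; ok {
		return st
	}
	return mapToNFS4(code)
}

func main() {
	fmt.Println(mapToNFS4(errLocked), mapLockToNFS4(errLocked))     // 10012 10010
	fmt.Println(mapToNFS4(errDeadlock), mapLockToNFS4(errDeadlock)) // 10045 10045
}
```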

test/e2e/cross_protocol_test.go:TestCrossProtocol_ErrorConformance table-drives every triggerable code through real NFS/SMB mounts and asserts the kernel delivers the expected errno. Exotic codes that cannot be e2e-triggered (quota, grace-period, connection-limit) are covered by internal/adapter/common/errmap_test.go:TestExoticErrorCodes. Both tiers iterate over the same common/ tables — adding a new code without adding a test case fails TestErrorMapCoverage at CI time.


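Mount Protocol:

| Procedure | Status | Notes |
|-----------|--------|-------|
| NULL | Implemented | |
| MNT | Implemented | |
| UMNT | Implemented | |
| UMNTALL | Implemented | |
| DUMP | Implemented | |
| EXPORT | Implemented | |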

Read Operations:

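| Procedure | Status | Notes |
|-----------|--------|-------|
| NULL | Implemented | |
| GETATTR | Implemented | |
| SETATTR | Implemented | |
| LOOKUP | Implemented | |
| ACCESS | Implemented | |
| READ | Implemented | |
| READDIR | Implemented | |
| READDIRPLUS | Implemented | |
| FSSTAT | Implemented | |
| FSINFO | Implemented | |
| PATHCONF | Implemented | |
| READLINK | Implemented | |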

Write Operations:

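| Procedure | Status | Notes |
|-----------|--------|-------|
| WRITE | Implemented | |
| CREATE | Implemented | |
| MKDIR | Implemented | |
| REMOVE | Implemented | |
| RMDIR | Implemented | |
| RENAME | Implemented | |
| LINK | Implemented | |
| SYMLINK | Implemented | |
| MKNOD | Implemented | Limited support |
| COMMIT | Implemented | |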

Total: 28 procedures implemented (6 mount + 22 NFS); MKNOD has limited support.

NFSv4.0 uses compound operations instead of individual RPC procedures. All operations are bundled into COMPOUND requests.

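| Operation | Status | Notes |
|-----------|--------|-------|
| ACCESS | Implemented | |
| CLOSE | Implemented | |
| COMMIT | Implemented | |
| CREATE | Implemented | |
| DELEGRETURN | Implemented | |
| GETATTR | Implemented | |
| GETFH | Implemented | |
| ILLEGAL | Implemented | |
| LINK | Implemented | |
| LOCK / LOCKT / LOCKU | Implemented | |
| LOOKUP | Implemented | |
| LOOKUPP | Implemented | |
| NVERIFY | Implemented | |
| NULL | Implemented | |
| OPEN | Implemented | |
| PUTFH | Implemented | |
| PUTPUBFH | Implemented | |
| PUTROOTFH | Implemented | |
| READ | Implemented | |
| READDIR | Implemented | |
| READLINK | Implemented | |
| REMOVE | Implemented | |
| RENAME | Implemented | |
| RENEW | Implemented | |
| RESTOREFH | Implemented | |
| SAVEFH | Implemented | |
| SECINFO | Implemented | |
| SETATTR | Implemented | |
| SETCLIENTID | Implemented | |
| VERIFY | Implemented | |
| WRITE | Implemented | |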

NFSv4.1 extends v4.0 with session-based operation, backchannel callbacks, and additional operations.

| Operation | Status | Notes |
|-----------|--------|-------|
| BACKCHANNEL_CTL | Implemented | |
| BIND_CONN_TO_SESSION | Implemented | |
| CREATE_SESSION | Implemented | |
| DESTROY_CLIENTID | Implemented | |
| DESTROY_SESSION | Implemented | |
| EXCHANGE_ID | Implemented | |
| FREE_STATEID | Implemented | |
| GET_DIR_DELEGATION | Implemented | Directory delegation with CB_NOTIFY |
| RECLAIM_COMPLETE | Implemented | |
| SEQUENCE | Implemented | |
| TEST_STATEID | Implemented | |

DittoFS includes an embedded portmapper (RFC 1057) that enables standard NFS service discovery without requiring a system-level rpcbind daemon.

NFS clients traditionally rely on a portmapper (port 111) to discover which port an NFS server is listening on. Without a portmapper, clients require explicit port options (-o port=12049,mountport=12049), and standard tools like rpcinfo and showmount do not work.

The embedded portmapper solves this by:

  • Registering all DittoFS services (NFS, MOUNT, NLM, NSM) automatically on startup
  • Responding to standard portmap queries via TCP and UDP
  • Running on an unprivileged port (default 10111) to avoid requiring root
  • Enabling rpcinfo and showmount to discover DittoFS services
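
For context on what a "standard portmap query" looks like on the wire, the sketch below builds a portmap version 2 GETPORT call (program 100000, procedure 3) with AUTH_NULL credentials, framed for TCP. The function name `buildGetPortCall` is made up for the example; it only constructs the bytes and does not send them.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// buildGetPortCall returns a record-marked GETPORT call asking which port
// serves (prog, vers) over the given IP protocol (6 = TCP, 17 = UDP).
func buildGetPortCall(xid, prog, vers, proto uint32) []byte {
	words := []uint32{
		xid, 0, 2, // XID, message type CALL, RPC version 2
		100000, 2, 3, // portmap program, version 2, procedure GETPORT
		0, 0, // credentials: AUTH_NULL with zero-length body
		0, 0, // verifier: AUTH_NULL with zero-length body
		prog, vers, proto, 0, // mapping argument; the port field is ignored
	}
	msg := make([]byte, 4+4*len(words))
	// Single fragment: last-fragment bit set plus the payload length.
	binary.BigEndian.PutUint32(msg, 0x80000000|uint32(4*len(words)))
	for i, w := range words {
		binary.BigEndian.PutUint32(msg[4+4*i:], w)
	}
	return msg
}

func main() {
	msg := buildGetPortCall(1, 100003, 3, 6) // where is NFSv3 over TCP?
	fmt.Println(len(msg))                    // 60
}
```

A successful reply carries the same XID, an accept state of SUCCESS, and a single unsigned integer with the port number (0 means the program is not registered).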

With the portmapper running, standard NFS tools work:

# Query registered services
rpcinfo -p localhost -n 10111
# Show available exports
showmount -e localhost

The portmapper is enabled by default. Configure it via dfsctl:

# Check current settings
dfsctl adapter settings nfs
# Change the portmapper port
dfsctl adapter settings nfs --set portmapper_port=10111
# Disable the portmapper entirely
dfsctl adapter settings nfs --set portmapper_enabled=false

Or via environment variables:

DITTOFS_ADAPTERS_NFS_PORTMAPPER_PORT=10111
DITTOFS_ADAPTERS_NFS_PORTMAPPER_ENABLED=false

The embedded portmapper follows standard security practices:

  • SET/UNSET restricted to localhost: Only local clients can register or unregister services
  • CALLIT (procedure 5) omitted: Prevents DDoS amplification attacks
  • Connection limits: TCP connections are capped at 64 concurrent
  • Non-privileged port: Default port 10111 avoids requiring root privileges

If the portmapper fails to start (e.g., port already in use), NFS continues to operate normally. Clients just need to specify ports explicitly in mount options.


When the portmapper runs on the standard port 111 (requires root or CAP_NET_BIND_SERVICE), NFS clients can auto-discover ports and mount commands are simplified:

# Configure portmapper on standard port (requires root)
dfsctl adapter settings nfs --set portmapper_port=111
# Linux - no port options needed, client queries portmapper automatically
sudo mkdir -p /mnt/nfs
sudo mount -t nfs -o tcp localhost:/export /mnt/nfs
# macOS
mkdir -p /tmp/nfs
mount -t nfs -o tcp localhost:/export /tmp/nfs

When the portmapper is disabled or running on a non-standard port, specify the NFS port explicitly:

# Linux
sudo mkdir -p /mnt/nfs
sudo mount -t nfs -o tcp,port=12049,mountport=12049 localhost:/export /mnt/nfs
# macOS (sudo not required)
mkdir -p /tmp/nfs
mount -t nfs -o tcp,port=12049,mountport=12049 localhost:/export /tmp/nfs
# macOS may require resvport on some configurations
mount -t nfs -o tcp,port=12049,mountport=12049,resvport localhost:/export /tmp/nfs
# Unmount
sudo umount /mnt/nfs # Linux
umount /tmp/nfs # macOS

dittofs/
+-- pkg/adapter/nfs/
|   +-- nfs_adapter.go # NFS adapter implementing Adapter interface
|   +-- nfs_connection.go # Connection handling
|   +-- config.go # NFS-specific configuration
|
+-- internal/adapter/nfs/
    +-- dispatch.go # Procedure routing
    +-- bufpool.go # Buffer pooling for performance
    +-- rpc/
    |   +-- message.go # RPC message structures
    |   +-- parser.go # RPC parsing and reply building
    |   +-- auth.go # Authentication parsing
    |   +-- constants.go # RPC constants
    +-- xdr/
    |   +-- decode.go # XDR decoding helpers
    |   +-- encode.go # XDR encoding helpers
    |   +-- attributes.go # File attribute encoding
    |   +-- filehandle.go # File handle utilities
    |   +-- time.go # NFS time format conversion
    +-- types/
    |   +-- constants.go # NFS constants
    |   +-- types.go # NFS type definitions
    +-- mount/handlers/
    |   +-- mount.go # MNT procedure
    |   +-- umount.go # UMNT procedure
    |   +-- export.go # EXPORT procedure
    |   +-- dump.go # DUMP procedure
    |   +-- constants.go # Mount protocol constants
    +-- v3/handlers/
    |   +-- null.go # NULL procedure
    |   +-- getattr.go # GETATTR procedure
    |   +-- setattr.go # SETATTR procedure
    |   +-- lookup.go # LOOKUP procedure
    |   +-- access.go # ACCESS procedure
    |   +-- read.go # READ procedure
    |   +-- write.go # WRITE procedure
    |   +-- create.go # CREATE procedure
    |   +-- mkdir.go # MKDIR procedure
    |   +-- remove.go # REMOVE procedure
    |   +-- rmdir.go # RMDIR procedure
    |   +-- rename.go # RENAME procedure
    |   +-- readdir.go # READDIR procedure
    |   +-- readdirplus.go # READDIRPLUS procedure
    |   +-- commit.go # COMMIT procedure
    +-- v4/handlers/
    |   +-- compound.go # COMPOUND request dispatch
    |   +-- handler.go # NFSv4 handler context
    |   +-- open.go # OPEN (stateful file access)
    |   +-- close.go # CLOSE
    |   +-- lock.go # LOCK / LOCKT / LOCKU
    |   +-- delegreturn.go # DELEGRETURN
    |   +-- setclientid.go # SETCLIENTID
    |   +-- secinfo.go # SECINFO
    |   +-- ... # All other v4.0 operations
    +-- v4/v41/handlers/
        +-- exchange_id.go # EXCHANGE_ID (client identification)
        +-- create_session.go # CREATE_SESSION
        +-- destroy_session.go # DESTROY_SESSION
        +-- sequence.go # SEQUENCE (slot management)
        +-- bind_conn_to_session.go
        +-- backchannel_ctl.go # Backchannel setup for CB_NOTIFY
        +-- get_dir_delegation.go # GET_DIR_DELEGATION
        +-- reclaim_complete.go # RECLAIM_COMPLETE
        +-- destroy_clientid.go # DESTROY_CLIENTID
        +-- free_stateid.go # FREE_STATEID
        +-- test_stateid.go # TEST_STATEID

  1. TCP connection accepted
  2. RPC message parsed (rpc/message.go)
  3. Program/version/procedure validated
  4. Auth context extracted (dispatch.go:ExtractAuthContext)
  5. Procedure handler dispatched
  6. Handler calls repository methods
  7. Response encoded and sent

Mount Protocol (internal/adapter/nfs/mount/handlers/)

  • MNT: Validates export access, records mount, returns root handle
  • UMNT: Removes mount record
  • EXPORT: Lists available exports
  • DUMP: Lists active mounts (can be restricted)

NFSv3 Core (internal/adapter/nfs/v3/handlers/)

  • LOOKUP: Resolve name in directory to file handle
  • GETATTR: Get file attributes
  • SETATTR: Update attributes (size, mode, times)
  • READ: Read file content (uses per-share block store)
  • WRITE: Write file content (coordinates metadata + per-share block store)
  • CREATE: Create file
  • MKDIR: Create directory
  • REMOVE: Delete file
  • RMDIR: Delete empty directory
  • RENAME: Move/rename file
  • READDIR / READDIRPLUS: List directory entries

NFSv4 Compound Operations (internal/adapter/nfs/v4/handlers/)

  • COMPOUND: Dispatches a sequence of operations in a single RPC call
  • OPEN / CLOSE: Stateful file access with share reservations
  • LOCK / LOCKT / LOCKU: Byte-range locking
  • DELEGRETURN: Return a delegation to the server
  • SECINFO: Security flavor negotiation

NFSv4.1 Session Operations (internal/adapter/nfs/v4/v41/handlers/)

  • EXCHANGE_ID: Client identification and capability negotiation
  • CREATE_SESSION / DESTROY_SESSION: Session lifecycle
  • SEQUENCE: Per-request slot and sequence management
  • GET_DIR_DELEGATION: Request directory delegation with CB_NOTIFY
  • BACKCHANNEL_CTL: Configure backchannel for server-initiated callbacks

WRITE operations require coordination between metadata and per-share block stores:

// 1. Update metadata (validates permissions, updates size/timestamps)
attr, preSize, preMtime, preCtime, err := metadataStore.WriteFile(handle, newSize, authCtx)
// 2. Resolve per-share block store from file handle
blockStore, err := rt.GetBlockStoreForHandle(ctx, handle)
// 3. Write actual data via per-share block store
err = blockStore.WriteAt(ctx, string(attr.PayloadID), data, offset)
// 4. Return updated attributes to client for cache consistency

The metadata store:

  • Validates write permission
  • Returns pre-operation attributes (for WCC data)
  • Updates file size if extended
  • Updates mtime/ctime timestamps
  • Ensures PayloadID exists (content-addressed block reference)

Large I/O operations use buffer pools (internal/adapter/nfs/bufpool.go):

  • Reduces GC pressure
  • Reuses buffers for READ/WRITE
  • Automatically sizes based on request
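
One common way to get these properties in Go is a set of size-classed sync.Pool instances. The sketch below is a generic illustration under that assumption; the size classes and the function names `getBuffer` and `putBuffer` are invented here and are not taken from bufpool.go.

```go
package main

import (
	"fmt"
	"sync"
)

// classes are illustrative buffer size classes: 4 KiB, 64 KiB, 1 MiB.
var classes = [...]int{4 << 10, 64 << 10, 1 << 20}

var pools [len(classes)]sync.Pool

func init() {
	for i := range pools {
		size := classes[i]
		// Pointers to slices are pooled so Put does not allocate.
		pools[i].New = func() any { b := make([]byte, size); return &b }
	}
}

// getBuffer returns a buffer of length n from the smallest class that fits.
// Requests larger than the biggest class get a one-off allocation.
func getBuffer(n int) *[]byte {
	for i, size := range classes {
		if n <= size {
			bp := pools[i].Get().(*[]byte)
			*bp = (*bp)[:n]
			return bp
		}
	}
	b := make([]byte, n)
	return &b
}

// putBuffer returns a pooled buffer to its class; oversized buffers are
// simply dropped and left to the garbage collector.
func putBuffer(bp *[]byte) {
	for i, size := range classes {
		if cap(*bp) == size {
			*bp = (*bp)[:size]
			pools[i].Put(bp)
			return
		}
	}
}

func main() {
	bp := getBuffer(5000) // served from the 64 KiB class
	fmt.Println(len(*bp), cap(*bp)) // 5000 65536
	putBuffer(bp)
}
```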
// internal/adapter/nfs/dispatch.go
// NFS dispatch table - maps procedure numbers to handlers
var NfsDispatchTable = map[uint32]*nfsProcedure{
	types.NFSProcNull:    {Name: "NULL", Handler: handleNFSNull},
	types.NFSProcGetAttr: {Name: "GETATTR", Handler: handleNFSGetAttr},
	types.NFSProcSetAttr: {Name: "SETATTR", Handler: handleNFSSetAttr},
	types.NFSProcLookup:  {Name: "LOOKUP", Handler: handleNFSLookup},
	types.NFSProcRead:    {Name: "READ", Handler: handleNFSRead},
	types.NFSProcWrite:   {Name: "WRITE", Handler: handleNFSWrite},
	// ... all 22 procedures
}

Each handler follows the same pattern:

  1. Check context cancellation
  2. Validate request
  3. Get stores from registry (metadata store + per-share block store)
  4. Perform operation via store methods
  5. Build and return response

DittoFS supports NFSv4.1 directory delegations (RFC 8881 Section 18.39), allowing clients to cache directory listings and receive change notifications instead of re-issuing READDIR after every mutation.

A directory delegation grants a client the right to cache the contents of a directory. While the delegation is held, the server sends CB_NOTIFY callbacks whenever the directory changes, so the client can update its cache without a round-trip READDIR.

Clients request directory delegations via the GET_DIR_DELEGATION operation, specifying a notification bitmask indicating which change types they want to receive:

| Notification Type | Value | Trigger |
|-------------------|-------|---------|
| NOTIFY4_CHANGE_CHILD_ATTRS | 0x01 | Child file/directory attributes changed |
| NOTIFY4_CHANGE_DIR_ATTRS | 0x02 | Directory’s own attributes changed (mode, owner, size) |
| NOTIFY4_REMOVE_ENTRY | 0x04 | Entry removed from directory (REMOVE, RMDIR) |
| NOTIFY4_ADD_ENTRY | 0x08 | Entry added to directory (CREATE, LINK, OPEN+CREATE) |
| NOTIFY4_RENAME_ENTRY | 0x10 | Entry renamed within directory (RENAME) |

The server may grant the delegation with a subset of the requested notification types.
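
Granting a subset amounts to intersecting the requested mask with what the server supports. A minimal sketch, using the mask values from the table above; the Go identifiers and the `grantMask` function are illustrative only.

```go
package main

import "fmt"

type notifyMask uint32

// Mask values as listed in the notification type table.
const (
	notifyChangeChildAttrs notifyMask = 1 << iota // 0x01
	notifyChangeDirAttrs                          // 0x02
	notifyRemoveEntry                             // 0x04
	notifyAddEntry                                // 0x08
	notifyRenameEntry                             // 0x10
)

// grantMask returns the notification types the server will actually deliver:
// only those the client asked for and the server supports.
func grantMask(requested, supported notifyMask) notifyMask {
	return requested & supported
}

func main() {
	requested := notifyAddEntry | notifyRemoveEntry | notifyRenameEntry
	supported := notifyAddEntry | notifyRemoveEntry | notifyChangeDirAttrs
	fmt.Printf("0x%02x\n", uint32(grantMask(requested, supported))) // 0x0c
}
```

The client then has to check the granted mask and fall back to READDIR for any change type it asked for but did not receive.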

Notifications are delivered via CB_NOTIFY over the NFSv4.1 backchannel:

  1. A directory mutation occurs (CREATE, REMOVE, RENAME, LINK, OPEN+CREATE, SETATTR)
  2. The server batches the notification into the delegation’s pending queue
  3. After a configurable batch window (default 50ms), all pending notifications are flushed as a single CB_NOTIFY callback
  4. If the batch queue exceeds 100 entries, an immediate flush is triggered

This batching reduces backchannel traffic when many mutations happen in quick succession (e.g., tar xf extracting files).
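
The window-plus-threshold behavior can be sketched with a mutex and time.AfterFunc. This is a generic batching pattern under stated assumptions, not the DittoFS implementation: the `batcher` type, its string payloads, and the `flush` callback standing in for a CB_NOTIFY send are all hypothetical.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// batcher queues notifications and flushes them either when the batch
// window elapses or when the queue grows past maxBatch entries.
type batcher struct {
	mu       sync.Mutex
	pending  []string
	timer    *time.Timer
	window   time.Duration
	maxBatch int
	flush    func(batch []string) // stand-in for sending one CB_NOTIFY
}

func newBatcher(windowMs, maxBatch int, flush func([]string)) *batcher {
	return &batcher{
		window:   time.Duration(windowMs) * time.Millisecond,
		maxBatch: maxBatch,
		flush:    flush,
	}
}

// Add queues one notification and starts the window timer if none is running.
func (b *batcher) Add(note string) {
	b.mu.Lock()
	b.pending = append(b.pending, note)
	var batch []string
	if len(b.pending) > b.maxBatch {
		batch = b.takeLocked() // queue exceeded the limit: flush immediately
	} else if b.timer == nil {
		b.timer = time.AfterFunc(b.window, b.onWindow)
	}
	b.mu.Unlock()
	if batch != nil {
		b.flush(batch)
	}
}

// onWindow runs when the batch window elapses.
func (b *batcher) onWindow() {
	b.mu.Lock()
	batch := b.takeLocked()
	b.mu.Unlock()
	if len(batch) > 0 {
		b.flush(batch)
	}
}

// takeLocked detaches the queue and cancels the timer; the caller holds mu.
func (b *batcher) takeLocked() []string {
	batch := b.pending
	b.pending = nil
	if b.timer != nil {
		b.timer.Stop()
		b.timer = nil
	}
	return batch
}

func main() {
	done := make(chan []string, 1)
	b := newBatcher(50, 100, func(batch []string) { done <- batch })
	b.Add("ADD a.txt")
	b.Add("ADD b.txt")
	fmt.Println(len(<-done)) // both notifications arrive in one batch
}
```

The flush callback is invoked outside the lock so a slow backchannel send does not block the operations that are queuing new notifications.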

Each directory-mutating NFSv4 operation triggers the appropriate notification:

| Operation | Notification Type | Details |
|-----------|-------------------|---------|
| CREATE | NOTIFY4_ADD_ENTRY | Parent directory notified of new entry |
| REMOVE | NOTIFY4_REMOVE_ENTRY | Parent directory notified; if removed entry is a directory with its own delegation, that delegation is immediately revoked |
| RENAME (same dir) | NOTIFY4_RENAME_ENTRY | Single notification with old and new names |
| RENAME (cross dir) | NOTIFY4_RENAME_ENTRY + NOTIFY4_ADD_ENTRY | Source directory gets RENAME, destination directory gets ADD |
| LINK | NOTIFY4_ADD_ENTRY | Target directory notified of new hard link entry |
| OPEN+CREATE | NOTIFY4_ADD_ENTRY | Parent directory notified when OPEN creates a new file |
| SETATTR (on dir) | NOTIFY4_CHANGE_DIR_ATTRS | Only for significant changes (mode, owner, group, size); atime-only changes are filtered |

When a client modifies a directory that another client has delegated:

  1. Client B sends a mutation (e.g., CREATE) to a directory delegated to Client A
  2. The server detects the conflict via OriginClientID in the notification
  3. Client A’s delegation is recalled via CB_RECALL (non-blocking)
  4. Client B’s operation proceeds immediately (no waiting for recall completion)
  5. If Client A does not return the delegation within the lease period, it is forcibly revoked

When a directory is deleted (REMOVE/RMDIR), any directory delegations on that directory are immediately revoked (not just recalled). Since the directory no longer exists, there is no point in waiting for the client to return the delegation.

Directory delegation settings are managed via dfsctl adapter settings nfs:

| Setting | Default | Description |
|---------|---------|-------------|
| delegations_enabled | true | Enable/disable all delegations (file and directory) |
| max_delegations | 10000 | Maximum concurrent delegations across all clients |
| dir_deleg_batch_window_ms | 50 | Notification batch window in milliseconds |

# Enable delegations
dfsctl adapter settings nfs --set delegations_enabled=true
# Set maximum delegations
dfsctl adapter settings nfs --set max_delegations=1000
# Adjust batch window (lower = more responsive, higher = less backchannel traffic)
dfsctl adapter settings nfs --set dir_deleg_batch_window_ms=100

Directory delegation metrics are exposed alongside file delegation metrics with a type label:

| Metric | Type | Labels | Description |
|--------|------|--------|-------------|
| dittofs_nfs_delegations_granted_total | Counter | type (file/directory) | Total delegations granted |
| dittofs_nfs_delegations_recalled_total | Counter | type, reason | Total delegations recalled |
| dittofs_nfs_delegations_active | Gauge | type (file/directory) | Currently active delegations |
| dittofs_nfs_dir_notifications_sent_total | Counter | - | Total CB_NOTIFY batches sent |

  • Ephemeral state: Directory delegations are lost on server restart (in-memory only)
  • Linux client support: The Linux NFS client does not currently request directory delegations; this feature is primarily useful for custom NFSv4.1 clients
  • No persistent notification queue: If the backchannel is unavailable when notifications flush, they are silently dropped

# Start server
./dfs start -log-level DEBUG
# Mount and test operations
sudo mount -t nfs -o tcp,port=12049,mountport=12049 localhost:/export /mnt/test
cd /mnt/test
# Test operations
ls -la # READDIR / READDIRPLUS
cat readme.txt # READ
echo "test" > new # CREATE + WRITE
mkdir foo # MKDIR
rm new # REMOVE
rmdir foo # RMDIR
mv file1 file2 # RENAME
ln -s target link # SYMLINK
ln file2 file3 # LINK (hard link; file1 was renamed to file2 above)

# Run unit tests
go test ./...
# Run E2E tests (requires NFS client installed)
go test -v -timeout 30m ./test/e2e/...
# Run specific E2E suite
go test -v ./test/e2e -run TestE2E/memory/BasicOperations

| Term | Definition |
|------|------------|
| AUTH_NULL | No authentication flavor (flavor 0) |
| AUTH_UNIX | Unix-style authentication with UID/GID (flavor 1) |
| Backchannel | Server-to-client connection used for callbacks (NFSv4.1) |
| CB_NOTIFY | Callback operation for directory change notifications |
| COMPOUND | NFSv4 request containing multiple operations |
| Cookie | Opaque value used for directory iteration (READDIR) |
| Delegation | Server grants client exclusive or shared caching rights |
| EOF | End of file indicator in READ responses |
| Export | A directory shared via NFS (like an SMB share) |
| File Handle | Opaque identifier for a file/directory (max 64 bytes) |
| ftype3 | File type enum (regular, directory, symlink, etc.) |
| FSID | File system identifier |
| nfstime3 | NFS time format (seconds + nanoseconds) |
| RPCSEC_GSS | Kerberos-based RPC security flavor (NFSv4) |
| RPC | Remote Procedure Call, the foundation protocol |
| sattr3 | Set attributes structure (for SETATTR, CREATE) |
| Session | NFSv4.1 construct tracking client connection state |
| Stale Handle | A handle that is no longer valid |
| Verifier | Server-unique value that changes on restart |
| WCC | Weak Cache Consistency data (pre/post attributes) |
| XDR | External Data Representation (encoding format) |
| XID | Transaction ID for matching requests/replies |

  • RFC 1057 - RPC: Remote Procedure Call Protocol (Portmapper)
  • RFC 1094 - NFS: Network File System Protocol (Version 2)
  • RFC 1813 - NFS Version 3 Protocol Specification
  • RFC 4506 - XDR: External Data Representation Standard
  • RFC 5531 - RPC: Remote Procedure Call Protocol Specification Version 2
  • RFC 7530 - NFS Version 4 Protocol
  • RFC 8881 - NFS Version 4 Minor Version 1 Protocol
  • go-nfs - Another NFS implementation in Go
  • FUSE - Filesystem in Userspace