Component Overview

Concordance runs as a single Bun process per node. All components are wired together in main.ts via callback injection — no dependency injection framework, no service locator. Bun’s single-threaded event loop serializes all message handling naturally, eliminating the need for mutexes or locks.
main.ts
 └── Bun.serve()
      ├── HTTP Router         # REST API for KV operations + cluster management
      ├── WsHandler (/stream) # Client JSON-RPC over WebSocket
      ├── Raft Transport      # Peer-to-peer WebSocket (/raft)
      ├── RaftNode            # Consensus algorithm
      ├── RaftLog             # SQLite-backed durable log (raft.db)
      ├── FSM                 # Applies committed entries to KV store
      ├── KvStore             # SQLite-backed KV storage (state.db)
      ├── PeerManager         # WebSocket connections to other nodes
      └── SnapshotManager     # Full-state snapshots for log compaction

Raft Consensus

Concordance implements the Raft consensus algorithm from Ongaro and Ousterhout (2014). Every write goes through Raft to guarantee linearizable consistency across all nodes.

Roles

Each node is in one of three states at any time:
| Role      | Description                                                                            |
|-----------|----------------------------------------------------------------------------------------|
| Leader    | Accepts client writes, replicates log entries to followers, sends heartbeats           |
| Follower  | Accepts log entries from the leader, votes in elections, serves reads from local state |
| Candidate | Runs for leader after its election timeout expires                                     |

Election

  1. A follower’s election timer expires (randomized 150-300ms)
  2. It increments its term, votes for itself, transitions to candidate
  3. It sends RequestVote RPCs to all peers
  4. If it receives votes from a majority (quorum), it becomes leader
  5. The new leader sends an immediate heartbeat (empty AppendEntries) to assert authority
  6. A noop entry is committed to establish the leader’s term in the log
The randomized election timeout (150-300ms) prevents split votes. The heartbeat interval (50ms) is well below the minimum election timeout, ensuring followers do not start unnecessary elections.
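The timing relationship above can be sketched as follows. The constants mirror the defaults from the Timing Parameters table; the function name is illustrative, not the actual identifier in the codebase:

```typescript
// Defaults from the Timing Parameters table; names are illustrative.
const ELECTION_TIMEOUT_MIN_MS = 150;
const ELECTION_TIMEOUT_MAX_MS = 300;
const HEARTBEAT_INTERVAL_MS = 50;

// Each node picks a fresh random timeout for every election cycle,
// so two nodes rarely time out at the same instant.
function randomElectionTimeout(): number {
  const span = ELECTION_TIMEOUT_MAX_MS - ELECTION_TIMEOUT_MIN_MS;
  return ELECTION_TIMEOUT_MIN_MS + Math.floor(Math.random() * span);
}
```

Because every follower re-randomizes on each cycle, even a split vote resolves quickly: the next round almost always has a unique earliest timer.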

Log Replication

When the leader receives a write:
  1. It appends the command to its local log (SQLite raft.db)
  2. It sends AppendEntries RPCs to all followers
  3. Each follower checks log consistency at prevLogIndex / prevLogTerm
  4. On success, the follower appends the entries and acknowledges
  5. On failure (log gap or conflict), the leader decrements nextIndex and retries
  6. Once a majority of nodes have acknowledged, the entry is committed
  7. The leader advances commitIndex and resolves the client’s pending promise
  8. Followers learn of the new commitIndex via the next AppendEntries and apply locally
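The consistency check in step 3 can be sketched over an in-memory log (the real check runs against the SQLite-backed log; the function name is illustrative):

```typescript
interface LogEntry { term: number; }

// Follower-side check from step 3: the entry at prevLogIndex must exist
// and carry prevLogTerm. Raft log indices are 1-based; prevLogIndex 0
// means "empty prefix", which always matches.
function prefixMatches(log: LogEntry[], prevLogIndex: number, prevLogTerm: number): boolean {
  if (prevLogIndex === 0) return true;
  const entry = log[prevLogIndex - 1];
  return entry !== undefined && entry.term === prevLogTerm;
}
```

A `false` result here is what triggers step 5: the leader decrements `nextIndex` for that follower and retries with an earlier `prevLogIndex` until the prefixes agree.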

Timing Parameters

| Parameter                     | Default        | Notes                                      |
|-------------------------------|----------------|--------------------------------------------|
| Election timeout              | 150-300ms      | Randomized per node to prevent split votes |
| Heartbeat interval            | 50ms           | Must be well below election timeout        |
| Max entries per AppendEntries | 100            | Caps batch size for replication RPCs       |
| Snapshot threshold            | 10,000 entries | Log compaction trigger                     |
| Proposal timeout              | 5,000ms        | Max wait for quorum commit                 |

Log Entry Types

| Type    | Value | Description                                     |
|---------|-------|-------------------------------------------------|
| Command | 0     | KV operation (set, delete, batch)               |
| Config  | 1     | Cluster membership change                       |
| Noop    | 2     | Leader establishment (committed after election) |
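In TypeScript these wire values map naturally onto an enum (a sketch; the actual declaration in the codebase may differ):

```typescript
// Wire values from the table above; stored in the `type` column of raft.db.
enum EntryType {
  Command = 0, // KV operation (set, delete, batch)
  Config = 1,  // cluster membership change
  Noop = 2,    // leader establishment, committed after election
}
```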

SQLite Storage

Each node maintains two SQLite databases:

state.db (KV Store)

Stores the actual key-value data and change history. Uses WAL mode with PRAGMA synchronous=NORMAL for performance.
CREATE TABLE kv (
    namespace   TEXT    NOT NULL,
    key         TEXT    NOT NULL,
    value       TEXT    NOT NULL,       -- JSON-serialized
    version     INTEGER NOT NULL DEFAULT 1,
    updated_by  TEXT    NOT NULL,
    updated_at  INTEGER NOT NULL,       -- Unix timestamp ms
    expires_at  INTEGER,                -- Optional TTL
    PRIMARY KEY (namespace, key)
);

CREATE TABLE changelog (
    seq         INTEGER PRIMARY KEY AUTOINCREMENT,
    namespace   TEXT    NOT NULL,
    key         TEXT    NOT NULL,
    op          TEXT    NOT NULL CHECK(op IN ('set', 'delete', 'expire')),
    value       TEXT,                   -- JSON-serialized (null for deletes)
    version     INTEGER NOT NULL,
    actor       TEXT    NOT NULL,
    timestamp   INTEGER NOT NULL,
    tenant_id   TEXT    NOT NULL
);
Indexes: kv(namespace), kv(expires_at) WHERE expires_at IS NOT NULL, changelog(tenant_id, seq).

raft.db (Raft Log)

Stores the Raft log and persistent metadata. Uses PRAGMA synchronous=FULL because Raft correctness requires that committed entries survive crashes.
CREATE TABLE log (
    idx   INTEGER PRIMARY KEY,
    term  INTEGER NOT NULL,
    type  INTEGER NOT NULL,     -- 0=Command, 1=Config, 2=Noop
    data  BLOB    NOT NULL      -- Serialized command
);

CREATE TABLE meta (
    key   TEXT PRIMARY KEY,
    value TEXT NOT NULL          -- currentTerm, votedFor
);

Why Two Databases

Separating state from Raft log enables independent lifecycle management:
  • Snapshots serialize state.db as a single Uint8Array via bun:sqlite’s db.serialize(). The Raft log can then be compacted (entries deleted) without affecting the KV state.
  • Durability tradeoffs differ: the Raft log needs synchronous=FULL (every write is fsync’d), while the KV store uses synchronous=NORMAL (faster, safe because Raft guarantees replay on crash).
  • Recovery is straightforward: if state.db is lost, replay the Raft log. If raft.db is lost, request a snapshot from the leader.

Namespace Design

All keys are organized under tenant-scoped namespaces. The namespace is the first component of the compound primary key in state.db.

Format

tenant:{tenantId}/{scope_path}

Scope Resolution

Clients specify a high-level scope (e.g., “user”, “tenant”, “session”). Diminuendo resolves this to a full namespace using the authenticated identity:
| Client Scope | Resolved Namespace                    | Additional Param   |
|--------------|---------------------------------------|--------------------|
| user         | tenant:{tid}/user:{uid}/preferences   |                    |
| tenant       | tenant:{tid}/settings                 |                    |
| session      | tenant:{tid}/sessions/{sid}           | sessionId required |
| project      | tenant:{tid}/projects/{pid}           | projectId required |
| device       | tenant:{tid}/user:{uid}/devices/{did} | deviceId optional  |
Additional internal namespaces (credentials, automations, skills, integrations, audit) are accessed by Diminuendo directly and are not exposed to clients as scopes.
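A minimal sketch of that resolution table, assuming an authenticated identity carrying `tenantId` and `userId`. The function, types, and the fallback for an omitted deviceId are all hypothetical; the real resolver lives in Diminuendo:

```typescript
interface Identity { tenantId: string; userId: string; }

// Hypothetical resolver mirroring the scope table above.
function resolveNamespace(scope: string, id: Identity, param?: string): string {
  const t = `tenant:${id.tenantId}`;
  switch (scope) {
    case "user":
      return `${t}/user:${id.userId}/preferences`;
    case "tenant":
      return `${t}/settings`;
    case "session":
      if (!param) throw new Error("sessionId required");
      return `${t}/sessions/${param}`;
    case "project":
      if (!param) throw new Error("projectId required");
      return `${t}/projects/${param}`;
    case "device":
      // deviceId is optional; addressing the bare devices prefix when it
      // is omitted is an assumption of this sketch.
      return param
        ? `${t}/user:${id.userId}/devices/${param}`
        : `${t}/user:${id.userId}/devices`;
    default:
      throw new Error(`unknown scope: ${scope}`);
  }
}
```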

Tenant Isolation

The extractTenantId() function parses the tenant ID from any namespace string. This enables:
  • Filtered change polling: GET /api/v1/changes?tenant=acme returns only changes for that tenant
  • Scoped pub/sub: changes are published to both tenant:{id} (broad) and ns:{namespace} (targeted) topics
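Since the tenant ID is always the first path component, extraction is a simple prefix parse. A plausible sketch of `extractTenantId()` (the actual implementation may differ):

```typescript
// Parse the tenant ID out of a namespace such as
// "tenant:acme/user:u1/preferences"; returns null if the prefix is absent.
function extractTenantId(namespace: string): string | null {
  const match = /^tenant:([^/]+)/.exec(namespace);
  return match ? match[1] : null;
}
```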

Pub/Sub Flow

Real-time change notifications flow through Bun’s native pub/sub system:
Raft commit
     │
     ▼
FSM.apply()
     ├──► KvStore.set() or .delete()   (writes to state.db)
     │
     └──► FSM.publishChange()
              ├──► server.publish("tenant:acme", notification)
              │         └──► All WebSocket clients subscribed to tenant:acme
              │
              └──► server.publish("ns:tenant:acme/user:u1/preferences", notification)
                        └──► WebSocket clients subscribed to that specific namespace
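The dual-topic publish can be sketched as a free function, with the publish side abstracted as a callback (in the real code this is a method on the FSM calling `server.publish`; names here are illustrative):

```typescript
type Publish = (topic: string, payload: string) => void;

// Publish one change to both the broad per-tenant topic and the
// targeted per-namespace topic; returns the topics used.
function publishChange(publish: Publish, namespace: string, notification: object): string[] {
  const topics: string[] = [];
  const match = /^tenant:([^/]+)/.exec(namespace);
  if (match) topics.push(`tenant:${match[1]}`); // broad: every change in the tenant
  topics.push(`ns:${namespace}`);               // targeted: one specific namespace
  const payload = JSON.stringify(notification);
  for (const topic of topics) publish(topic, payload);
  return topics;
}
```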

Subscription Lifecycle

  1. Client connects to /stream WebSocket endpoint (with Bearer token auth)
  2. Client sends watch/subscribe with a namespace
  3. Server calls ws.subscribe("ns:{namespace}") (Bun’s native topic subscription)
  4. When a change occurs in that namespace, the FSM publishes a watch/change notification
  5. Bun routes the message to all subscribed sockets automatically
  6. On disconnect, all subscriptions are cleaned up via ws.unsubscribe()
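A watch/subscribe request following this lifecycle might look like the message below. Only the method name appears above; the exact params shape is an assumption of this sketch:

```json
{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "watch/subscribe",
  "params": {
    "namespace": "tenant:acme/user:u1/preferences"
  }
}
```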
The notification payload is a JSON-RPC notification (no id, no response expected):
{
  "jsonrpc": "2.0",
  "method": "watch/change",
  "params": {
    "seq": 42,
    "op": "set",
    "namespace": "tenant:acme/user:u1/preferences",
    "key": "theme",
    "value": "dark",
    "version": 3,
    "actor": "user:u1",
    "timestamp": 1710700000000,
    "tenantId": "acme"
  }
}

Write Path (Detailed)

A complete write operation flows through these stages:
  1. Client request arrives via HTTP PUT or WebSocket kv/set
  2. Leader check: if this node is not the leader, return a 307 redirect (HTTP) or -32001 Not Leader error (WebSocket) with the leader’s address
  3. Proposal: the leader serializes the command to JSON bytes and calls raft.propose(data)
  4. Log append: the entry is written to raft.db with synchronous=FULL
  5. Replication: AppendEntries RPCs are sent to all followers
  6. Quorum: once a majority acknowledges, commitIndex advances
  7. Apply: raft.onApply calls fsm.apply(), which writes to state.db
  8. Pub/sub: the FSM publishes the change event to Bun’s topic system
  9. Response: the pending promise resolves, and the client receives the result
For a 3-node cluster, a write requires acknowledgment from 2 of 3 nodes (the leader plus one follower).
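The quorum arithmetic behind step 6 is the standard Raft majority (function name illustrative):

```typescript
// Majority size for an n-node cluster: floor(n/2) + 1.
// 3 nodes -> 2 acks, 5 nodes -> 3 acks.
function quorumSize(clusterSize: number): number {
  return Math.floor(clusterSize / 2) + 1;
}
```

Note that an even-sized cluster gains no fault tolerance over the next smaller odd size: a 4-node cluster needs 3 acks and so tolerates the same single failure as a 3-node cluster.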

Snapshots

When the Raft log grows beyond 10,000 entries (configurable via snapshotThreshold), the SnapshotManager creates a snapshot:
  1. store.serialize() produces the entire state.db as a Uint8Array
  2. The Raft log is compacted: all entries up to lastIncludedIndex are deleted from raft.db
  3. Snapshot metadata (lastIncludedIndex, lastIncludedTerm) is stored in memory
When a follower falls too far behind, the leader sends an InstallSnapshot notice followed by the binary snapshot data over WebSocket. The follower:
  1. Receives the JSON notice with metadata
  2. Receives the binary state.db data
  3. Writes the snapshot to disk and opens a new KvStore
  4. Compacts its Raft log up to lastIncludedIndex
For a configuration database (typically under 10MB), the entire snapshot fits in a single WebSocket binary message.
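The compaction step can be sketched over an in-memory, 1-indexed log; the real code deletes rows from raft.db instead (types and names are illustrative):

```typescript
interface SnapshotMeta { lastIncludedIndex: number; lastIncludedTerm: number; }

// After a snapshot, every entry at or below lastIncludedIndex is covered
// by the snapshot and can be discarded from the log.
function compact<T>(log: T[], meta: SnapshotMeta): T[] {
  return log.slice(meta.lastIncludedIndex);
}
```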

Single-Port Design

All traffic flows through one Bun.serve() listener (default :4100):
| Path      | Protocol  | Purpose                                                  |
|-----------|-----------|----------------------------------------------------------|
| /api/v1/* | HTTP      | REST API for KV operations and cluster management        |
| /stream   | WebSocket | Client JSON-RPC (reads, writes, subscriptions)           |
| /raft     | WebSocket | Peer-to-peer Raft messages (JSON) and snapshots (binary) |
This simplifies deployment: one port to expose, one TLS cert, one load balancer target. The /raft endpoint uses a nodeId query parameter to identify the connecting peer.
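The dispatch over those three paths reduces to a small classifier (an illustrative sketch; the real fetch and websocket handlers in main.ts do more, including auth and upgrade handling):

```typescript
// Route an incoming request path to one of the three surfaces above.
function classify(pathname: string): "rest" | "client-stream" | "raft" | "unknown" {
  if (pathname.startsWith("/api/v1/")) return "rest";
  if (pathname === "/stream") return "client-stream";
  if (pathname === "/raft") return "raft";
  return "unknown";
}
```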