Development

Diminuendo auto-spawns a single-node Concordance:
```shell
cd diminuendo
bun run dev
# Internally spawns: bun run concordance/service/main.ts --bootstrap --listen=127.0.0.1:4100
```
Same binary, same API, same code path as production. The only difference is that ConcordanceProcess spawns a child process in development and is a no-op in production.

Production (3-Node Raft Cluster)

```shell
# Node 1 (bootstrap as initial leader)
concordance --node-id=node1 --listen=0.0.0.0:4100 \
  --data-dir=/var/lib/concordance --bootstrap \
  --api-key=$CONCORDANCE_API_KEY --agent-key=$CONCORDANCE_AGENT_API_KEY

# Node 2 (join cluster)
concordance --node-id=node2 --listen=0.0.0.0:4100 \
  --data-dir=/var/lib/concordance --join=node1:4100 \
  --api-key=$CONCORDANCE_API_KEY --agent-key=$CONCORDANCE_AGENT_API_KEY

# Node 3 (join cluster)
concordance --node-id=node3 --listen=0.0.0.0:4100 \
  --data-dir=/var/lib/concordance --join=node1:4100 \
  --api-key=$CONCORDANCE_API_KEY --agent-key=$CONCORDANCE_AGENT_API_KEY
```
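Once node2 and node3 have joined, every node should report the same leader. A small helper to pull it out of the status response (a sketch; the `leader` field name is an assumption about the `/api/v1/cluster/status` payload):

```shell
# leader_of: extract the "leader" field from a cluster-status JSON blob
# read from stdin. jq-free on purpose, so it works on bare hosts.
leader_of() {
  sed -n 's/.*"leader"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p'
}

# Against a live cluster:
# curl -s -H "Authorization: Bearer $CONCORDANCE_API_KEY" \
#   http://node1:4100/api/v1/cluster/status | leader_of
```

Run it against each node in turn; the answers should agree once the joins complete.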

Build Standalone Binary

```shell
bun build concordance/service/main.ts --compile --outfile=concordance
```

Rolling Upgrade

Quorum is maintained throughout, because each node is removed from the membership before it goes down:
  1. Remove node 3: POST /api/v1/cluster/remove {"nodeId": "node3"}
  2. Stop node 3, upgrade binary, restart with --join=node1:4100
  3. Repeat for node 2
  4. Transfer leadership away from node 1, then upgrade node 1
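The follower half of the procedure (steps 1–3) can be scripted. A sketch, assuming the upgraded binary is staged locally as `./concordance`, each node is reachable over SSH by its node id, and Concordance runs under a `concordance` systemd unit that passes `--join=node1:4100` on start; only the remove endpoint and the `--join` flag come from this document, everything else is deployment-specific:

```shell
# upgrade_followers: remove, upgrade, and rejoin each follower in turn.
upgrade_followers() {
  local node
  for node in node3 node2; do
    # Drop the follower from membership first so the vote count shrinks
    # with it and the remaining nodes keep quorum.
    curl -fsS -X POST http://node1:4100/api/v1/cluster/remove \
      -H "Authorization: Bearer $CONCORDANCE_API_KEY" \
      -d "{\"nodeId\": \"$node\"}"

    # Stop, swap the binary, restart; the restarted unit rejoins via
    # --join=node1:4100.
    ssh "$node" systemctl stop concordance
    scp ./concordance "$node:/usr/local/bin/concordance"
    ssh "$node" systemctl start concordance
  done
}
```

Leadership transfer for node 1 (step 4) is left manual here, since it depends on how your deployment triggers it.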

Replace Failed Node

```shell
# Remove dead node from cluster
curl -X POST http://leader:4100/api/v1/cluster/remove \
  -H "Authorization: Bearer $CONCORDANCE_API_KEY" \
  -d '{"nodeId": "node2"}'

# Start replacement with new data dir
concordance --node-id=node2-new --listen=0.0.0.0:4100 \
  --data-dir=/var/lib/concordance-new --join=node1:4100
```

The new node receives a snapshot + log catch-up automatically.
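Once the replacement has joined, confirm it shows up in the membership reported by the status endpoint. A jq-free sketch (matching on the quoted node id is an assumption about the payload shape):

```shell
# is_member: succeed if the given node id appears in a cluster-status
# JSON blob read from stdin.
is_member() {
  grep -q "\"$1\"" -
}

# Against a live cluster:
# curl -s http://node1:4100/api/v1/cluster/status | is_member node2-new && echo joined
```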

Failure Behavior

| Scenario | Impact | Recovery |
| --- | --- | --- |
| 1 node down | Reads and writes continue (quorum intact) | New leader elected in under 3s |
| 2 nodes down | Cluster loses quorum; writes fail | Restart nodes; auto-rejoin |
| Network partition | Majority side continues; minority is read-only | Auto-heals on reconnect |
| Leader crash | Uncommitted writes lost | Followers elect new leader; clients retry |
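The first two rows follow from Raft's majority rule: a write commits only once floor(N/2) + 1 nodes acknowledge it, so an N-node cluster survives N minus that many failures:

```shell
# Majority size for an N-node Raft cluster: writes need this many
# acknowledgements, so N - quorum(N) simultaneous failures are survivable.
quorum() { echo $(( $1 / 2 + 1 )); }

quorum 3   # prints 2: one failure survivable
quorum 5   # prints 3: two failures survivable
```

This is why losing one node of three leaves writes flowing, while losing two stops them until a node returns.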

Health Monitoring

```shell
# Quick health check
curl http://node1:4100/api/v1/health

# Cluster status (role, leader, term, peers)
curl http://node1:4100/api/v1/cluster/status

# Ready check (leader elected?)
curl http://node1:4100/api/v1/ready
```
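In deploy scripts, the ready check is most useful behind a retry loop. A minimal polling helper (a sketch; the attempt count and delay are arbitrary):

```shell
# wait_for_ready: run a probe command until it succeeds, up to a limit.
# Usage: wait_for_ready <max_attempts> <delay_seconds> <command...>
wait_for_ready() {
  local attempts="$1" delay="$2" i
  shift 2
  for i in $(seq 1 "$attempts"); do
    "$@" >/dev/null 2>&1 && return 0
    sleep "$delay"
  done
  return 1
}

# Block until the cluster has elected a leader:
# wait_for_ready 30 1 curl -fsS http://node1:4100/api/v1/ready
```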