Instance Creation

1. Client → Gateway: POST /api/v1/instances {deployment_id, config}
2. Gateway → Valkey: Find least-loaded coordinator
3. Gateway → Coordinator: POST /instances {deployment_id, secrets}
4. Coordinator:
   a. ensure_deployment() — download from S3 if missing
   b. Create instance directory structure
   c. Mount Chronicle FUSE filesystem
   d. Spawn agent process (Bun for TS, PyO3 shim for Python)
   e. Initialize IPC channel (Unix socket)
5. Coordinator → Gateway: {instance_id, agent_id}
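
Step 2 can be illustrated with a small selection routine. This is a minimal sketch, not the gateway's actual code: it assumes the per-coordinator load has already been read from Valkey into a mapping from coordinator id to running-instance count. The function name `pick_least_loaded` and the coordinator ids are hypothetical.

```python
from typing import Mapping, Optional


def pick_least_loaded(loads: Mapping[str, int]) -> Optional[str]:
    """Return the coordinator id with the smallest load, or None if none exist.

    Ties are broken by coordinator id so the choice is deterministic.
    """
    if not loads:
        return None
    return min(loads, key=lambda cid: (loads[cid], cid))


# Hypothetical snapshot: coordinator id -> number of running instances.
snapshot = {"coord-a": 12, "coord-b": 3, "coord-c": 3}
chosen = pick_least_loaded(snapshot)
```

Breaking ties deterministically makes routing decisions easier to reproduce when debugging; a real gateway might prefer a random tie-break to spread bursts more evenly.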

State Machine

Agent instances follow a lifecycle state machine:
         ┌──────────────────┐
         │   Deploying      │
         └────────┬─────────┘
                  │
         ┌────────▼─────────┐
         │   Initializing   │  onInit() called
         └────────┬─────────┘
                  │
         ┌────────▼─────────┐
         │      Ready       │  Waiting for messages
         └────────┬─────────┘
                  │ message arrives
         ┌────────▼─────────┐
         │     Running      │  onMessage() executing
         └────────┬─────────┘
                  │ response complete
                  ▼
              Ready (loop)
                  │ idle timeout
         ┌────────▼─────────┐
         │   Scaling Down   │  onIdleTimeout() called
         └────────┬─────────┘
                  │
         ┌────────▼─────────┐
         │   Shutting Down  │  onShutdown() called, state checkpointed
         └────────┬─────────┘
                  │
         ┌────────▼─────────┐
         │    Terminated    │  Process exited, Chronicle unmounted
         └──────────────────┘
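
The lifecycle above can be expressed as a transition table. The sketch below encodes only the edges drawn in the diagram; the names `State`, `TRANSITIONS`, and `advance` are hypothetical, and the case where an agent defers an idle timeout is not drawn in the diagram and is therefore not modeled here.

```python
from enum import Enum


class State(Enum):
    DEPLOYING = "deploying"
    INITIALIZING = "initializing"
    READY = "ready"
    RUNNING = "running"
    SCALING_DOWN = "scaling_down"
    SHUTTING_DOWN = "shutting_down"
    TERMINATED = "terminated"


# Allowed transitions, taken edge by edge from the diagram.
TRANSITIONS = {
    State.DEPLOYING: {State.INITIALIZING},
    State.INITIALIZING: {State.READY},
    State.READY: {State.RUNNING, State.SCALING_DOWN},  # message arrives / idle timeout
    State.RUNNING: {State.READY},                      # response complete
    State.SCALING_DOWN: {State.SHUTTING_DOWN},
    State.SHUTTING_DOWN: {State.TERMINATED},
    State.TERMINATED: set(),                           # terminal state
}


def advance(current: State, target: State) -> State:
    """Return target if the transition is allowed, otherwise raise ValueError."""
    if target not in TRANSITIONS[current]:
        raise ValueError(f"illegal transition: {current.name} -> {target.name}")
    return target
```

Keeping the edges in one table means an illegal jump such as Running to Terminated is rejected in a single place rather than in each handler.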

IPC Protocol

Communication between the coordinator and agent processes uses Unix domain sockets with length-prefixed JSON messages.
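
Length-prefixed framing can be sketched as follows. The text does not specify the header layout, so the 4-byte big-endian unsigned length used here is an assumption, as are the helper names `encode_frame` and `decode_frames`.

```python
import json
import struct
from typing import Any, Dict, List, Tuple

# Assumption: each frame starts with a 4-byte big-endian unsigned payload length.
HEADER = struct.Struct(">I")


def encode_frame(message: Dict[str, Any]) -> bytes:
    """Serialize a message as UTF-8 JSON and prepend its byte length."""
    payload = json.dumps(message, separators=(",", ":")).encode("utf-8")
    return HEADER.pack(len(payload)) + payload


def decode_frames(buffer: bytes) -> Tuple[List[Dict[str, Any]], bytes]:
    """Parse every complete frame in buffer.

    Returns the decoded messages and the unconsumed remainder, which the
    caller should keep and prepend to the next chunk read from the socket.
    """
    messages: List[Dict[str, Any]] = []
    offset = 0
    while len(buffer) - offset >= HEADER.size:
        (length,) = HEADER.unpack_from(buffer, offset)
        end = offset + HEADER.size + length
        if end > len(buffer):
            break  # payload not fully received yet
        body = buffer[offset + HEADER.size:end]
        messages.append(json.loads(body.decode("utf-8")))
        offset = end
    return messages, buffer[offset:]
```

Returning the remainder matters because a stream socket may deliver half a frame or several frames in one read; the prefix is what lets the reader find message boundaries without scanning the JSON.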

Message Types (Coordinator → Agent)

Type            Description
init            Initialize with config, state, workspace path
message         Deliver user message for processing
config_update   Runtime config change
steer           Mid-turn guidance
shutdown        Graceful shutdown signal
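
On the agent side, the five message types above could be routed to handlers with a small dispatcher. This is a sketch under assumptions: the envelope is taken to be a JSON object with a `type` field, and the `Dispatcher` class is hypothetical rather than part of any documented SDK.

```python
from typing import Any, Callable, Dict

# Message types the coordinator sends to the agent, as listed above.
COORDINATOR_TO_AGENT = {"init", "message", "config_update", "steer", "shutdown"}

Handler = Callable[[Dict[str, Any]], Any]


class Dispatcher:
    """Route decoded messages to handlers keyed by their "type" field."""

    def __init__(self) -> None:
        self._handlers: Dict[str, Handler] = {}

    def on(self, msg_type: str, handler: Handler) -> None:
        """Register a handler; reject types the coordinator never sends."""
        if msg_type not in COORDINATOR_TO_AGENT:
            raise ValueError(f"unknown message type: {msg_type!r}")
        self._handlers[msg_type] = handler

    def dispatch(self, message: Dict[str, Any]) -> Any:
        """Call the handler for this message, or raise KeyError if none is set."""
        msg_type = message.get("type")
        handler = self._handlers.get(msg_type)
        if handler is None:
            raise KeyError(f"no handler for message type {msg_type!r}")
        return handler(message)
```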

Message Types (Agent → Coordinator)

Type            Description
stream_update   Streaming token/content update
state_delta     State mutation delta (JSON Patch)
log             Structured log entry
tool_request    Request tool execution
complete        Message processing complete
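
A `state_delta` carries a JSON Patch, which the receiver applies to its copy of the agent state. The sketch below handles only a subset of JSON Patch (the `add`, `replace`, and `remove` operations on nested objects, with no array support); the function name `apply_delta` is hypothetical, and a production coordinator would more likely use a complete JSON Patch implementation.

```python
import copy
from typing import Any, Dict, List


def _tokens(pointer: str) -> List[str]:
    """Split a JSON Pointer into unescaped reference tokens."""
    if not pointer.startswith("/"):
        raise ValueError(f"unsupported pointer: {pointer!r}")
    # "~1" decodes to "/" first, then "~0" decodes to "~".
    return [t.replace("~1", "/").replace("~0", "~") for t in pointer[1:].split("/")]


def apply_delta(state: Dict[str, Any], patch: List[Dict[str, Any]]) -> Dict[str, Any]:
    """Return a new state with the patch applied; the input is left untouched."""
    result = copy.deepcopy(state)
    for op in patch:
        tokens = _tokens(op["path"])
        parent = result
        for token in tokens[:-1]:
            parent = parent[token]  # walk down to the containing object
        key = tokens[-1]
        if op["op"] == "add":
            parent[key] = op["value"]
        elif op["op"] == "replace":
            if key not in parent:
                raise KeyError(key)  # replace requires an existing member
            parent[key] = op["value"]
        elif op["op"] == "remove":
            del parent[key]
        else:
            raise ValueError(f"unsupported op in this sketch: {op['op']!r}")
    return result
```

Sending deltas instead of full snapshots keeps IPC traffic proportional to what changed, which matters when the state is large and mutations are small.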

Agent Configuration

Agents receive configuration at init time and can receive runtime updates:
{
  "agent_id": "agent_abc123",
  "agent_type": "claude-agent",
  "workspace": "/instances/agent_abc123/chronicle/mount",
  "ensemble_url": "http://ensemble:8080",
  "ensemble_api_key": "ens_...",
  "model": "claude-sonnet-4-20250514",
  "custom_config": { ... }
}
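
One way an agent might hold this configuration is as an immutable record that is swapped for a new one on each runtime update. This is a sketch under assumptions: the field names mirror the JSON above, while `AgentConfig`, `apply_config_update`, and the idea that an update carries a partial set of top-level fields are hypothetical.

```python
from dataclasses import dataclass, field, replace
from typing import Any, Dict


@dataclass(frozen=True)
class AgentConfig:
    agent_id: str
    agent_type: str
    workspace: str
    ensemble_url: str
    ensemble_api_key: str
    model: str
    # The source elides the contents of custom_config, so it stays opaque here.
    custom_config: Dict[str, Any] = field(default_factory=dict)

    @classmethod
    def from_dict(cls, raw: Dict[str, Any]) -> "AgentConfig":
        """Build a config from the init payload; unknown keys raise TypeError."""
        return cls(**raw)


def apply_config_update(config: AgentConfig, changes: Dict[str, Any]) -> AgentConfig:
    """Return a new config with the given top-level fields replaced."""
    return replace(config, **changes)


config = AgentConfig.from_dict({
    "agent_id": "agent_abc123",
    "agent_type": "claude-agent",
    "workspace": "/instances/agent_abc123/chronicle/mount",
    "ensemble_url": "http://ensemble:8080",
    "ensemble_api_key": "ens_...",
    "model": "claude-sonnet-4-20250514",
})
updated = apply_config_update(config, {"model": "hypothetical-model"})
```

Because the record is frozen, code that is mid-turn keeps a consistent view of the old config while later turns pick up the new one.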

Scale-to-Zero

When an agent is idle beyond the configured timeout:
  1. Coordinator calls onIdleTimeout(); the agent can return false to defer shutdown
  2. If the agent does not defer, coordinator calls onShutdown()
  3. State is checkpointed to SQLite
  4. SQLite is replicated to S3 via Litestream
  5. Chronicle FUSE filesystem is unmounted
  6. Agent process is terminated
  7. Instance directory is cleaned up (or preserved for fast restart)
On the next request, the agent is rehydrated from the state stored in S3.
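
The ordering of the scale-to-zero steps, including the deferral in step 1, can be sketched as below. Everything here is illustrative: `scale_to_zero`, the snake_case hook names, and `DeferOnceAgent` are hypothetical, and the teardown steps are passed in as plain callables standing in for checkpointing, replication, unmounting, process termination, and cleanup.

```python
from typing import Callable, List


class DeferOnceAgent:
    """Toy agent that defers the first idle timeout and accepts the second."""

    def __init__(self) -> None:
        self.timeouts_seen = 0
        self.shut_down = False

    def on_idle_timeout(self) -> bool:
        self.timeouts_seen += 1
        return self.timeouts_seen > 1  # False on the first call means "defer"

    def on_shutdown(self) -> None:
        self.shut_down = True


def scale_to_zero(agent, teardown_steps: List[Callable[[], None]]) -> bool:
    """Run the shutdown sequence; return False if the agent deferred."""
    if agent.on_idle_timeout() is False:
        return False  # agent asked to stay alive; nothing is torn down
    agent.on_shutdown()
    for step in teardown_steps:
        step()  # executed strictly in the listed order
    return True


log: List[str] = []
steps = [
    lambda: log.append("checkpoint"),
    lambda: log.append("replicate"),
    lambda: log.append("unmount"),
    lambda: log.append("terminate"),
    lambda: log.append("cleanup"),
]
agent = DeferOnceAgent()
first = scale_to_zero(agent, steps)
second = scale_to_zero(agent, steps)
```

Checkpointing and replication run before the unmount and process termination so that the persisted state is complete by the time the instance disappears.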