Every completed inference response is persisted to S3. If a client disconnects mid-stream, the provider continues to completion and the response is stored for later retrieval.

GET /api/v1/status/

Check the status of a request.

Headers

Header         Required  Description
Authorization  Yes       Same API key that created the request

Response

{
  "request_id": "req_abc123",
  "status": "completed",
  "created_at": "2025-01-15T10:30:00Z",
  "completed_at": "2025-01-15T10:30:05Z",
  "model": "claude-sonnet-4-20250514",
  "provider": "anthropic"
}
Status values: pending, streaming, completed, failed
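
Since only completed and failed are final states, a recovering client will typically poll the status endpoint until one of them appears. The following is a minimal sketch of such a loop, not part of any official client. The fetch_status callable, the polling interval, and the timeout are assumptions made for illustration; fetch_status stands in for whatever code performs the status GET and returns the decoded JSON body.

```python
import time
from typing import Callable, Dict

# Per the status list above, only these two values are final.
TERMINAL_STATUSES = {"completed", "failed"}


def wait_for_terminal_status(
    fetch_status: Callable[[str], Dict[str, str]],
    request_id: str,
    interval_s: float = 1.0,   # assumed polling interval
    timeout_s: float = 60.0,   # assumed overall deadline
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, str]:
    """Poll until the request reports "completed" or "failed".

    fetch_status is a hypothetical callable that performs the status
    request and returns the decoded JSON body as a dict.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        body = fetch_status(request_id)
        if body["status"] in TERMINAL_STATUSES:
            return body
        if time.monotonic() >= deadline:
            raise TimeoutError(
                f"{request_id} still {body['status']!r} after {timeout_s}s"
            )
        sleep(interval_s)
```

Passing the fetcher and the sleep function as arguments keeps the loop independent of any HTTP library and makes it easy to exercise with canned responses.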

GET /api/v1/retrieve/

Retrieve a completed response.

Headers

Header         Required  Description
Authorization  Yes       Same API key that created the request

Response

Returns the full InferenceResponse — identical to what would have been received via streaming.
{
  "id": "req_abc123",
  "model": "claude-sonnet-4-20250514",
  "provider": "anthropic",
  "blocks": [
    {"type": "text", "text": "The complete response..."}
  ],
  "input_tokens": 1250,
  "output_tokens": 3400,
  "cost": "0.055",
  "finish_reason": "end_turn"
}
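
As an illustration of consuming this payload, the sketch below decodes the JSON into a typed object. The dataclass mirrors the fields in the example response; the parse_inference_response helper and the text property are hypothetical names introduced here, and parsing cost with Decimal is an assumption based on the field being sent as a string.

```python
import json
from dataclasses import dataclass
from decimal import Decimal
from typing import List


@dataclass
class InferenceResponse:
    """Mirrors the fields of the example payload above."""
    id: str
    model: str
    provider: str
    blocks: List[dict]
    input_tokens: int
    output_tokens: int
    cost: Decimal
    finish_reason: str

    @property
    def text(self) -> str:
        """Concatenate the text of every block whose type is "text"."""
        return "".join(b["text"] for b in self.blocks if b.get("type") == "text")


def parse_inference_response(raw: str) -> InferenceResponse:
    """Decode a retrieve response body (hypothetical helper)."""
    data = json.loads(raw)
    return InferenceResponse(
        id=data["id"],
        model=data["model"],
        provider=data["provider"],
        blocks=data["blocks"],
        input_tokens=data["input_tokens"],
        output_tokens=data["output_tokens"],
        # cost arrives as a string; Decimal avoids float rounding.
        cost=Decimal(data["cost"]),
        finish_reason=data["finish_reason"],
    )
```

Keeping cost as a Decimal means that summing costs across many requests does not accumulate binary floating-point error.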

Security

  • Authenticated retrieval: Only the same API key that created the request can retrieve the response
  • No lost inference: Provider API calls are charged when the request is created, so the response is persisted even if the client disconnects
  • S3 storage is organized by date and request ID for efficient retrieval
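
To make the date-and-request-ID organization concrete, here is one possible key scheme. The actual layout is not specified in this document, so the responses prefix, the year/month/day path segments, and the .json suffix are all assumptions chosen purely for illustration.

```python
from datetime import datetime, timezone


def response_key(created_at: str, request_id: str, prefix: str = "responses") -> str:
    """Build a hypothetical object key of the form prefix/YYYY/MM/DD/request_id.json.

    created_at is an ISO 8601 timestamp such as the created_at field
    returned by the status endpoint.
    """
    # Replace a trailing "Z" so fromisoformat also works before Python 3.11.
    ts = datetime.fromisoformat(created_at.replace("Z", "+00:00"))
    # Normalize to UTC so the date segment does not depend on the offset.
    ts = ts.astimezone(timezone.utc)
    return f"{prefix}/{ts:%Y/%m/%d}/{request_id}.json"
```

With a layout like this, a lookup by request ID needs only the creation date to compute the key directly, and a date prefix can be listed or expired as a unit.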

Client Library Support

All client libraries provide recovery methods:

// TypeScript
const status = await client.getStatus('req_abc123');
const response = await client.retrieve('req_abc123');

# Python
status = await client.get_status('req_abc123')
response = await client.retrieve('req_abc123')

// Go
status, _ := client.GetStatus(ctx, "req_abc123")
response, _ := client.Retrieve(ctx, "req_abc123")
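
The two calls can be combined into a single recovery step that branches on the status value. The sketch below assumes the Python client's get_status result is dict-like and mirrors the status JSON (the real return type may differ); recover, recover_blocking, and InferenceFailed are hypothetical names, not part of the client library.

```python
import asyncio
from typing import Any, Optional


class InferenceFailed(Exception):
    """Raised when the status endpoint reports "failed" (hypothetical name)."""


async def recover(client: Any, request_id: str) -> Optional[Any]:
    """Return the stored response if it is ready, or None if still in progress.

    Assumes client.get_status returns a dict-like object with a "status"
    key holding one of: pending, streaming, completed, failed.
    """
    status = await client.get_status(request_id)
    state = status["status"]
    if state == "completed":
        return await client.retrieve(request_id)
    if state == "failed":
        raise InferenceFailed(request_id)
    # pending or streaming: nothing to retrieve yet, the caller may retry later.
    return None


def recover_blocking(client: Any, request_id: str) -> Optional[Any]:
    """Synchronous convenience wrapper for callers without an event loop."""
    return asyncio.run(recover(client, request_id))
```

Returning None for the in-progress states leaves the retry policy to the caller, who can call recover again after a delay.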