OpenTelemetry Tracing
Ensemble provides production-grade distributed tracing.

Span Hierarchy
Span Attributes
Each span includes:

| Attribute | Description |
|---|---|
| `ensemble.model` | Requested model name |
| `ensemble.provider` | Selected provider |
| `ensemble.endpoint` | Selected endpoint ID |
| `ensemble.session_id` | Session ID (for affinity tracking) |
| `ensemble.cost` | Request cost (decimal) |
| `ensemble.input_tokens` | Input token count |
| `ensemble.output_tokens` | Output token count |
| `ensemble.cache_hit` | Whether cache was used |
| `ensemble.routing.reason` | Routing decision reason |
| `ensemble.error.class` | Error classification (`rate_limit`, `permanent`, `retryable`) |
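
To make the attribute set concrete, here is a minimal sketch that assembles the attributes for one request as a plain dictionary. The function name, its parameters, and the choice to omit `ensemble.error.class` on successful requests are assumptions made for illustration, not Ensemble's actual code. The cost is passed as a float here because OpenTelemetry attribute values are limited to primitive types; how Ensemble itself encodes the decimal is not stated in this document.

```python
from typing import Dict, Optional, Union

AttrValue = Union[str, bool, int, float]

# Error classes listed in the attribute table.
ERROR_CLASSES = ("rate_limit", "permanent", "retryable")


def build_span_attributes(
    model: str,
    provider: str,
    endpoint_id: str,
    session_id: str,
    cost: float,
    input_tokens: int,
    output_tokens: int,
    cache_hit: bool,
    routing_reason: str,
    error_class: Optional[str] = None,
) -> Dict[str, AttrValue]:
    """Hypothetical helper: map one request's data onto the span attribute keys."""
    if error_class is not None and error_class not in ERROR_CLASSES:
        raise ValueError(f"unknown error class: {error_class!r}")

    attrs: Dict[str, AttrValue] = {
        "ensemble.model": model,
        "ensemble.provider": provider,
        "ensemble.endpoint": endpoint_id,
        "ensemble.session_id": session_id,
        "ensemble.cost": cost,
        "ensemble.input_tokens": input_tokens,
        "ensemble.output_tokens": output_tokens,
        "ensemble.cache_hit": cache_hit,
        "ensemble.routing.reason": routing_reason,
    }
    # Assumption: the error attribute is only set when the request failed.
    if error_class is not None:
        attrs["ensemble.error.class"] = error_class
    return attrs
```

With an OpenTelemetry SDK, a dictionary like this could be applied to a span in one call through the span's `set_attributes` method.
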
W3C Trace Context
Ensemble propagates W3C `traceparent` and `tracestate` headers from clients (e.g., Langfuse agents) through to provider calls, enabling end-to-end trace correlation.
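
The propagation step can be sketched with the standard library alone. A version-00 `traceparent` value has four dash-separated hex fields: version, a 32-character trace ID, a 16-character parent span ID, and 2-character flags. The sketch below keeps the caller's trace ID, mints a new span ID for the outbound provider call, and forwards `tracestate` unchanged. The function names are hypothetical, only version `00` is accepted for simplicity, and starting a fresh sampled trace when the inbound header is missing or invalid is an assumption rather than documented Ensemble behavior.

```python
import re
import secrets
from typing import Dict, NamedTuple, Optional

# Version-00 layout: 00-<32 hex trace id>-<16 hex parent id>-<2 hex flags>
_TRACEPARENT_RE = re.compile(r"^00-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$")


class TraceParent(NamedTuple):
    trace_id: str
    parent_id: str
    flags: str


def parse_traceparent(header: str) -> Optional[TraceParent]:
    """Return the parsed header, or None if it is malformed."""
    match = _TRACEPARENT_RE.match(header.strip())
    if match is None:
        return None
    parsed = TraceParent(*match.groups())
    # All-zero IDs are invalid per the W3C Trace Context specification.
    if parsed.trace_id == "0" * 32 or parsed.parent_id == "0" * 16:
        return None
    return parsed


def outbound_trace_headers(inbound: Dict[str, str]) -> Dict[str, str]:
    """Hypothetical helper: build trace headers for the provider call."""
    headers = {name.lower(): value for name, value in inbound.items()}
    parsed = parse_traceparent(headers.get("traceparent", ""))
    span_id = secrets.token_hex(8)  # new 16-hex-character span ID for this hop

    if parsed is None:
        # Assumption: start a new sampled trace and drop any tracestate.
        return {"traceparent": f"00-{secrets.token_hex(16)}-{span_id}-01"}

    out = {"traceparent": f"00-{parsed.trace_id}-{span_id}-{parsed.flags}"}
    if "tracestate" in headers:
        out["tracestate"] = headers["tracestate"]
    return out
```

Because the trace ID is preserved across the hop, spans recorded by the client, by Ensemble, and by the provider call all join the same trace.
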
Metrics
| Metric | Type | Labels |
|---|---|---|
| `ensemble.requests.total` | Counter | `status`, `provider`, `model`, `error_class` |
| `ensemble.request.duration` | Histogram | `provider`, `model` |
| `ensemble.tokens.input` | Counter | `provider`, `model` |
| `ensemble.tokens.output` | Counter | `provider`, `model` |
| `ensemble.cost.total` | Counter | `provider`, `model` |
| `ensemble.cache.hit_rate` | Gauge | `provider` |
| `ensemble.ratelimit.utilization` | Gauge | `endpoint` |
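
The labels column determines how each metric is partitioned: every distinct combination of label values is a separate time series. The following in-memory sketch illustrates that for the four counters in the table. `LabeledCounter` and `record_request` are hypothetical names, the `status` values are placeholders, and using an empty string for `error_class` on successful requests is an assumption; a real deployment would use an OpenTelemetry meter rather than this class.

```python
from collections import defaultdict
from typing import Dict, Tuple


class LabeledCounter:
    """Monotonic counter holding one value per label combination."""

    def __init__(self, name: str, label_names: Tuple[str, ...]) -> None:
        self.name = name
        self.label_names = label_names
        self._series: Dict[Tuple[str, ...], float] = defaultdict(float)

    def _key(self, labels: Dict[str, str]) -> Tuple[str, ...]:
        if set(labels) != set(self.label_names):
            raise ValueError(f"{self.name} expects labels {self.label_names}")
        return tuple(labels[name] for name in self.label_names)

    def add(self, amount: float, **labels: str) -> None:
        if amount < 0:
            raise ValueError("counters only increase")
        self._series[self._key(labels)] += amount

    def get(self, **labels: str) -> float:
        return self._series.get(self._key(labels), 0.0)


# Counter names and label sets copied from the metrics table.
requests_total = LabeledCounter(
    "ensemble.requests.total", ("status", "provider", "model", "error_class")
)
tokens_input = LabeledCounter("ensemble.tokens.input", ("provider", "model"))
tokens_output = LabeledCounter("ensemble.tokens.output", ("provider", "model"))
cost_total = LabeledCounter("ensemble.cost.total", ("provider", "model"))


def record_request(
    status: str,
    provider: str,
    model: str,
    input_tokens: int,
    output_tokens: int,
    cost: float,
    error_class: str = "",  # assumption: empty string when there is no error
) -> None:
    """Hypothetical hook called once per completed request."""
    requests_total.add(
        1, status=status, provider=provider, model=model, error_class=error_class
    )
    tokens_input.add(input_tokens, provider=provider, model=model)
    tokens_output.add(output_tokens, provider=provider, model=model)
    cost_total.add(cost, provider=provider, model=model)
```

The histogram and gauge metrics are omitted from this sketch because they need bucket boundaries and sampling logic that this document does not specify.
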
Logging
Ensemble uses an asynchronous, batched logger with:

- 65,536-message ring buffer (non-blocking)
- Structured JSON output
- Request ID correlation on every log line
- Configurable log levels
- SQLite-backed log persistence for admin API queries
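
The general shape of such a logger can be sketched as follows. This is an illustration under stated assumptions, not Ensemble's implementation: `AsyncBatchLogger` and its parameters are hypothetical, a bounded `queue.Queue` stands in for the ring buffer, new messages are dropped when the buffer is full (the actual overflow policy is not documented here), and the sink is any callable that receives a batch of JSON lines. Log-level filtering and SQLite persistence are left out.

```python
import json
import queue
import threading
import time
from typing import Callable, List


class AsyncBatchLogger:
    """Callers enqueue without waiting; a worker thread writes in batches."""

    def __init__(
        self,
        sink: Callable[[List[str]], None],
        capacity: int = 65_536,
        batch_size: int = 256,
    ) -> None:
        self._queue: "queue.Queue[str]" = queue.Queue(maxsize=capacity)
        self._sink = sink
        self._batch_size = batch_size
        self.dropped = 0  # approximate count of messages lost to overflow
        self._stop = threading.Event()
        self._worker = threading.Thread(target=self._run, daemon=True)
        self._worker.start()

    def log(self, level: str, message: str, request_id: str, **fields: object) -> None:
        """Serialize one structured record and enqueue it without waiting."""
        record = {
            "ts": time.time(),
            "level": level,
            "request_id": request_id,  # correlation ID on every line
            "message": message,
            **fields,
        }
        try:
            self._queue.put_nowait(json.dumps(record))
        except queue.Full:
            # Assumption: drop the new message rather than stall the request.
            self.dropped += 1

    def _drain(self) -> List[str]:
        """Collect up to batch_size queued lines, waiting briefly for the first."""
        batch: List[str] = []
        try:
            batch.append(self._queue.get(timeout=0.05))
            while len(batch) < self._batch_size:
                batch.append(self._queue.get_nowait())
        except queue.Empty:
            pass
        return batch

    def _run(self) -> None:
        # Keep draining until close() was called and the queue is empty.
        while not (self._stop.is_set() and self._queue.empty()):
            batch = self._drain()
            if batch:
                self._sink(batch)

    def close(self) -> None:
        """Flush remaining messages and stop the worker thread."""
        self._stop.set()
        self._worker.join()
```

Keeping serialization on the caller's side and I/O on the worker thread means a slow sink delays log delivery but not request handling.
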