Metrics
Outcome scores attached to traces. Use them to capture both synthetic evaluations and observed user signals.
Shape of a metric
A metric has a name, a kind (boolean, numeric, or enum), a value, and an associated trace ID. Optionally, it can also carry a per-step ID for finer-grained scoring.
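To make the shape concrete, here is a minimal sketch of a metric as a Python dataclass. It is an illustration only, not obsrv's actual schema: the `Metric` class, its field names, the idea that enum values are strings, and the `tr_example` / `step_3` IDs are all assumptions made for this example.

```python
from dataclasses import dataclass
from typing import Optional, Union

# Hypothetical value type: booleans, numbers, or enum labels as strings.
MetricValue = Union[bool, int, float, str]


@dataclass(frozen=True)
class Metric:
    """Illustrative model of a metric; not the real obsrv schema."""

    name: str
    kind: str  # one of "boolean", "numeric", "enum"
    value: MetricValue
    trace_id: str
    step_id: Optional[str] = None  # optional per-step ID for finer-grained scoring

    def __post_init__(self) -> None:
        # Check that the value's type agrees with the declared kind.
        if self.kind == "boolean":
            ok = isinstance(self.value, bool)
        elif self.kind == "numeric":
            # bool is a subclass of int in Python, so exclude it explicitly.
            ok = isinstance(self.value, (int, float)) and not isinstance(self.value, bool)
        elif self.kind == "enum":
            ok = isinstance(self.value, str)  # assumption: enum values are string labels
        else:
            raise ValueError(f"unknown metric kind: {self.kind!r}")
        if not ok:
            raise TypeError(f"value {self.value!r} does not match kind {self.kind!r}")


# Placeholder IDs, invented for the example.
adherence = Metric("task_adherence", "boolean", True, "tr_example")
latency = Metric("response_latency_ms", "numeric", 1840, "tr_example", step_id="step_3")
```

Validating the value against the kind at construction time is a design choice of this sketch; it catches mistakes such as passing `True` for a numeric metric before anything is stored.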
Recording metrics
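# Boolean metric: pass/fail result for the trace.
client.record_metric(
"task_adherence",
"tr_…",
passed=True,
)
# Numeric metric: a measured value, here latency in milliseconds.
client.record_metric(
"response_latency_ms",
"tr_…",
score=1840,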
)
Synthetic vs observed
Synthetic metrics come from automated evals (LLM judges, deterministic rules, rubrics). They run on a schedule or inline.
Observed metrics come from real user signals, such as a thumbs-down, a refund, an escalation, a reopened conversation, or a purchase.
obsrv treats both kinds the same way for storage, rollup, and monitoring.
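As a rough illustration of the distinction, the sketch below produces one synthetic metric from a deterministic rule and one observed metric from a user signal. Both come out as records of the same shape, which is what lets them be stored and rolled up together. The function names, metric names, dictionary keys, and the `tr_example` ID are hypothetical and not part of the obsrv API.

```python
from typing import Any, Dict


def synthetic_non_empty_response(trace_id: str, response_text: str) -> Dict[str, Any]:
    """Synthetic metric: a deterministic rule that passes if the response is not blank."""
    return {
        "name": "non_empty_response",  # hypothetical metric name
        "kind": "boolean",
        "value": bool(response_text.strip()),
        "trace_id": trace_id,
    }


def observed_thumbs_up(trace_id: str, thumbs_up: bool) -> Dict[str, Any]:
    """Observed metric: a real user signal mapped onto the same record shape."""
    return {
        "name": "user_thumbs_up",  # hypothetical metric name
        "kind": "boolean",
        "value": thumbs_up,
        "trace_id": trace_id,
    }


# "tr_example" is a placeholder trace ID invented for this example.
synthetic = synthetic_non_empty_response("tr_example", "Your refund has been issued.")
observed = observed_thumbs_up("tr_example", False)
```

In a real integration, each record would presumably be sent with a call like the `client.record_metric(...)` examples in the previous section rather than built as a dictionary.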
Pass rates & rollups
Boolean metrics roll up into pass rates per release, per cluster, or per metadata facet. Numeric metrics roll up to p50/p95/p99. The dashboard renders both natively.
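The rollups described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not how obsrv computes them: the function names are hypothetical, and the percentile uses the nearest-rank method, which may differ from the interpolation the dashboard applies.

```python
import math
from collections import defaultdict
from typing import Dict, Iterable, List, Sequence, Tuple


def pass_rate(values: Sequence[bool]) -> float:
    """Fraction of boolean metric values that passed."""
    if not values:
        raise ValueError("cannot compute a pass rate over zero values")
    return sum(1 for v in values if v) / len(values)


def percentile(values: Sequence[float], p: int) -> float:
    """Nearest-rank percentile for an integer p in (0, 100]. Assumed method."""
    if not values:
        raise ValueError("cannot compute a percentile over zero values")
    if not 0 < p <= 100:
        raise ValueError("p must be in (0, 100]")
    ordered = sorted(values)
    rank = math.ceil(p * len(ordered) / 100)  # 1-based rank
    return ordered[rank - 1]


def pass_rate_by_facet(records: Iterable[Tuple[str, bool]]) -> Dict[str, float]:
    """Group (facet, passed) pairs by facet, e.g. a release, and roll each group up."""
    groups: Dict[str, List[bool]] = defaultdict(list)
    for facet, passed in records:
        groups[facet].append(passed)
    return {facet: pass_rate(vals) for facet, vals in groups.items()}


# Made-up sample data for the example.
latencies_ms = [1840, 920, 1310, 2450]
rates = pass_rate_by_facet([("v1", True), ("v1", False), ("v2", True)])
p50 = percentile(latencies_ms, 50)
p95 = percentile(latencies_ms, 95)
```

The facet key here is a release label, but the same grouping applies to a cluster or any metadata facet.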