Clusters

obsrv continuously groups traces by behavioral similarity. Clusters surface emerging failure modes without requiring you to define a category list up front.

How clusters are formed

  1. An embedding worker extracts a structured text representation from each trace.
  2. The text is embedded as a 1536-dimensional vector and stored alongside the trace ID.
  3. k-means with cosine distance runs over a moving window of traces for each project.
  4. Cluster names are generated from representative traces using Claude.
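
To make step 3 concrete, here is a minimal sketch of k-means with cosine distance (often called spherical k-means). It is an illustration, not obsrv's implementation: the function names, the iteration cap, and the fixed seed are all assumptions, and the toy vectors in any real use would be the 1536-dimensional embeddings from step 2.

```python
import math
import random


def normalize(v):
    """Scale a vector to unit length so a dot product equals cosine similarity."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cosine_kmeans(vectors, k, iterations=50, seed=0):
    """Hypothetical helper: cluster vectors by cosine similarity.

    Returns (labels, centroids). The seed is fixed so repeated runs on
    the same input give the same result.
    """
    rng = random.Random(seed)
    points = [normalize(v) for v in vectors]
    centroids = rng.sample(points, k)  # start from k distinct input points
    labels = None
    for _ in range(iterations):
        # Assignment step: pick the centroid with the highest cosine similarity.
        new_labels = [
            max(range(k), key=lambda c: dot(p, centroids[c])) for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing, so the run has converged
        labels = new_labels
        # Update step: average each cluster's members, then re-normalize.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                mean = [sum(col) / len(members) for col in zip(*members)]
                centroids[c] = normalize(mean)
    return labels, centroids
```

Normalizing every vector up front means the assignment step only needs dot products, which is why cosine distance adds little cost over the Euclidean variant.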

We chose k-means because it is fast, operable at trace volumes, and deterministic for a fixed initialization. For each project, we choose the value of k with the best silhouette score, restricted to the range 4 to 32.
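
The selection of k can be sketched as follows. This is an assumption-laden illustration rather than obsrv's code: `silhouette` and `pick_k` are hypothetical names, the candidate labelings are assumed to come from an earlier clustering run for each k, and singleton clusters are scored as 0 by the usual convention.

```python
import math


def cosine_distance(a, b):
    d = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return 1.0 - d / (na * nb)


def silhouette(points, labels):
    """Mean silhouette coefficient under cosine distance (needs >= 2 clusters)."""
    clusters = {}
    for i, label in enumerate(labels):
        clusters.setdefault(label, []).append(i)
    if len(clusters) < 2:
        raise ValueError("silhouette needs at least two clusters")

    def mean_dist(i, members):
        return sum(cosine_distance(points[i], points[j]) for j in members) / len(members)

    total = 0.0
    for i, label in enumerate(labels):
        own = [j for j in clusters[label] if j != i]
        if not own:
            continue  # a singleton cluster contributes 0
        a = mean_dist(i, own)  # cohesion: distance to the point's own cluster
        b = min(  # separation: distance to the nearest other cluster
            mean_dist(i, members)
            for other, members in clusters.items()
            if other != label
        )
        if max(a, b) > 0:
            total += (b - a) / max(a, b)
    return total / len(points)


def pick_k(points, labelings, k_min=4, k_max=32):
    """Return the k in [k_min, k_max] whose labeling scores highest.

    labelings maps each candidate k to the list of labels produced for it.
    """
    candidates = {k: v for k, v in labelings.items() if k_min <= k <= k_max}
    return max(candidates, key=lambda k: silhouette(points, candidates[k]))
```

A score near 1 means clusters are tight and well separated, so a labeling that splits one natural group in two scores lower than one that keeps it whole.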

What you can do with a cluster

  • Drill from a cluster to its underlying traces and replay them.
  • Filter the trace feed by cluster ID.
  • Set a monitor on cluster volume to alert when a failure mode grows.
  • Export cluster membership via the API.
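
As an illustration of what a cluster-volume monitor might check, the sketch below flags a cluster whose latest count is a multiple of its recent baseline. It does not describe how obsrv monitors are evaluated: the function name, the threshold, and the baseline length are hypothetical, and the input is assumed to be a list of per-window trace counts for one cluster, oldest first.

```python
def volume_grew(counts, threshold=2.0, baseline_windows=3):
    """Hypothetical check: did the latest window outgrow the recent baseline?

    Returns True when the newest count is at least `threshold` times the
    mean of the `baseline_windows` counts immediately before it.
    """
    if len(counts) < baseline_windows + 1:
        return False  # not enough history to form a baseline
    baseline = counts[-(baseline_windows + 1):-1]
    mean = sum(baseline) / len(baseline)
    if mean == 0:
        return counts[-1] > 0  # a cluster appearing from nothing counts as growth
    return counts[-1] >= threshold * mean
```

Comparing against a short rolling baseline, rather than a fixed count, lets the same rule work for both small and large clusters.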

Limits

Clusters are scoped to a single project, so they never mix tenants. Cluster discovery is available on every plan; the window size depends on the tier. Enterprise customers can pin a custom embedding model.