/docs/evals/release-tagging

Release tagging

Tag every trace with the release, prompt version, and feature flag context that produced it. obsrv uses these to roll up pass rates and detect regressions per release.

Tagging

t.set_metadata(
    release=os.environ.get("RELEASE", "dev"),
    prompt_version="checkout-prompt-v7",
    feature_flags={ "new_planner": True },
)

Comparing releases

Open the metric view, group by release, and obsrv will chart pass rates across the last N releases. Click a delta to see the underlying traces that changed.

Regression detection

If a metric's pass rate drops by more than a configurable threshold between two releases, obsrv opens a regression candidate with a sample of the traces involved. You can resolve, suppress, or promote it to a tracked incident.