Failure Clustering & Triage

When a build goes red with dozens of failures, they often share a handful of root causes — one NullPointerException, one timeout, one OOM. UniTrack groups similar failures into clusters and lets you triage them with rules so the noise collapses into a few actionable buckets.

1. Clustering

After ingest, recent failures are compared by their error signature (exception type and message) and grouped. The Clusters dashboard page shows each group with its representative error and the tests that hit it, so you can tell "20 tests failed" from "20 tests hit the same bug".
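
The grouping idea can be sketched in a few lines of Python. This is an illustration only, not UniTrack's actual algorithm: the Failure type, the signature() normalisation (masking numbers and hex addresses) and the exact-match grouping are all assumptions made for the example.

```python
import re
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Failure:
    """Hypothetical failure record; field names are illustrative."""
    test: str            # test name
    exception_type: str  # e.g. "java.lang.NullPointerException"
    message: str         # error message


def signature(failure: Failure) -> tuple[str, str]:
    """Build a grouping key from the exception type and a normalised message."""
    # Assumption: hex addresses and numbers vary between runs, so mask them.
    msg = re.sub(r"0x[0-9a-fA-F]+", "<hex>", failure.message)
    msg = re.sub(r"\d+", "<n>", msg)
    return failure.exception_type, msg.strip().lower()


def cluster(failures: list[Failure]) -> dict[tuple[str, str], list[str]]:
    """Group test names by error signature."""
    groups: dict[tuple[str, str], list[str]] = defaultdict(list)
    for f in failures:
        groups[signature(f)].append(f.test)
    return dict(groups)


failures = [
    Failure("CartTest.addItem", "java.lang.NullPointerException", "user is null at line 42"),
    Failure("CartTest.removeItem", "java.lang.NullPointerException", "user is null at line 57"),
    Failure("SearchTest.query", "java.util.concurrent.TimeoutException", "timed out after 3000 ms"),
]
clusters = cluster(failures)
for (exc, msg), tests in clusters.items():
    print(f"{exc}: {msg} -> {len(tests)} test(s)")
```

Here three failures collapse into two clusters, because the two NullPointerException messages differ only in a line number.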

2. Triage rules

A TriageRule matches failures by pattern and assigns a category, turning recurring, recognisable failures into labelled, owned buckets automatically. Rules are managed on the Triage dashboard page.

Typical uses:

  • tag any OutOfMemoryError as an infrastructure failure;

  • route timeouts in a known-slow suite to a flaky/perf category;

  • label assertion failures in a given module as a product bug.

As new runs arrive, their failures are matched against the rules and categorised on the way in, so the Clusters and run views stay organised without manual sorting.
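
As a rough sketch of how pattern-based categorisation like this could work, the Python below models the three typical uses above. The field names (error_pattern, test_pattern, category), the use of regular expressions, and the "first matching rule wins" order are assumptions for illustration; the real TriageRule schema may differ.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class TriageRule:
    """Hypothetical rule shape, not the actual UniTrack model."""
    error_pattern: str        # regex searched in "<exception type>: <message>"
    category: str             # label assigned when the rule matches
    test_pattern: str = ".*"  # optional regex limiting the rule to some tests


RULES = [
    TriageRule(r"OutOfMemoryError", "infrastructure"),
    TriageRule(r"[Tt]imeout|timed out", "flaky/perf", test_pattern=r"^slow\.suite\."),
    TriageRule(r"AssertionError", "product bug", test_pattern=r"^billing\."),
]


def categorise(test: str, exception_type: str, message: str,
               rules: list[TriageRule] = RULES) -> Optional[str]:
    """Return the category of the first matching rule, or None if none match."""
    text = f"{exception_type}: {message}"
    for rule in rules:
        if re.search(rule.test_pattern, test) and re.search(rule.error_pattern, text):
            return rule.category
    return None


print(categorise("slow.suite.ImportTest", "TimeoutException", "timed out after 30s"))
print(categorise("fast.UnitTest", "TimeoutException", "timed out"))
```

A timeout in the hypothetical slow.suite package is labelled flaky/perf, while the same timeout elsewhere stays uncategorised and remains visible for manual triage.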

3. Test ownership

Per-project owner rules (managed on the Owners page, /projects/{id}/owners) map a test class-name pattern (a regex) to an owner, which can be a team or a handle. Rules are checked in ascending priority order, so a lower priority value wins, and the first match assigns the owner. Each failing test on a run page shows its owner, so "whose failure is this?" is answered at a glance. (Routing owner-scoped notifications to chat is a separate, planned piece.)
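
The resolution order can be illustrated with a short Python sketch. It assumes that "lower priority" means a smaller number and that the regex is searched anywhere in the class name; OwnerRule, resolve_owner and the example patterns are hypothetical names, not part of UniTrack.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class OwnerRule:
    """Hypothetical owner rule; field names are illustrative."""
    pattern: str   # regex applied to the test class name
    owner: str     # team or handle
    priority: int  # assumption: smaller value is evaluated first


def resolve_owner(class_name: str, rules: list[OwnerRule]) -> Optional[str]:
    """Check rules in ascending priority order; the first match assigns the owner."""
    for rule in sorted(rules, key=lambda r: r.priority):
        if re.search(rule.pattern, class_name):
            return rule.owner
    return None


rules = [
    OwnerRule(r"^com\.shop\.", "@platform", priority=100),             # broad fallback
    OwnerRule(r"^com\.shop\.checkout\.", "team-payments", priority=10),  # specific rule
]

print(resolve_owner("com.shop.checkout.CartTest", rules))
print(resolve_owner("com.shop.search.QueryTest", rules))
```

Giving the specific rule the lower priority value lets it take precedence over the broad fallback, regardless of the order in which the rules were created.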

4. AI root-cause

For a cluster spanning more than one test, an Analyze with AI button asks an LLM for a likely root cause and fix direction. It is bring-your-own-key and off by default — see AI Root-Cause Analysis.

5. Working the queue

  1. Open Clusters after a red build to see the distinct root causes.

  2. For a cause you recognise, add a Triage rule so future occurrences self-categorise.

  3. Cross-reference flaky tests — a cluster that only appears intermittently on a fixed commit is likely flakiness, not a real regression.
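
The flakiness heuristic in the last step can be expressed as a small check. This is a sketch under stated assumptions: run history is given as (commit, set of cluster signatures) pairs, and a cluster counts as intermittent when it shows up in some but not all runs of the same commit. The function and data shapes are hypothetical.

```python
from collections import defaultdict


def intermittent_on_fixed_commit(signature: str, runs: list[tuple[str, set[str]]]) -> bool:
    """Return True if the signature appears in some, but not all, runs of any one commit."""
    # commit -> [runs containing the signature, total runs]
    seen: dict[str, list[int]] = defaultdict(lambda: [0, 0])
    for commit, signatures in runs:
        seen[commit][1] += 1
        if signature in signatures:
            seen[commit][0] += 1
    return any(0 < hits < total for hits, total in seen.values())


runs = [
    ("abc123", {"TimeoutException", "NullPointerException"}),
    ("abc123", {"NullPointerException"}),
    ("abc123", {"NullPointerException", "TimeoutException"}),
]

print(intermittent_on_fixed_commit("TimeoutException", runs))
print(intermittent_on_fixed_commit("NullPointerException", runs))
```

In this example the timeout cluster comes and goes on an unchanged commit, which points to flakiness, while the NullPointerException cluster fails on every run and looks like a real regression.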