One source. Every public metric reads from a single canonical registry (apps/web/src/data/canonical-metrics.ts, v2.0.0). The same registry feeds this page, the live /evals dashboard, and the public JSON endpoints — so the prose and the data cannot disagree.
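As a rough illustration of the single-registry pattern, the sketch below shows two renderers reading the same entry. The `Metric` shape, the `example_metric` entry and its value, and the `renderProse` and `renderJson` helpers are all hypothetical; the real canonical-metrics.ts is private and may look quite different.

```typescript
// Hypothetical registry shape; not the actual contents of canonical-metrics.ts.
interface Metric {
  id: string;
  label: string;
  value: number;
  unit: string;
}

const REGISTRY_VERSION = "2.0.0";

// Placeholder entry for illustration only, not a real STEADYWRK figure.
const registry: readonly Metric[] = [
  { id: "example_metric", label: "Example metric", value: 42, unit: "%" },
];

// Every surface looks a metric up by id instead of hard-coding its value.
function getMetric(id: string): Metric {
  const metric = registry.find((m) => m.id === id);
  if (!metric) {
    throw new Error(`Unknown metric: ${id}`);
  }
  return metric;
}

// Prose surface: formats the registry value into a sentence fragment.
function renderProse(id: string): string {
  const m = getMetric(id);
  return `${m.label}: ${m.value}${m.unit}`;
}

// JSON surface: serializes the same value, tagged with the registry version.
function renderJson(id: string): string {
  const m = getMetric(id);
  return JSON.stringify({
    id: m.id,
    value: m.value,
    unit: m.unit,
    version: REGISTRY_VERSION,
  });
}
```

Because both renderers go through `getMetric`, changing the value in the registry changes every surface at once, and a misspelled id fails loudly instead of rendering a stale number.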
Consistency, not certification. What is verifiable here is that every number traces to one source and that the build breaks on drift. The eval figures are self-reported and estimated from STEADYWRK’s production telemetry; they are not an independently audited third-party benchmark. We publish the discipline honestly; we do not claim an auditor signed off on the values.
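One way a build could break on drift is to compare the values a surface displays against the registry and throw on any mismatch. The following is a minimal sketch under that assumption; `findDrift`, `assertNoDrift`, and the id-to-value maps are hypothetical names, and the actual check in the private repository may work differently.

```typescript
// Hypothetical: each surface is reduced to a map of metric id -> displayed value.
type MetricValues = Record<string, number>;

function hasKey(obj: MetricValues, key: string): boolean {
  return Object.prototype.hasOwnProperty.call(obj, key);
}

// Returns one message per metric whose displayed value is unknown to,
// or different from, the registry. An empty array means no drift.
function findDrift(registry: MetricValues, surface: MetricValues): string[] {
  const problems: string[] = [];
  for (const [id, shown] of Object.entries(surface)) {
    if (!hasKey(registry, id)) {
      problems.push(`${id}: not in registry`);
    } else if (registry[id] !== shown) {
      problems.push(`${id}: surface shows ${shown}, registry has ${registry[id]}`);
    }
  }
  return problems;
}

// Intended to run as a build step: throwing here fails the build.
function assertNoDrift(
  registry: MetricValues,
  surface: MetricValues,
  surfaceName: string,
): void {
  const problems = findDrift(registry, surface);
  if (problems.length > 0) {
    throw new Error(`Drift on ${surfaceName}:\n${problems.join("\n")}`);
  }
}
```

A check like this establishes only that the surfaces agree with the registry. It says nothing about whether the registry values themselves are accurate, which is the distinction between consistency and certification.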
Dated and versioned. This page is current as of June 2, 2026. When a metric’s definition changes, the registry version (v2.0.0) changes with it, so a citation can pin exactly what it cited.
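To make the pinning idea concrete, a citation could record the metric id, the registry version, and the retrieval date, then be compared against the current version later. This is only a sketch: the `Citation` type, the `formatCitation` and `isCurrent` helpers, and the `example_metric` id are hypothetical and not part of the published registry.

```typescript
// Hypothetical citation record; field names are illustrative.
interface Citation {
  metricId: string;
  registryVersion: string; // version of the registry at the time of citing
  retrieved: string; // ISO date the figure was read
}

// Formats a citation so a reader can see exactly which definition was cited.
function formatCitation(c: Citation): string {
  return `${c.metricId} (registry v${c.registryVersion}, retrieved ${c.retrieved})`;
}

// A citation is current only while the registry version is unchanged.
function isCurrent(c: Citation, currentVersion: string): boolean {
  return c.registryVersion === currentVersion;
}

const citation: Citation = {
  metricId: "example_metric", // placeholder id, not a real metric
  registryVersion: "2.0.0",
  retrieved: "2026-06-02",
};
```

If the registry version later changes, `isCurrent` returns false for this citation, which signals that a metric definition may have changed since the figure was cited.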
Source is private; surfaces are public. The repository is private, so source files are named here as paths, not links. What is public and re-checkable are the live surfaces: the audit log, the health probe, and the evals endpoint.