Evaluation Methodology
Public Evals — Operational AI in Production
STEADYWRK publishes eight evals over a rolling 30-day window: completion rate, NTE variance, quote turnaround, dispatch latency (p50 and p95, counted separately), human override rate, policy-violation catch rate, and cost-per-decision posture. Ground truth is contractor outcome and payment disposition. The results are served from a public, no-auth JSON endpoint, so the page and the API cannot disagree.
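As a rough illustration of how a few of these evals could be computed over a rolling 30-day window, here is a minimal Python sketch. It is not STEADYWRK's implementation: the record fields (`dispatched_at`, `dispatch_latency_s`, `completed`, `human_override`), the output key names, and the choice of nearest-rank percentiles are all assumptions made for the example.

```python
from datetime import datetime, timedelta, timezone

WINDOW_DAYS = 30  # rolling window length stated in the text


def in_window(records, now, days=WINDOW_DAYS):
    """Keep records dispatched within the last `days` days (hypothetical field name)."""
    cutoff = now - timedelta(days=days)
    return [r for r in records if cutoff < r["dispatched_at"] <= now]


def percentile(values, pct):
    """Nearest-rank percentile for an integer pct in 1..100 (an assumed method)."""
    ordered = sorted(values)
    # Integer ceiling of pct * n / 100, avoiding floating-point rounding.
    rank = max(1, (pct * len(ordered) + 99) // 100)
    return ordered[rank - 1]


def summarize(records, now):
    """Compute a subset of the evals for the current window, or None if it is empty."""
    window = in_window(records, now)
    if not window:
        return None
    n = len(window)
    latencies = [r["dispatch_latency_s"] for r in window]
    return {
        "window_days": WINDOW_DAYS,
        "sample_size": n,
        "completion_rate": sum(1 for r in window if r["completed"]) / n,
        "human_override_rate": sum(1 for r in window if r["human_override"]) / n,
        "dispatch_latency_p50_s": percentile(latencies, 50),
        "dispatch_latency_p95_s": percentile(latencies, 95),
    }
```

Calling `summarize(records, datetime.now(timezone.utc))` on a list of such records returns a plain dictionary that could be serialized to JSON as-is. Rendering the page from that same dictionary is one way to keep a page and an API from drifting apart.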
Read the evals