What does STEADYWRK Research publish?

STEADYWRK Lab publishes six pieces of research: Public Evals (operational AI metrics on a rolling 30-day window), System Card (agentic control-plane architecture and safety posture), What Is an Agentic Control Plane (first-principles explainer), Proof of Autonomous Operations (traceable record of real dispatches and outcomes), AI Policy Position (accountability, transparency, human escalation requirements), and EU AI Act Alignment (how the control plane maps to EU AI Act obligations).

How can I verify STEADYWRK research claims?

Every STEADYWRK publication links to a verifiable source — a public no-auth JSON endpoint, an audit log, or a live system card. The evals page and the API endpoint always agree. Struck claims are logged openly on the proof page alongside what replaced them.
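One way a reader could check that a page and its endpoint agree is to compare the figures shown on the page against the JSON body the endpoint returns. The sketch below is an illustration only: the function name, the flat name-to-number payload shape, and the tolerance are assumptions, since the endpoint's actual schema is not described here. Fetching the body is left to the caller.

```python
import json
import math


def find_disagreements(page_figures, endpoint_body, rel_tol=1e-9):
    """Compare figures read off a page with a JSON body from an endpoint.

    Assumes (hypothetically) that the body is a flat JSON object mapping
    metric names to numbers. Returns {metric: (page_value, endpoint_value)}
    for every figure that is missing from the body or differs from it.
    """
    payload = json.loads(endpoint_body)
    mismatches = {}
    for name, page_value in page_figures.items():
        api_value = payload.get(name)
        # A metric absent from the endpoint counts as a disagreement.
        if api_value is None or not math.isclose(
            page_value, api_value, rel_tol=rel_tol
        ):
            mismatches[name] = (page_value, api_value)
    return mismatches
```

An empty result means every figure on the page matched the endpoint; anything else names the metric and shows both values side by side.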

What is the STEADYWRK publication doctrine?

STEADYWRK only publishes research about systems that are live and operating. Unverifiable claims are not published. When a published figure is retired, the retraction is logged publicly. Production before announcement — not the reverse.

STEADYWRK LAB · Research Publications

What the lab builds, the lab publishes.

STEADYWRK Lab operates live AI systems from Aqaba and publishes the research that emerges from running them: evaluation methodology, agent architecture, regulatory alignment, and operational proof. Every publication links to a verifiable source — a JSON endpoint, an audit log, or a live system card.

Published work

6 publications. All from production.

Evaluation Methodology

Public Evals — Operational AI in Production

STEADYWRK publishes eight evals on a rolling 30-day window: completion rate, NTE variance, quote turnaround, dispatch latency at p50, dispatch latency at p95, human override rate, policy-violation catch rate, and cost-per-decision posture. Ground truth is contractor outcome and payment disposition. The evals are served from a public no-auth JSON endpoint, so the page and the API cannot disagree.
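To make the rolling window concrete, here is a minimal sketch of how a few of these evals could be computed from dispatch records. The record fields, function names, and the nearest-rank percentile method are assumptions chosen for illustration; the lab's actual computation is not described on this page.

```python
import math
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class Dispatch:
    # Hypothetical record shape; the field names are assumptions.
    created_at: datetime
    latency_seconds: float
    completed: bool
    human_override: bool


def nearest_rank(values, pct):
    """Nearest-rank percentile: the smallest value with at least
    pct percent of the data at or below it."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


def rolling_evals(dispatches, now, window_days=30):
    """Summarise dispatches created in the window (now - window_days, now]."""
    cutoff = now - timedelta(days=window_days)
    window = [d for d in dispatches if cutoff < d.created_at <= now]
    if not window:
        return None  # nothing to report for an empty window
    latencies = [d.latency_seconds for d in window]
    n = len(window)
    return {
        "sample_size": n,
        "completion_rate": sum(d.completed for d in window) / n,
        "human_override_rate": sum(d.human_override for d in window) / n,
        "dispatch_latency_p50": nearest_rank(latencies, 50),
        "dispatch_latency_p95": nearest_rank(latencies, 95),
    }
```

Reporting p95 next to p50 matters because a healthy median can hide a slow tail; the two figures together show both the typical dispatch and the worst one in twenty.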

Read the evals

Architecture

System Card — Agentic Control Plane

The STEADYWRK system card documents the agent topology, decision-routing logic, safety posture, and the human-in-the-loop design. Every agent operates within a defined confidence envelope; any decision whose confidence falls below 70% escalates to a human operator. The card is the accountability document for the full stack.
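The escalation rule can be sketched as a single routing check. Only the 70% threshold comes from the description above; the `Decision` fields, the route labels, and the choice to treat exactly 70% as executable are assumptions made for this illustration.

```python
from dataclasses import dataclass

# Threshold taken from the system card summary: below 70% goes to a human.
ESCALATION_THRESHOLD = 0.70


@dataclass(frozen=True)
class Decision:
    # Hypothetical decision record; the field names are assumptions.
    agent: str
    action: str
    confidence: float  # expected in the range 0.0 to 1.0


def route(decision, threshold=ESCALATION_THRESHOLD):
    """Return "escalate_to_human" when confidence is below the threshold,
    otherwise "execute". Out-of-range confidence is rejected outright."""
    if not 0.0 <= decision.confidence <= 1.0:
        raise ValueError(f"confidence out of range: {decision.confidence}")
    if decision.confidence < threshold:
        return "escalate_to_human"
    return "execute"
```

Rejecting out-of-range confidence instead of clamping it is a deliberate choice in this sketch: a malformed score should surface as an error, not pass silently as a confident decision.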

Read the system card

Infrastructure Research

What Is an Agentic Control Plane?

A first-principles explanation of how multi-agent systems coordinate, route, and account for decisions at production scale — written from the operational experience of running one. The control plane is the layer between individual agents and the decisions that touch the real world.

Read the explainer

Proof of Work

Proof: Autonomous Operations

The STEADYWRK autonomous-operations proof documents what the system has done, how it was verified, and where the human-audit trail lives. Not a claims page — a traceable record of real dispatches, real outcomes, and real exceptions.

Read the proof

Policy

AI Policy — STEADYWRK Position

STEADYWRK publishes its position on AI policy: accountability for autonomous decisions, transparency in evals, human escalation requirements, and data handling. Written for operators and regulators — not for press releases.

Read the position

Regulatory Alignment

EU AI Act — STEADYWRK Alignment

An analysis of how the STEADYWRK control plane maps to EU AI Act obligations: risk classification, human oversight requirements, transparency obligations, and audit-trail standards. Published openly because the obligations apply to any system deployed in or for European operations.

Read the alignment

Publication doctrine

A number without a source is a claim.

Verifiable or not published.

Every metric STEADYWRK publishes links to its ground truth — a JSON endpoint, an audit log, or an operational timestamp. Unverifiable claims are not published, regardless of how good they look.

Production before announcement.

STEADYWRK does not publish research about systems that are not running. Every paper in this index describes something that is live and operating.

Retraction on record.

When a published figure is retired, the retraction is logged publicly. The proof page lists every struck claim alongside what replaced it.
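The doctrine above implies a simple data model: a claim cannot be published without a source, and a struck claim stays in the log with a pointer to its replacement. The sketch below is one possible shape for such a log; the class names, fields, and methods are hypothetical and do not describe how the proof page is actually built.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import Optional


@dataclass
class Claim:
    # Hypothetical claim record; the field names are assumptions.
    text: str
    source: str  # pointer to ground truth, e.g. an endpoint or audit log
    published_on: date
    struck_on: Optional[date] = None
    replaced_by: Optional["Claim"] = None


@dataclass
class ProofLog:
    claims: list = field(default_factory=list)

    def publish(self, claim):
        """Verifiable or not published: refuse claims with no source."""
        if not claim.source:
            raise ValueError("claim has no verifiable source")
        self.claims.append(claim)
        return claim

    def strike(self, claim, replacement, on):
        """Retire a claim but keep it on record next to its replacement."""
        if claim not in self.claims:
            raise ValueError("cannot strike a claim that was never published")
        self.publish(replacement)  # the replacement must be sourced too
        claim.struck_on = on
        claim.replaced_by = replacement

    def struck(self):
        """List every struck claim alongside what replaced it."""
        return [(c, c.replaced_by) for c in self.claims if c.struck_on is not None]
```

Nothing is ever deleted from `claims` in this sketch, which is the point: a retraction adds to the record instead of erasing from it.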

Start with the evals. Then read the system card.
