Science

Five domains. Same direction of effect.

We measured the layer before we named it. Across five high-stakes professional work products, deployment-time fidelity treatment drives measured fabricated propositions toward the floor — and output quality rises rather than falls.

The evidence board

Fabricated propositions, before and after treatment.

Domain | Baseline | After fidelity layer | Independent check
Private-equity IC memos | ~15% | 0.3% | Numeric reconciliation vs. source packet
Legal IRAC memos | 31–61%* | 0.2–3.5% | Legal-research oracle verified quotes
Medical evidence synthesis | ~17% | 0.7% | ClinicalTrials.gov / NCBI registries
Cyber vulnerability advisories | ~12% | 0.3% | NVD / CISA KEV; zero phantom CVEs
Patent prior-art analysis | ~7% | 0.2% | USPTO records; zero phantom patents

*Baseline rates are judge-dependent. The treated floor is the robust claim. Figures are research findings for public orientation, not product guarantees; protocols and per-arm boards are shared in qualified review contexts.
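
For readers who want the board's numbers as a single comparable figure, the relative reduction per domain can be worked out directly from the published rates. The sketch below is illustrative only: the names `BOARD`, `relative_reduction`, and `REDUCTIONS` are hypothetical, the inputs are the approximate point estimates from the table, and the legal row is left out because it is reported as a range. Because baseline rates are judge-dependent, any relative figure inherits that uncertainty; the treated rate remains the robust number.

```python
# Approximate rates copied from the evidence board, in percent of propositions
# flagged as fabricated: (baseline, after fidelity layer).
# Assumption: the "~" baselines are treated as point estimates for illustration.
# The legal row is omitted because it is reported as a range.
BOARD = {
    "Private-equity IC memos": (15.0, 0.3),
    "Medical evidence synthesis": (17.0, 0.7),
    "Cyber vulnerability advisories": (12.0, 0.3),
    "Patent prior-art analysis": (7.0, 0.2),
}


def relative_reduction(baseline_pct: float, treated_pct: float) -> float:
    """Return the fraction of the baseline rate removed after treatment."""
    if baseline_pct <= 0:
        raise ValueError("baseline rate must be positive")
    return (baseline_pct - treated_pct) / baseline_pct


# Relative reduction per domain, as a fraction between 0 and 1.
REDUCTIONS = {
    domain: relative_reduction(baseline, treated)
    for domain, (baseline, treated) in BOARD.items()
}

if __name__ == "__main__":
    for domain, reduction in REDUCTIONS.items():
        print(f"{domain}: {reduction:.1%} relative reduction")
```

This is arithmetic on the published figures, not a description of the measurement protocol, which is shared only in qualified review contexts.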

Stress-tested, on purpose

The legal slice was re-scored under five independent judge configurations across four generator families. Every judge saw the same direction of effect, with a blinded human review slice and non-LLM oracles corroborating the treated floor.

What we do not claim

This is not a claim that hallucination is solved or that any output is guaranteed. A certification layer that will not state the limits of its own measurements is not a certification layer.

Research briefing

For serious technical and strategic readers.

Deeper methodology and evidence — protocols, per-arm boards, and the cross-judge normalization work — are shared in qualified strategic, enterprise, or research contexts.

Request research briefing