Science

Five domains. Same direction of effect.

We measured the layer before we named it. Across five high-stakes professional work products, deployment-time fidelity treatment drives measured fabricated propositions toward the floor — and output quality rises rather than falls.

The evidence board

Fabricated propositions, before and after treatment.

Domain	Baseline	After fidelity layer	Independent check
Private-equity IC memos	~15%	0.3%	Numeric reconciliation vs. source packet
Legal IRAC memos	31–61%*	0.2–3.5%	Legal-research oracle verified quotes
Medical evidence synthesis	~17%	0.7%	ClinicalTrials.gov / NCBI registries
Cyber vulnerability advisories	~12%	0.3%	NVD / CISA KEV — zero phantom CVEs
Patent prior-art analysis	~7%	0.2%	USPTO records — zero phantom patents

*Baseline rates are judge-dependent. The treated floor is the robust claim. Figures are research findings for public orientation, not product guarantees; protocols and per-arm boards are shared in qualified review contexts.

Stress-tested, on purpose The legal slice was re-scored under five independent judge configurations across four generator families. Every judge saw the same direction of effect, with a blinded human review slice and non-LLM oracles corroborating the treated floor.

What we do not claim This is not a claim that hallucination is solved or that any output is guaranteed. A certification layer that will not state the limits of its own measurements is not a certification layer.

Research briefing

For serious technical and strategic readers.

Deeper methodology and evidence — protocols, per-arm boards, and the cross-judge normalization work — are shared in qualified strategic, enterprise, or research contexts.

Request research briefing