Don't Accept an Agent's Numbers and Citations As-Is — A Verification Gate Built on a Dedicated Auditor Subagent

A design that verifies every number and citation in an agent-generated summary using a separate subagent before accepting it — with working TypeScript for deterministic recomputation and fail-closed source matching.

When Claude Science was announced, the part that stayed with me wasn't the number of new skills. It was the multi-stage shape: a coordinating agent calls specialist agents, and then a dedicated agent verifies the citations and calculations. Treating verification as an independent role, separate from generation, felt like the important idea.

As an indie developer, I run automated jobs for several sites on my own. One morning a generated metrics summary said "+18% week over week," but when I added up the raw numbers by hand, the real figure was +8%. Models produce plausible numbers and plausible-looking citations with unsettling fluency. And when a summary quietly drifts from the underlying data, no one notices as long as they're only reading the summary. Since that morning I've distrusted the very structure of "trusting an artifact's numbers and citations inside the same flow that produced them."

This article builds the missing piece with working code: a pre-acceptance gate that separates generation from verification, recomputes numbers deterministically, matches citations against the source text, and rejects the entire artifact if even one claim fails.

Why you must not verify "as a continuation of generation"

The failure happens when you ask the generating model itself, "Is this correct?" Self-checking inside the same context and the same train of thought leads the model to treat the numbers it just produced as a correct premise, so it overlooks the drift. It's like proofreading your own draft, alone, right after writing it.

The value of an independent verifier comes down to three things.

First, context isolation. The verifier takes only the artifact and the primary data and sources as input; it inherits none of the generation-time reasoning. Second, deterministic judgment. Numbers are recomputed from raw data by a function, not re-confirmed by a language model. Third, fail-closed. A claim that cannot be verified is treated as failing, not as "probably fine," and a single unpassable claim stops the whole artifact.

Extract claims at the right granularity (a claim ledger)

You cannot verify a free-form summary directly. First, extract the smallest verifiable units — numeric claims and citation claims — as structured data. The generating agent must emit this "claim ledger" alongside the summary, every time.

// claims.ts — types for verifiable claims
export type NumericClaim = {
  id: string;
  kind: "number";
  statement: string; // human-readable claim (the spot in the summary)
  metric: string;    // metric key used for recomputation
  value: number;     // the value the model claimed
  tolerance?: number; // relative tolerance (default if omitted)
};
 
export type CitationClaim = {
  id: string;
  kind: "citation";
  statement: string;
  quote: string;     // string that must exist in the source
  sourceId: string;  // identifier of the source text to match against
};
 
export type Claim = NumericClaim | CitationClaim;
 
export type Artifact = {
  summary: string;
  claims: Claim[];
};

The key point is that value is stored as "the value the model claimed." The verifier does not trust this value; it later compares it against a figure it computes itself. The ledger isn't an appendix to the artifact — it is the input to verification.
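Because the ledger arrives as model output, it seems worth validating its shape before any claim reaches a verifier. The sketch below assumes the generator returns the artifact as a JSON string. `parseArtifact`, `parseClaim`, and `optionalTolerance` are hypothetical helpers that are not part of the original code, and the types are redeclared only so the block runs on its own.

```typescript
// Shapes copied from claims.ts above so this sketch is self-contained.
type NumericClaim = { id: string; kind: "number"; statement: string; metric: string; value: number; tolerance?: number };
type CitationClaim = { id: string; kind: "citation"; statement: string; quote: string; sourceId: string };
type Claim = NumericClaim | CitationClaim;
type Artifact = { summary: string; claims: Claim[] };

const isNonEmptyString = (v: unknown): v is string =>
  typeof v === "string" && v.length > 0;

// Hypothetical helper: an absent tolerance is fine, a malformed one is not.
function optionalTolerance(v: unknown): number | undefined {
  if (v === undefined) return undefined;
  if (typeof v === "number" && Number.isFinite(v) && v >= 0) return v;
  throw new Error("tolerance must be a finite, non-negative number");
}

// Hypothetical helper: validates one ledger entry and throws on anything malformed.
function parseClaim(raw: unknown): Claim {
  if (typeof raw !== "object" || raw === null) throw new Error("claim is not an object");
  const c = raw as Record<string, unknown>;
  const { id, statement, metric, value, quote, sourceId } = c;
  if (!isNonEmptyString(id) || !isNonEmptyString(statement)) {
    throw new Error("claim lacks id or statement");
  }
  if (c.kind === "number") {
    if (!isNonEmptyString(metric)) throw new Error(`[${id}] missing metric`);
    // a value like "18%" (a string) is rejected here, before any comparison happens
    if (typeof value !== "number" || !Number.isFinite(value)) {
      throw new Error(`[${id}] value is not a finite number`);
    }
    const tolerance = optionalTolerance(c.tolerance);
    const claim: NumericClaim = { id, kind: "number", statement, metric, value };
    if (tolerance !== undefined) claim.tolerance = tolerance;
    return claim;
  }
  if (c.kind === "citation") {
    if (!isNonEmptyString(quote) || !isNonEmptyString(sourceId)) {
      throw new Error(`[${id}] missing quote or sourceId`);
    }
    return { id, kind: "citation", statement, quote, sourceId };
  }
  throw new Error(`[${id}] unknown claim kind`);
}

// Hypothetical entry point: JSON.parse itself throws on malformed JSON.
function parseArtifact(json: string): Artifact {
  const raw: unknown = JSON.parse(json);
  if (typeof raw !== "object" || raw === null) throw new Error("artifact is not an object");
  const a = raw as Record<string, unknown>;
  const summary = a.summary;
  const claims = a.claims;
  if (!isNonEmptyString(summary)) throw new Error("artifact lacks a summary");
  if (!Array.isArray(claims)) throw new Error("artifact lacks a claim ledger");
  return { summary, claims: claims.map((entry) => parseClaim(entry)) };
}
```

Throwing on the first malformed entry follows the same rule as the rest of the design: a ledger that cannot be read is a ledger that cannot be verified.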

Recompute numbers with a function, never ask the model

The whole point of numeric verification is to route it around the language model entirely. Register, per metric, a function that derives the answer uniquely from raw data, and compare it against the model's claim.

// verify-number.ts
import type { NumericClaim } from "./claims";
 
export type Dataset = Record<string, number[]>;
 
const sum = (xs: number[]) => xs.reduce((a, b) => a + b, 0);
 
// metric key -> function that determines the value from raw data
export const metricFns: Record<string, (d: Dataset) => number> = {
  "clicks.total": (d) => sum(d.clicks),
  "clicks.avgPerDay": (d) => sum(d.clicks) / d.clicks.length,
  "wowChangePct": (d) => {
    const prev = sum(d.clicksPrevWeek);
    const cur = sum(d.clicksThisWeek);
    return prev === 0 ? NaN : ((cur - prev) / prev) * 100;
  },
};
 
const DEFAULT_TOLERANCE = 0.005; // relative 0.5%
 
export function verifyNumber(
  claim: NumericClaim,
  data: Dataset
): { ok: boolean; reason: string; expected?: number } {
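  // own-property lookup, so a key like "toString" cannot resolve to Object.prototype
  const fn = Object.prototype.hasOwnProperty.call(metricFns, claim.metric)
    ? metricFns[claim.metric]
    : undefined;
  if (!fn) {
    // an unknown metric is "unverifiable" = fail (fail-closed)
    return { ok: false, reason: `unregistered metric: ${claim.metric}` };
  }
  let expected: number;
  try {
    expected = fn(data);
  } catch {
    // a missing series (e.g. d.clicks is undefined) makes the metric function throw;
    // that is also "unverifiable", so it fails rather than crashing the gate
    return { ok: false, reason: `metric function threw: ${claim.metric}` };
  }
  if (!Number.isFinite(expected)) {
    return { ok: false, reason: "recomputed value is not finite", expected };
  }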
  const tol = claim.tolerance ?? DEFAULT_TOLERANCE;
  const denom = Math.abs(expected) || 1;
  const relErr = Math.abs(claim.value - expected) / denom;
  return relErr <= tol
    ? { ok: true, reason: "match", expected }
    : {
        ok: false,
        reason: `claimed ${claim.value} vs recomputed ${expected.toFixed(2)} differ by ${(relErr * 100).toFixed(2)}%`,
        expected,
      };
}

The "+18% that was really +8%" from the opening fails instantly here: recomputing wowChangePct catches it. The relative tolerance absorbs small rounding differences, but the default is deliberately narrow at 0.5%, so any meaningful drift is caught. The narrowness cuts both ways: a recomputed 8.33 reported as 8 is about 4% off in relative terms and is rejected, so the ledger should carry the value at roughly three significant digits (8.33, not 8) or set a wider per-claim tolerance.

Returning ok: false for an unregistered metric is the heart of fail-closed. The rule is "cannot verify = do not pass," never "no verifier function = safe."
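To make the tolerance arithmetic concrete, here is a small worked example. The click counts are invented sample data (1,200 clicks last week, 1,300 this week), and `relativeError` and `passes` are hypothetical names that restate the comparison inside verifyNumber.

```typescript
// Invented sample data: the two weeks sum to 1,200 and 1,300 clicks.
const clicksPrevWeek = [150, 160, 170, 180, 190, 170, 180];
const clicksThisWeek = [170, 180, 185, 190, 200, 185, 190];

const total = (xs: number[]): number => xs.reduce((a, b) => a + b, 0);

const prevTotal = total(clicksPrevWeek);
const curTotal = total(clicksThisWeek);

// (1300 - 1200) / 1200 * 100, roughly 8.33
const recomputedPct = ((curTotal - prevTotal) / prevTotal) * 100;

// Same formula as verifyNumber: absolute difference divided by |expected|.
function relativeError(claimed: number, expected: number): number {
  return Math.abs(claimed - expected) / (Math.abs(expected) || 1);
}

const TOLERANCE = 0.005; // relative 0.5%, matching DEFAULT_TOLERANCE

const passes = (claimed: number): boolean =>
  relativeError(claimed, recomputedPct) <= TOLERANCE;

console.log(passes(18));  // false: relative error is about 116%
console.log(passes(8.3)); // true:  relative error is about 0.4%
console.log(passes(8));   // false: relative error is about 4%
```

The last line is the case to watch. A value rounded to a whole percent can fail a 0.5% relative tolerance even when the summary is honest, so the precision recorded in the ledger and the tolerance have to be chosen together.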

Match citations by checking the quote exists in the source

Citation verification, too, must not ask the model "is this quote correct?" Instead, confirm — after normalization — that the quoted string actually exists inside the designated source snippet.

// verify-citation.ts
import type { CitationClaim } from "./claims";
 
export type Sources = Record<string, string>; // sourceId -> source snippet
 
// absorb width, whitespace, and punctuation variance
function normalize(s: string): string {
  return s
    .normalize("NFKC")
    .replace(/\s+/g, "")
    .replace(/[「」『』()（）,.、。]/g, "")
    .toLowerCase();
}
 
export function verifyCitation(
  claim: CitationClaim,
  sources: Sources
): { ok: boolean; reason: string } {
  const src = sources[claim.sourceId];
  if (!src) {
    return { ok: false, reason: `source not found: ${claim.sourceId}` };
  }
  const q = normalize(claim.quote);
  if (q.length < 8) {
    // a too-short quote matches by chance, so fail it
    return { ok: false, reason: "quote too short to match" };
  }
  return normalize(src).includes(q)
    ? { ok: true, reason: "matches source" }
    : { ok: false, reason: "quote does not exist in source" };
}

Normalization is there because models tend to alter punctuation and bracket widths slightly when quoting. On the other hand, a too-short quote (under 8 characters) would hit almost any source by coincidence, so it fails on purpose. Again: "cannot match = do not pass."
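A quick way to see what the normalization does and does not forgive is to run it on a few quotes. The sketch below copies the normalize logic under the hypothetical name `normalizeQuote`. The source sentence and the quotes are invented for illustration.

```typescript
// Copy of the normalize logic above, renamed so this sketch stands alone.
function normalizeQuote(s: string): string {
  return s
    .normalize("NFKC")                 // folds full-width forms such as （ ） into ASCII
    .replace(/\s+/g, "")               // drops all whitespace
    .replace(/[「」『』(),.、。]/g, "") // drops brackets and basic punctuation
    .toLowerCase();
}

// Hypothetical helper mirroring verifyCitation's two checks: minimum length, then containment.
function quoteInSource(quote: string, sourceText: string): boolean {
  const q = normalizeQuote(quote);
  if (q.length < 8) return false; // too short to be meaningful evidence
  return normalizeQuote(sourceText).includes(q);
}

// Invented source snippet.
const sourceText = "The report states: Weekly clicks rose to 1,300 (up from 1,200).";

// Same words, but with full-width brackets and no thousands separators.
console.log(quoteInSource("weekly clicks rose to 1300（up from 1200）", sourceText)); // true

// A fabricated quote is not in the source.
console.log(quoteInSource("Weekly clicks fell to 900", sourceText)); // false

// A short quote is rejected even though "up from" does appear in the source.
console.log(quoteInSource("up from", sourceText)); // false
```

One side effect worth knowing: because the character class strips "." and ",", the strings "1.5" and "15" normalize to the same text. That is acceptable for matching wording, but it is a reason to route any figure that matters through the numeric ledger rather than relying on a quote match.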

Use the auditor subagent only where determinism can't reach

Numbers and citations can be settled deterministically as above, but "did the summary quietly slip in a number or source that isn't in the ledger?" is hard to answer with string matching alone. Hand only that part to a dedicated auditor subagent. Narrowing its role minimizes the surface you leave to model judgment.

In Claude Code, place a dedicated auditor subagent under .claude/agents/ and invoke it so that it runs in its own context window, carrying none of the generation context.

<!-- .claude/agents/claim-auditor.md -->
---
name: claim-auditor
description: Judges only whether a generated summary introduced numbers/citations absent from its claim ledger
tools: []
---
You are a dedicated auditor. You receive only the "summary text" and the "claim ledger" (a list of ids and statements).
List every number or specific citation appearing in the summary that maps to no claim in the ledger.
Judge conservatively: if even one number/citation lacks ledger support, report it as unverified.
Output only JSON of shape {"unlisted": string[]}. Do not infer or fill in the generator's intent.

If you wire this through the Anthropic TypeScript SDK (@anthropic-ai/sdk) instead, the key is to restrict the input to just the summary and the ledger. Never pass the raw data or the generation-time prompt.

// audit-summary.ts (excerpt)
import Anthropic from "@anthropic-ai/sdk";
const client = new Anthropic({ apiKey: process.env.ANTHROPIC_API_KEY });
 
export async function auditSummary(
  summary: string,
  claimStatements: string[]
): Promise<{ unlisted: string[] }> {
  const msg = await client.messages.create({
    model: "claude-haiku-4-5-20251001", // a light model is enough for auditing
    max_tokens: 512,
    system:
      'List numbers/citations in the summary that map to none of the given claims. Output only JSON {"unlisted": string[]}.',
    messages: [
      {
        role: "user",
        content: `# Summary\n${summary}\n\n# Claims\n${claimStatements.join("\n")}`,
      },
    ],
  });
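  const block = msg.content.find((b) => b.type === "text");
  if (!block || block.type !== "text") {
    // no text reply means no audit happened; do not default to "nothing unlisted"
    throw new Error("auditor returned no text block");
  }
  // JSON.parse throws on malformed output; callers must treat a throw as a rejection
  const unlisted = (JSON.parse(block.text) as { unlisted?: unknown } | null)?.unlisted;
  if (!Array.isArray(unlisted)) {
    throw new Error('auditor output has no "unlisted" array');
  }
  return { unlisted: unlisted.map(String) };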
}

A light model suffices because this audit isn't a fresh judgment — it's a diff against the ledger.

Bundle everything into one gate (Before / After)

Finally, combine number checks, citation checks, and the leakage audit into a function that rejects the whole artifact if any one of them fails. First, the tempting "publish as-is" version.

// ❌ Before: accept the artifact's summary and publish it directly
const artifact = await generateReport(data);
await publish(artifact.summary); // no one verified the numbers or citations

Change it so that only artifacts that pass verification ever reach publish.

// ✅ After: always go through a pre-acceptance gate
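import { verifyNumber } from "./verify-number";
import { verifyCitation } from "./verify-citation";
import { auditSummary } from "./audit-summary";
// each type comes from the module that defines it above (there is no ./types file)
import type { Artifact } from "./claims";
import type { Dataset } from "./verify-number";
import type { Sources } from "./verify-citation";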
 
export async function gate(
  artifact: Artifact,
  data: Dataset,
  sources: Sources
): Promise<{ ok: boolean; failures: string[] }> {
  const failures: string[] = [];
 
  for (const c of artifact.claims) {
    if (c.kind === "number") {
      const r = verifyNumber(c, data);
      if (!r.ok) failures.push(`[${c.id}] ${r.reason}`);
    } else {
      const r = verifyCitation(c, sources);
      if (!r.ok) failures.push(`[${c.id}] ${r.reason}`);
    }
  }
 
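  try {
    const audit = await auditSummary(
      artifact.summary,
      artifact.claims.map((c) => `${c.id}: ${c.statement}`)
    );
    for (const u of audit.unlisted) {
      failures.push(`[unlisted] not in ledger: ${u}`);
    }
  } catch (err) {
    // an audit that errored or returned unparseable output did not run: fail-closed
    failures.push(`[audit] could not complete: ${String(err)}`);
  }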
 
  return { ok: failures.length === 0, failures };
}
 
// caller
const artifact = await generateReport(data);
const result = await gate(artifact, data, sources);
if (!result.ok) {
  console.error("rejected:", result.failures);
  // do not publish. log the reasons and route to regeneration or human review
} else {
  await publish(artifact.summary);
}

The difference between Before and After isn't a single added line. It's an inverted premise: an artifact is not trusted by default. publish is only ever called for an artifact that has passed the gate.

Small judgments that paid off in practice

After running this shape for a few months, what helped wasn't the flashy machinery but the small decisions in the details.

Decision point	Policy taken	Reason
Unregistered metric / missing source	Fail it	Reading "unverifiable" as "safe" lets incidents slip through
Numeric tolerance	Narrow, relative 0.5%	Forgive rounding, always catch meaningful drift
Short quotes	Under 8 chars fails	Don't let a chance match read as "source exists"
Audit model	Run on a light model	Diff detection needs no heavy reasoning
On rejection	Log the reasons and stop	To read failure trends later; never discard silently

The rejection log, in particular, later tells you which metrics drift most often. In my case, difference metrics like week-over-week were the most error-prone, and tightening the tolerance just there let me catch nearly all of the quiet drift.
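Reading failure trends out of the log can be as simple as counting rejections per metric. The sketch below assumes each rejection is logged as a structured entry. The `RejectionEntry` shape and `rankDriftingMetrics` are hypothetical, since the gate above only emits plain strings, so recording the metric key alongside the reason would be an addition to it.

```typescript
// Hypothetical log entry: one per failed claim, written at rejection time.
type RejectionEntry = {
  claimId: string;
  metric?: string; // present only for numeric claims
  reason: string;
};

// Counts rejections per metric and returns them most-frequent first.
// Ties are broken alphabetically so the ranking is stable across runs.
function rankDriftingMetrics(entries: RejectionEntry[]): Array<[string, number]> {
  const counts = new Map<string, number>();
  for (const e of entries) {
    if (!e.metric) continue; // citation and unlisted failures carry no metric
    counts.set(e.metric, (counts.get(e.metric) ?? 0) + 1);
  }
  return Array.from(counts.entries()).sort(
    (a, b) => b[1] - a[1] || a[0].localeCompare(b[0])
  );
}

// Invented sample log.
const sampleLog: RejectionEntry[] = [
  { claimId: "c1", metric: "wowChangePct", reason: "claimed 18 vs recomputed 8.33" },
  { claimId: "c4", metric: "clicks.total", reason: "claimed 1310 vs recomputed 1300.00" },
  { claimId: "c2", metric: "wowChangePct", reason: "claimed 12 vs recomputed 9.10" },
  { claimId: "c7", reason: "quote does not exist in source" },
  { claimId: "c3", metric: "wowChangePct", reason: "recomputed value is not finite" },
];

console.log(rankDriftingMetrics(sampleLog));
// [ [ 'wowChangePct', 3 ], [ 'clicks.total', 1 ] ]
```

A ranking like this is what points to the metrics where a tighter per-claim tolerance is worth setting.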

Speed and fluency of generation guarantee nothing about correctness. That's exactly why you place, somewhere apart from generation, a plain checkpoint: recompute the numbers, go back to the source for citations, and pass nothing that lacks support. It isn't glamorous, but the longer I run things unattended, the more this single gate has saved me. I hope it gives fellow builders handling agent outputs a foothold for their own design.
