⬡ API & SDK/2026-07-01Intermediate

A Fail-Closed Model Pricing Registry So New Models Don't Quietly Break Your Cost Math

When Opus 4.8 and Haiku 4.5 landed in the Messages API, rates scattered across my code silently skewed the cost rollup. Here is how to centralize per-model rates and fail closed on unknown models, with complete working code.

Claude API⁹⁶ Cost Management⁴ Model Pricing Operations⁷ TypeScript¹⁸

✦ Premium Article

The morning after Opus 4.8 and Haiku 4.5 arrived in the Messages API, my overnight cost rollup was quietly wrong. Not a single error was logged. The day-over-day chart just looked oddly low.

The cause was simple. Requests to the new model IDs succeeded and were billed correctly, but my aggregation script's rate table had no entry for those IDs, so that traffic slid through priced at zero. A crash would have been kinder; the worst failure is the one that under-reports in silence.

As an indie developer running automated posting across several sites, a new model is something I want to try and something that comes to break my accounting at the same time. Here is how to stop that particular failure for good, with code you can run.

Why Scattered Rates Break

When I first wrote this, I had if (model === "...") rate = ... inlined in every place that needed a cost: the aggregation script, the monthly billing reconcile, and the dashboard. Three copies.

That shape causes no trouble at all while the model set is fixed. It breaks at exactly one moment: when a new model ID appears. And breaking at that moment is the worst possible timing, because you usually try a new model precisely when you are watching costs closely.

The problems with scattered rates fall into three buckets.

Symptom	What happens	How hidden
Unknown model priced at 0	New-model traffic aggregates as zero cost	Very high — no exception is raised
Partial update	One of three copies keeps the old rate	Totals disagree by location
Cache ignored	Cache read/write priced at the input rate	Error grows the more cache you use

What they share is that no exception ever fires. That is exactly why only design can prevent them.

Centralize Rates in One Place

First, make a single source of truth for rates. The key idea is to store a structure keyed by model ID rather than hardcoding headline prices. The rates themselves change; the structure stays stable.

// pricing.ts — the source of truth. No rate is written anywhere else.
// Values are USD per million tokens. Always fill from the official page.
export interface ModelRate {
  input: number;        // input tokens
  output: number;       // output tokens
  cacheWrite5m: number; // 5-minute cache write (~1.25x input)
  cacheWrite1h: number; // 1-hour cache write (~2x input)
  cacheRead: number;    // cache read (~0.1x input)
}
 
// Fill these from the current official pricing page before relying on them.
// The point is to never paste "numbers from memory" here, since rates move.
const RATES: Record<string, ModelRate> = {
  "claude-opus-4-8": rate(15.0, 75.0),
  "claude-sonnet-4-6": rate(3.0, 15.0),
  "claude-haiku-4-5": rate(/* input */ 0, /* output */ 0), // TODO: official values
};
 
// Derive cache rates from input/output via fixed multipliers.
// The multipliers (5m 1.25x, 1h 2x, read 0.1x) stay stable as prices change.
function rate(input: number, output: number): ModelRate {
  return {
    input,
    output,
    cacheWrite5m: input * 1.25,
    cacheWrite1h: input * 2.0,
    cacheRead: input * 0.1,
  };
}
 
export { RATES };

The rate() helper is deliberate. Get the input price right and the cache write/read prices fall out automatically. I once updated the input price and forgot the cache prices, so leaning on derivation gives me one less thing to forget.

✦

Thank you for reading this far.

Continue Reading

What follows includes implementation code, benchmarks, and practical content we hope you'll find useful. This site runs without ads — server and development costs are supported entirely by members like you. If it's been helpful, we'd be truly grateful for your support.

WHAT YOU'LL LEARN

✦A model-id-keyed pricing registry plus a complete TypeScript cost calculator that reads the usage block, cache tokens included

✦A fail-closed design that refuses to score unknown models as zero, and a CI test that turns a new model launch into a red build instead of an incident

✦Cost attribution that accounts for 5-minute and 1-hour prompt cache multipliers, with the calls that paid off in my own automated posting setup

Secure payment via Stripe · Cancel anytime

✦

Unlock This Article

Get full access to the rest of this article. Buy once, read anytime. This site is ad-free — your support goes directly toward keeping it running.

Unlock all articles with Membership →

Compute Real Cost From the usage Block

The response usage carries the exact token breakdown your bill is based on. Run it through the registry and you get measured cost, not a guess.

// cost.ts — turn usage plus rates into the cost of one request
import { RATES, ModelRate } from "./pricing";
 
interface Usage {
  input_tokens: number;
  output_tokens: number;
  cache_creation_input_tokens?: number; // cache writes
  cache_read_input_tokens?: number;     // cache reads
}
 
const PER = 1_000_000; // divide to per-token
 
export function costOf(model: string, usage: Usage, cacheTtl: "5m" | "1h" = "5m"): number {
  const r: ModelRate | undefined = RATES[model];
  if (!r) {
    // The crux: do not score unknown models as 0 (implemented next).
    throw new UnknownModelError(model);
  }
  const write = cacheTtl === "1h" ? r.cacheWrite1h : r.cacheWrite5m;
  return (
    (usage.input_tokens * r.input +
      usage.output_tokens * r.output +
      (usage.cache_creation_input_tokens ?? 0) * write +
      (usage.cache_read_input_tokens ?? 0) * r.cacheRead) /
    PER
  );
}

Pricing cache_creation_input_tokens and cache_read_input_tokens at the input rate inflates the error the more caching you do. Reads are about a tenth of input, so splitting them out cleanly was enough to visibly tighten my rollup.

Don't Let Unknown Model IDs Pass at Zero

This was the heart of the first incident: make an unknown model an exception, not zero dollars. Fail closed. Push the cost to the safe side — the side you can detect, not the side that hides.

export class UnknownModelError extends Error {
  constructor(public readonly model: string) {
    super(`Unpriced model: ${model} — add it to pricing.ts`);
    this.name = "UnknownModelError";
  }
}
 
// How the aggregation loop absorbs it: one unknown model must not halt
// the whole job, but it must always surface (never be swallowed).
export function aggregate(records: { model: string; usage: Usage }[]) {
  let total = 0;
  const unpriced = new Set<string>();
  for (const rec of records) {
    try {
      total += costOf(rec.model, rec.usage);
    } catch (e) {
      if (e instanceof UnknownModelError) {
        unpriced.add(e.model); // not passed through as 0 — visible later
        continue;
      }
      throw e;
    }
  }
  return { total, unpriced: [...unpriced] };
}

What I care about here is landing between "halt" and "swallow." Crashing an overnight job over a single unknown model is too much. But passing it through at zero takes you right back to the original incident. Returning unpriced alongside the total keeps the state I want: the numbers come out, and the gaps come out with them.

A Test That Turns a New Model Into a Red Build

Once design prevents the failure, build the state where the next new model gets noticed. Collect model IDs from real logs and keep one light test that fails if any of them is missing from the registry.

// pricing.test.ts — every model seen in real logs must be priced
import { RATES } from "./pricing";
 
// Model IDs pulled from recent request logs (auto-refreshed in ops)
import { observedModels } from "./fixtures/observed-models";
 
test("every observed model exists in the pricing registry", () => {
  const missing = observedModels.filter((m) => !(m in RATES));
  expect(missing).toEqual([]); // one unpriced model turns CI red
});
 
test("rates are positive and output is at least input", () => {
  for (const [model, r] of Object.entries(RATES)) {
    expect(r.input, model).toBeGreaterThan(0);
    expect(r.output, model).toBeGreaterThanOrEqual(r.input);
  }
});

The second test looks trivial but earns its keep. It is how I caught the haiku-4-5 row I had left at 0 under a TODO. The plain invariant "output should not be cheaper than input" reliably catches copy-paste misses during a rate update.

How Far It Drifts, in Numbers

Abstractions get deprioritized, so here is the order of magnitude. Say one overnight job uses 500k input and 80k output tokens, and 30% of that runs on a newly added, unpriced model. If the unpriced portion slides through at zero, that run reports roughly 30% under its true cost.

In my case, what hurt was less the dollar figure than the broken day-over-day comparison. On a day I tried a new model, usage actually went up, yet the chart showed it going down. When the sign of the change flips, people stop noticing the anomaly. Once I failed closed and surfaced unpriced models, a non-empty unpriced became the alert itself, and I could go straight to the cause instead of second-guessing the numbers.

Small Calls That Paid Off

A few decisions around pricing don't show up in any number but made operations easier.

One is keeping aliases (moving names like claude-3-5-haiku-latest) out of aggregation. The billed response returns a resolved, concrete model ID, so I made that the source of truth. The rule is simple: reconcile against the ID the response declares, not what you asked for at request time.

The other is treating a rate update as a code change with history. Rates move for external reasons, and if the change leaves no record, a later recomputation of past cost will not reconcile. I commit pricing.ts edits with an effective-date comment and compute historical traffic at the rate in force then.

Decision	Reason	Incident avoided
Reconcile on the response's real model ID	Aliases move their target	Old/new model mix-ups
Commit rate changes with an effective date	Rates move for external reasons	Mismatched historical recompute
Fail closed plus surface it	Pass-through zero is the real danger	Silent under-reporting

Your Next Step

If your rates are inlined all over the code today, start by creating a single pricing.ts and centralizing on the response's model field as the key. Just routing everything through one costOf that throws on unknown models means that the next morning a model appears, CI goes red and tells you, instead of the chart drifting in silence.

I am still fixing this as I run it, but the one principle that always holds for cost math is this: if it has to break, let it break visibly rather than quietly. Thank you for reading.

Thank You for Reading

Claude Lab is ad-free, supported entirely by members like you. We publish practical guides daily with implementation code, benchmarks, and production-ready patterns. If you've found it useful, we'd love to have you on board.