ARTICLES

All Articles

⬡ API & SDK / 2026-06-18 / Advanced

When Your Claude API Response Cache Returns Stale Answers and Near-Miss Wrong Ones — Field Notes on Freshness and False-Hit Suppression

A Claude API response cache improves latency and cost immediately, but what hurts in production is not a low average hit rate; it is stale hits and semantic false hits. Here are the cache-key design, freshness management, false-hit suppression, and observability practices that keep a cache honest.

⬡ API & SDK / 2026-06-17 / Advanced

When Claude API Extracts the Wrong Value With Full Confidence — Designing the Verification Layer

When you extract data from invoices or contracts with the Claude API, the scariest failure isn't an exception; it's plausible-but-wrong JSON. Here is how I build a verification layer that catches silent extraction errors with schema checks, arithmetic reconciliation, and dual-extraction agreement, in TypeScript.

⬡ API & SDK / 2026-06-17 / Advanced

Stop Terminology Drift in Localized Apps: A Consistent Localizable.strings Pipeline with the Batch API and a Cached Glossary

Translating UI strings one at a time invites inconsistency. Pair Claude's Message Batches API with a prompt-cached glossary to translate Localizable.strings across 10+ languages consistently, with measured costs and the pitfalls I hit in production.

⬡ API & SDK / 2026-06-17 / Advanced

Making the Numbers Add Up in a Multi-Tenant Claude API SaaS — Field Notes on Isolation and Cost Attribution

The first thing that breaks when you make a Claude API SaaS multi-tenant is the month-end reconciliation. Here are field notes on a single metering chokepoint, atomic counters, reconciling against Anthropic's bill, and proving tenant isolation with adversarial tests — with production TypeScript.

⬡ API & SDK / 2026-06-16 / Advanced

Keep a Decision Rationale Ledger for Autonomous Agents — So You Can Explain 'Why' Later

When an autonomous agent takes hard-to-reverse actions like a production deploy or a bulk delete, capture the chosen option, rejected alternatives, and assumptions in a structured ledger. Includes structured output, an append-only log, and tiering by impact.

⬡ API & SDK / 2026-06-16 / Advanced

Taming Token Bloat in Long-Running Agents with Context Editing and the Memory Tool

For long-running agents whose input tokens balloon as tool results pile up, here is how to pair context editing with the memory tool and measure the savings with count_tokens, including a working backend implementation.

⬡ API & SDK / 2026-06-16 / Advanced

Trusting Claude's Structured Output in Production — Validation Gates and Repair Loops

When Claude's structured output breaks 'occasionally' in production, combine tool-use enforcement, a schema validation gate, a single repair loop, and a graceful degradation fallback to eliminate broken JSON from your operations — with working TypeScript code.

⬡ API & SDK / 2026-06-16 / Advanced

Confirm Your Model Actually Responds Before a Scheduled Run Begins

A model you configured can be gone before your nightly job even wakes up. Tell retirement, withdrawal, and regional restriction apart with a single startup probe, then rewrite the run config to an eligible model — with complete, working TypeScript.

⬡ API & SDK / 2026-06-16 / Advanced

PII Masking for Claude API Lives or Dies on the Ledger — Restore, Encrypt, Measure

The hard part of masking PII before sending it to the Claude API isn't detection; it's operating the token ledger you restore from. Covers encrypted storage, multi-instance sharing, and a daily leak-rate loop, with working code.

⬡ API & SDK / 2026-06-15 / Advanced

On the day the billing change took effect, I added per-stage cost metering to my headless runs

The June 15 billing change moved headless runs and agent delegation onto monthly credits. Here is a thin metering layer that records token usage per stage tag from response.usage and emits a daily cost report, with working code.

⬡ API & SDK / 2026-06-15 / Advanced

Centralizing the anthropic-beta Header So a Retired Beta Won't Kill Your Batch

Scattered anthropic-beta headers turn a beta retirement or GA graduation into a 400 that takes down an entire batch. A small capability registry, a startup preflight, and tiered fallback keep your pipeline running across feature generations.

⬡ API & SDK / 2026-06-15 / Advanced

When a Model Disappears Without Warning: A State Machine for Retirement, Withdrawal, and Overload

A model can become unusable in hours for reasons that have nothing to do with a technical outage. This guide models three distinct flavors of 'unavailable'—retirement, withdrawal, and transient overload—as one availability state machine, with a router that keeps automated pipelines running. Working TypeScript and Python included.