TAG

Resilience

11 articles

⬡ API & SDK / 2026-07-12 / Advanced

Designing Around Claude API 413 request too large — Preflight Sizing and Splitting

Pack too much text, too many images, or too many tool_result blocks into one request, and the Claude API rejects it with 413 request too large. Here is a code-backed design for measuring request bytes before you send, telling the two kinds of 413 apart, and splitting requests without breaking them.

⬡ API & SDK / 2026-06-22 / Advanced

Claude API Streaming Breaks the "Everything Arrives" Assumption — Field Notes on Recovering from Partial Failure

Once concurrency climbs, Claude API streams disconnect mid-response, replay events, and emit half-finished tool arguments. Here is how I rebuilt the implementation and monitoring to treat partial failure as the norm rather than an anomaly, and to recover from it quietly.

⬡ API & SDK / 2026-06-15 / Advanced

Centralizing the anthropic-beta Header So a Retired Beta Won't Kill Your Batch

Scattered anthropic-beta headers turn a beta retirement or GA graduation into a 400 that takes down an entire batch. A small capability registry, a startup preflight, and tiered fallback keep your pipeline running across feature generations.

⬡ API & SDK / 2026-06-15 / Advanced

When a Model Disappears Without Warning: A State Machine for Retirement, Withdrawal, and Overload

A model can become unusable in hours for reasons that have nothing to do with a technical outage. This guide models three distinct flavors of 'unavailable'—retirement, withdrawal, and transient overload—as one availability state machine, with a router that keeps automated pipelines running. Working TypeScript and Python included.

⬡ API & SDK / 2026-05-26 / Advanced

Designing Graceful Degradation for the Claude API — A Four-Tier Fallback Architecture That Keeps AI Features Quietly Alive

Once Claude API features hit real production traffic, model-level fallback alone stops being enough. This article walks through an SLI-driven four-tier degradation design, with Python and TypeScript code, SLO burn-rate alerting, and the operational trade-offs an indie developer actually runs into.

⬡ API & SDK / 2026-05-23 / Advanced

Absorbing Claude API 529 Overloaded in Production — Resilience Patterns from a 50M-Download Indie Studio

529 Overloaded won't go away with a naive exponential backoff. Drawing on lessons from 50 million app downloads, this piece walks through queue-based absorption, model-aware fallback, and circuit-breaker design with working code.

⬡ API & SDK / 2026-04-23 / Advanced

High-Availability Patterns for the Claude API — Making Sonnet/Haiku/Opus Fallback Work in Production

A single-model Claude API integration will fall over the first time rate limits or a regional hiccup land at peak hours. This is the production pattern for a Sonnet → Opus → Haiku fallback chain, with circuit breakers, streaming coverage, and the pitfalls you only learn the hard way.

⟐ Claude Code / 2026-04-23 / Advanced

Finishing Long-Running Claude Code Tasks: A Resilience Playbook You Can Ship

Multi-hour Claude Code jobs — bulk refactors, TypeScript migrations, mass test generation — always stop before they finish, and recovery is painful when you cannot tell what already ran. This guide ships concrete patterns: a checkpoint-driven manifest, a three-state circuit breaker, idempotent retry rules, and a freeze-and-resume protocol you can copy into your repo today.

⬡ API & SDK / 2026-04-22 / Intermediate

Handling Frequent 529 Overloaded Errors from the Claude API — A Practical Playbook

A 529 Overloaded response from the Claude API is a very different animal from a 429 rate limit. Here is the retry, fallback, and circuit breaker playbook I actually use in production to keep services responsive when Anthropic's platform is temporarily saturated.

⬡ API & SDK / 2026-03-31 / Advanced

Building Self-Healing AI Agents with Claude API — Error Detection, Auto-Recovery, and Graceful Degradation Patterns for Production

Learn how to build production-grade AI agents that automatically detect failures and self-heal using the Claude API. Covers retry strategies, fallback chains, supervisor patterns, and observability pipelines.

⬡ API & SDK / 2026-03-27 / Advanced

Claude API Production Resilience Patterns — Model Routing, Circuit Breakers, and Fallback Strategies for Indie Teams

Production resilience patterns for Claude API: circuit breakers, intelligent model routing, fallback chains, exponential backoff with jitter, and disaster recovery — with TypeScript implementations and operational lessons from running Dolice Labs across four sites as an indie developer.