CLAUDE LAB
Claude Code / 2026-07-03 / Advanced

Five Minutes of Silence, and Something Retries on Your Behalf — Rethinking Retry Ownership After the Streaming Idle Watchdog Became a Default

Claude Code's streaming idle watchdog is now on by default, quietly adding another retrying layer to your stack. This article inventories the four layers (SDK, wrapper, watchdog, scheduler), computes worst-case attempt amplification, and shows how to collapse retry ownership into a single layer.


The Claude Code release notes for July 1 contained one short line: the streaming idle watchdog is now enabled by default, aborting and retrying any stream that stays silent for five minutes.

It should have been a welcome change. But what came to mind first was not gratitude — it was a cross-section of my own pipeline. The SDK retries. My wrapper retries. The scheduler re-runs failed jobs. And now a layer I never asked for had joined them. The number of retrying layers had quietly grown to four.

More layers are not inherently bad. The trouble is that each layer retries "helpfully" without knowing the others exist. This article walks through counting the places where retries can originate, estimating the worst-case attempt count mechanically, and collapsing responsibility into a single layer — based on how I reorganized my own overnight jobs.

Counting the layers that retry

Start with an inventory. A typical unattended Claude stack has at least four sources of retries.

| Layer | What it retries | Default behavior | Why it hides |
|---|---|---|---|
| 1. Anthropic SDK | Connection errors, 429s, 5xx | maxRetries: 2 (up to 3 attempts including the first) | Never appears in your code, so it escapes inventories |
| 2. Your wrapper | Application-level failures | Your own exponential backoff (say, 3 attempts) | Known, but its overlap with the SDK is easy to forget |
| 3. Streaming idle watchdog | Streams silent for 5 minutes | Abort + retry (on by default since 7/1) | Behavior changed with no config change on your side |
| 4. Scheduler | The whole job | Re-run on failure, or re-kick on the next tick | The outermost layer, invisible from inside the job |

Notice that only one of these four layers is code you wrote. Layer 1 is an SDK default, layer 3 is a platform default that just changed, and layer 4 is operational configuration. Most of your retry design lives outside your own repository.

The watchdog's exact retry count and interval are assumptions worth verifying against the release notes and observed behavior for your version. Defaults can change without warning — as this one just did — which is why they belong in the assumptions log described below.
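One way to keep such defaults visible is to emit them as a structured log line when a job starts. The sketch below is only an illustration: the `RetryAssumptions` shape, its field names, and the `retry_assumptions` event name are hypothetical and not part of Claude Code or the SDK. The watchdog retry count of 1 is an unverified guess.

```typescript
// Hypothetical record of the defaults this job relies on.
// All field names are illustrative, not a Claude Code or SDK schema.
interface RetryAssumptions {
  sdkMaxRetries: number;       // SDK default from the inventory table
  watchdogEnabled: boolean;    // on by default per the July 1 release notes
  watchdogIdleSeconds: number; // five minutes of silence
  watchdogRetries: number;     // unverified guess; confirm against observed behavior
  verifiedOn: string | null;   // ISO date of the last manual check, null if never checked
}

// Serialize the assumptions as one JSON log line, stamped with job id and time.
function assumptionsLogLine(
  jobId: string,
  assumptions: RetryAssumptions,
  now: Date = new Date(),
): string {
  return JSON.stringify({
    event: "retry_assumptions",
    jobId,
    at: now.toISOString(),
    ...assumptions,
  });
}

const assumed: RetryAssumptions = {
  sdkMaxRetries: 2,
  watchdogEnabled: true,
  watchdogIdleSeconds: 300,
  watchdogRetries: 1,
  verifiedOn: null,
};

console.log(assumptionsLogLine("nightly-batch", assumed));
```

Because the line is written on every run, a diff between two nights' logs shows when an assumed default was edited, and a `verifiedOn` of null flags values nobody has checked yet.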

Worst cases multiply, they do not add

Retries across layers compose by multiplication, not addition: each attempt of an outer layer can consume every attempt of the inner layers.

  • SDK: 3 attempts
  • Wrapper: 3 attempts
  • Watchdog: 2 attempts (assuming 1 retry)
  • Scheduler: 2 attempts

The worst case for this configuration is 3 × 3 × 2 × 2 = 36 attempts. A task you wrote as "one call" can hit the API 36 times on a bad night.
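The multiplication can be made mechanical. The following is a minimal sketch, assuming each layer is described by its total attempt count (retries plus the first try). The `RetryLayer` type and the layer names are hypothetical, and the watchdog figure carries the same one-retry assumption as the list above.

```typescript
// Hypothetical inventory entry: one per layer that can retry.
interface RetryLayer {
  name: string;
  // Total attempts including the first try (retries + 1).
  maxAttempts: number;
}

// Worst case: every attempt of an outer layer exhausts all attempts of the
// layers inside it, so the counts multiply rather than add.
function worstCaseAttempts(layers: RetryLayer[]): number {
  return layers.reduce((product, layer) => {
    if (!Number.isInteger(layer.maxAttempts) || layer.maxAttempts < 1) {
      throw new RangeError(`${layer.name}: maxAttempts must be an integer >= 1`);
    }
    return product * layer.maxAttempts;
  }, 1);
}

const defaultStack: RetryLayer[] = [
  { name: "sdk", maxAttempts: 3 },
  { name: "wrapper", maxAttempts: 3 },
  { name: "watchdog", maxAttempts: 2 }, // assumes 1 retry; verify for your version
  { name: "scheduler", maxAttempts: 2 },
];

console.log(worstCaseAttempts(defaultStack)); // 36
```

Setting any layer's `maxAttempts` to 1 removes it from the product, which is the arithmetic behind giving retries to a single owner.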

Translate that into money. A task with roughly 12,000 input and 3,000 output tokens at Sonnet 5's introductory pricing ($2 input / $10 output per MTok) costs about $0.054 per attempt ($0.024 for input plus $0.030 for output). Thirty-six attempts come to about $1.94. Run 90 such tasks overnight and the theoretical worst case is about $175 in a single night. The budget you built around cheap unit prices evaporates through amplification.

| Configuration | Worst-case attempts | Worst-case cost per task | 90-task overnight batch |
|---|---|---|---|
| All four layers at defaults | 36 | ≈ $1.94 | ≈ $175 |
| Single-owner (4 attempts, below) | 4 | ≈ $0.22 | ≈ $19 |
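The figures in the table can be reproduced with a few lines of arithmetic. This sketch uses the token counts and prices quoted above. The `Pricing` type and function names are hypothetical, and it assumes every attempt is billed for the full input and output, which is the pessimistic case.

```typescript
// Prices in USD per million tokens (MTok).
interface Pricing {
  inputPerMTok: number;
  outputPerMTok: number;
}

// Cost of a single attempt, assuming full input and output are billed.
function costPerAttempt(inputTokens: number, outputTokens: number, p: Pricing): number {
  return (inputTokens * p.inputPerMTok + outputTokens * p.outputPerMTok) / 1_000_000;
}

// Worst-case cost of a batch where every task burns every allowed attempt.
function worstCaseBatchCost(perAttempt: number, attempts: number, tasks: number): number {
  return perAttempt * attempts * tasks;
}

const pricing: Pricing = { inputPerMTok: 2, outputPerMTok: 10 };
const perAttempt = costPerAttempt(12_000, 3_000, pricing); // about 0.054

for (const attempts of [36, 4]) {
  const perTask = perAttempt * attempts;
  const batch = worstCaseBatchCost(perAttempt, attempts, 90);
  console.log(
    `${attempts} attempts: $${perTask.toFixed(2)} per task, $${batch.toFixed(0)} per 90-task batch`,
  );
}
// 36 attempts: $1.94 per task, $175 per 90-task batch
// 4 attempts: $0.22 per task, $19 per 90-task batch
```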

The interaction with 429s is even more serious. A 429 is a response to overload; if four layers each dutifully retry it, you become an amplifier applying 36× pressure on an already congested night. Honoring server guidance — the approach I described in the Retry-After backoff strategy notes — only works once exactly one layer is doing the retrying.
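To make the single-retrying-layer idea concrete, here is a minimal sketch of one owner that honors server guidance. It assumes the inner layers have been silenced (for example, the SDK client constructed with `maxRetries: 0`) and that errors are mapped onto a hypothetical `AttemptFailure` shape. None of these names come from the SDK, and treating a missing status as a retryable connection error is an assumption. The default of 4 attempts matches the single-owner row in the table above.

```typescript
// Hypothetical failure shape; a real integration would map SDK errors onto it.
interface AttemptFailure {
  status?: number;            // HTTP status; undefined is assumed to mean a connection error
  retryAfterSeconds?: number; // Retry-After already parsed to seconds, if the server sent one
}

// Retry only connection errors, 429s, and 5xx responses.
function isRetryable(failure: AttemptFailure): boolean {
  const s = failure.status;
  return s === undefined || s === 429 || (s >= 500 && s < 600);
}

// Delay before the next attempt: server guidance wins; otherwise use capped
// exponential backoff (the cap applies to the local backoff only).
function nextDelayMs(
  attempt: number,
  failure: AttemptFailure,
  baseMs = 1_000,
  capMs = 60_000,
): number {
  if (failure.retryAfterSeconds !== undefined && failure.retryAfterSeconds >= 0) {
    return failure.retryAfterSeconds * 1_000;
  }
  return Math.min(capMs, baseMs * 2 ** (attempt - 1));
}

// The single owner: the only place in the stack that is allowed to retry.
async function runAsSingleOwner<T>(
  task: () => Promise<T>,
  toFailure: (err: unknown) => AttemptFailure,
  maxAttempts = 4,
  sleep: (ms: number) => Promise<void> = (ms) =>
    new Promise<void>((resolve) => setTimeout(resolve, ms)),
): Promise<T> {
  for (let attempt = 1; ; attempt++) {
    try {
      return await task();
    } catch (err) {
      const failure = toFailure(err);
      // Give up on the last attempt or on errors that retrying cannot fix.
      if (attempt >= maxAttempts || !isRetryable(failure)) throw err;
      await sleep(nextDelayMs(attempt, failure));
    }
  }
}
```

The `sleep` parameter is injectable so the loop can be exercised in tests without real waiting. Whether the watchdog and scheduler can be silenced in the same way depends on your version and configuration, so that part stays an open assumption.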


