API & SDK / 2026-06-20 / Advanced

Running Subagents in Parallel Without One Failure Sinking the Whole Run

A fan-out / fan-in design for running several subagents in parallel, covering token budgeting, a result contract, and partial-failure handling. Includes an implementation where one branch can fail without stopping the rest, plus measured numbers.


For a long time, as an indie developer running four sites solo at Dolice Labs, I collected nightly article candidates one site at a time. Each site took roughly 40 seconds, so four in a row came to about two and a half minutes. It worked, but on a night when the network dropped midway through the third site, everything after it went down with it. Every morning that I opened the log and found it stalled on site three, I was reminded how naive the design was.

There was never any real reason to run them in order. Candidate collection for each site is independent. So fire them in parallel, take results as they arrive, and afterward go back only for the ones that failed. That is fan-out / fan-in. Here I will build out the skeleton, plus the three things that are easy to overlook: budgets, pinning the result shape, and partial failure.

What breaks the moment you stop running serially

Parallelizing is not hard in itself. The hard part is the three problems that surface the instant you do.

The first is budget. Serially you could naively count "up to N tokens overall," but in parallel several branches eat tokens at once. Hit the rate limit and every branch starts returning 429 together.
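One way to keep parallel branches inside a shared allowance is to divide the budget up front and cap how many branches run at once. The sketch below is only an illustration under assumptions: perBranchBudget and settleWithLimit are hypothetical helper names, the reserve held back for retries is an arbitrary choice, and the real limits depend on your account and model.

```typescript
// Hypothetical helper: split one overall output-token allowance evenly
// across branches, holding back a fixed reserve for later retries.
function perBranchBudget(totalTokens: number, branches: number, reserveTokens = 0): number {
  if (branches <= 0) throw new RangeError("branches must be positive");
  if (reserveTokens < 0 || reserveTokens > totalTokens) {
    throw new RangeError("reserveTokens must be between 0 and totalTokens");
  }
  return Math.floor((totalTokens - reserveTokens) / branches);
}

// Hypothetical helper: like Promise.allSettled, but never runs more than
// `limit` workers at the same time. Results keep the order of `items`.
async function settleWithLimit<T, R>(
  items: T[],
  limit: number,
  worker: (item: T) => Promise<R>
): Promise<PromiseSettledResult<R>[]> {
  const results: PromiseSettledResult<R>[] = new Array(items.length);
  let next = 0;

  // Each lane pulls the next unclaimed index until none are left.
  async function lane(): Promise<void> {
    while (next < items.length) {
      const i = next++;
      try {
        results[i] = { status: "fulfilled", value: await worker(items[i]) };
      } catch (reason) {
        results[i] = { status: "rejected", reason };
      }
    }
  }

  const laneCount = Math.max(1, Math.min(limit, items.length));
  await Promise.all(Array.from({ length: laneCount }, () => lane()));
  return results;
}
```

With four sites and an assumed overall allowance of 5000 output tokens, perBranchBudget(5000, 4, 1000) gives each branch a ceiling of 1000 and leaves 1000 for retries. A limit of 2 in settleWithLimit trades some wall-clock time for fewer simultaneous requests, which may help when the rate limit is the bottleneck.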

The second is the shape of results. Serially you handled one item at a time, almost by eye; in parallel the return order scatters, and a single piece of malformed JSON quietly collapses the aggregation step.
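To make the malformed-JSON risk concrete, here is a minimal sketch of a parent-side check that turns each raw reply into either a typed list or an explicit error instead of letting bad input reach aggregation. It uses a hand-written type guard to stay dependency-free; a schema library would do the same job. The names Candidate, Parsed, isCandidate, and parseCandidates are assumptions for illustration, and the {title, angle} shape is taken from the worker prompt.

```typescript
type Candidate = { title: string; angle: string };

// A parse either yields validated candidates or a reason it was rejected.
type Parsed =
  | { ok: true; candidates: Candidate[] }
  | { ok: false; error: string };

// Type guard: true only for objects with string `title` and `angle` fields.
function isCandidate(value: unknown): value is Candidate {
  if (typeof value !== "object" || value === null) return false;
  const v = value as Record<string, unknown>;
  return typeof v["title"] === "string" && typeof v["angle"] === "string";
}

// Parse one branch's raw text and check it against the expected shape.
function parseCandidates(raw: string): Parsed {
  let data: unknown;
  try {
    data = JSON.parse(raw);
  } catch {
    return { ok: false, error: "not valid JSON" };
  }
  if (!Array.isArray(data)) {
    return { ok: false, error: "top level is not an array" };
  }
  const bad = data.findIndex((item) => !isCandidate(item));
  if (bad !== -1) {
    return { ok: false, error: `element ${bad} is not {title, angle}` };
  }
  return { ok: true, candidates: data as Candidate[] };
}
```

Because the result is a tagged union, the aggregation step has to check ok before it can touch candidates, so a malformed branch is counted as a failure rather than slipping through as an empty success.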

The third is partial failure. This one matters most. When one of four branches fails, throwing away the work of the other three defeats the point of parallelizing at all.

The fan-out / fan-in skeleton

First, define one worker: a plain function that takes a single site and returns a candidate list.

import Anthropic from "@anthropic-ai/sdk";

// The API key is read from the environment.
const client = new Anthropic({ apiKey: process.env.ANTHROPIC_API_KEY });

// One unit of work: which site to collect for, and its output-token ceiling.
type Site = { id: string; domain: string; maxTokens: number };

// Worker: asks for candidates for a single site and returns the raw text reply.
async function collectCandidates(site: Site): Promise<string> {
  const res = await client.messages.create({
    model: "claude-sonnet-4-6",
    max_tokens: site.maxTokens,
    system: "You are a technical blog editor. Return only a JSON array.",
    messages: [
      { role: "user", content: `Five article candidates for ${site.domain}, as a JSON array. Each element is {title, angle}.` },
    ],
  });
  // Take the first text block; fall back to an empty JSON array if there is none.
  const block = res.content.find((b) => b.type === "text");
  return block && block.type === "text" ? block.text : "[]";
}

The fan-out side launches this worker against every site at once and waits with Promise.allSettled. Choosing allSettled over Promise.all is the key. The latter rejects the whole set if even one branch rejects; the former returns every result while keeping success and failure distinct.

const sites: Site[] = [
  { id: "cl", domain: "claudelab.net", maxTokens: 1024 },
  { id: "gl", domain: "gemilab.net", maxTokens: 1024 },
  { id: "ag", domain: "antigravitylab.net", maxTokens: 1024 },
  { id: "rl", domain: "rorklab.net", maxTokens: 1024 },
];

// Fan out: start every branch at once. allSettled waits for all of them and
// never rejects, so one failed site cannot discard the others' results.
const settled = await Promise.allSettled(
  sites.map((s) => collectCandidates(s))
);

This alone shrinks the wait from the serial sum down to roughly the slowest single branch. But as written, malformed responses still pass as successes. The next two sections tighten that.
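For the fan-in side, a minimal sketch of how the settled results might be split into successes and failures, with only the failed branches run a second time. This is an illustration under assumptions, not the article's full partial-failure handling: partition and runWithOneRetry are hypothetical names, and a single immediate retry is the simplest possible policy. It relies on Promise.allSettled returning results in the same order as its input.

```typescript
type Outcome<T> = { id: string; value: T };
type Failure = { id: string; reason: unknown };
type Split<T> = { ok: Outcome<T>[]; failed: Failure[] };

// Pair each settled result with its branch id and sort it into ok / failed.
function partition<T>(ids: string[], settled: PromiseSettledResult<T>[]): Split<T> {
  const ok: Outcome<T>[] = [];
  const failed: Failure[] = [];
  settled.forEach((result, i) => {
    if (result.status === "fulfilled") {
      ok.push({ id: ids[i], value: result.value });
    } else {
      failed.push({ id: ids[i], reason: result.reason });
    }
  });
  return { ok, failed };
}

// Run every branch in parallel, then rerun only the branches that failed.
async function runWithOneRetry<T>(
  ids: string[],
  worker: (id: string) => Promise<T>
): Promise<Split<T>> {
  const first = partition(ids, await Promise.allSettled(ids.map((id) => worker(id))));
  if (first.failed.length === 0) return first;

  const retryIds = first.failed.map((f) => f.id);
  const second = partition(
    retryIds,
    await Promise.allSettled(retryIds.map((id) => worker(id)))
  );

  // Successes from both passes are kept; what still fails is reported as is.
  return { ok: [...first.ok, ...second.ok], failed: second.failed };
}
```

Branches that succeeded on the first pass are never called again, so a flaky third site costs one extra request rather than a full rerun. Whatever remains in failed after the second pass is left for the caller to log or skip.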

WHAT YOU'LL LEARN
A fan-out / fan-in implementation built on Promise.allSettled, and how to size per-branch token budgets
A result contract that pins down what each child returns with zod, so the parent safely rejects malformed responses
A decision table that routes partial failures to retry, dead-letter, or skip, with measured speedups over serial