CLAUDE LAB
API & SDK / 2026-06-22 / Advanced

Putting a Ceiling on the pause_turn Loop: Running Long Server Tools Safely Unattended

A production design for continuing pause_turn safely in unattended runs, where long server tools like web_search and code execution are involved. Covers branching all four stop_reason values in one loop, capping continuations and wall-clock time, and accumulating usage across paused segments.

Tags: Claude, API, pause_turn, tool-use, production


As an indie developer, I batch-generate articles for several sites overnight, and one morning a few of them were missing the fresh information they were supposed to have searched for. Nothing in the error log. When I opened a saved response, stop_reason was pause_turn — and my generation loop had happily stopped there. web_search hadn't finished in a single round trip; it had returned a "pause," and my loop read that pause as completion.

You don't see pause_turn often, because short prompts never produce it. But the moment a long-running server tool is involved (web_search, web_fetch, or code execution), it can show up. It arrives as a normal, successful response, not an exception, so code that swallows it gives you no signal that anything went wrong. In unattended runs, that's exactly where silent truncation hides.

pause_turn Is a Third State, Neither Error Nor Done

If you treat every stop_reason as "the reason the response ended," pause_turn will trip you up every time. The starting point is to split the values into "finished" and "still going."

stop_reason | State                  | What you must do next
end_turn    | Done normally          | Nothing. The output is final
max_tokens  | Cut off mid-output     | Decide: continue, or record as incomplete
tool_use    | Continue (client side) | Append a tool_result and re-request
pause_turn  | Continue (server side) | Append the response as-is and re-request
refusal     | Safety refusal         | Don't retry; handle it by design
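
As a minimal sketch, the table can be encoded as a lookup so that no value is silently treated as "done." The names TurnState, classify_stop_reason, and needs_continuation are hypothetical and only illustrate the split; values outside the table (stop_sequence, for example) deliberately map to UNKNOWN so the caller has to decide what to do with them.

```python
from enum import Enum


class TurnState(Enum):
    """Hypothetical labels for the rows of the table above."""
    DONE = "done"
    TRUNCATED = "truncated"
    CONTINUE_CLIENT = "continue_client"
    CONTINUE_SERVER = "continue_server"
    REFUSED = "refused"
    UNKNOWN = "unknown"


# One entry per row of the table; nothing else is assumed to be final.
_STATE_BY_STOP_REASON = {
    "end_turn": TurnState.DONE,
    "max_tokens": TurnState.TRUNCATED,
    "tool_use": TurnState.CONTINUE_CLIENT,
    "pause_turn": TurnState.CONTINUE_SERVER,
    "refusal": TurnState.REFUSED,
}


def classify_stop_reason(stop_reason):
    """Map a stop_reason string to a state; unlisted values become UNKNOWN."""
    return _STATE_BY_STOP_REASON.get(stop_reason, TurnState.UNKNOWN)


def needs_continuation(stop_reason):
    """True only for the two 'still going' rows: tool_use and pause_turn."""
    return classify_stop_reason(stop_reason) in (
        TurnState.CONTINUE_CLIENT,
        TurnState.CONTINUE_SERVER,
    )
```

Keeping UNKNOWN as an explicit state means a value the loop has never seen ends up in a log line or an alert instead of being written out as a finished article.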

tool_use and pause_turn look similar, but what you append differs. With tool_use, you run the tool yourself and add a new user message containing the tool_result. With pause_turn, the partial output from the server-side tool is already inside the assistant response, so you append the response blocks as-is as an assistant message, add no extra input, and re-request; the turn then keeps going. The basic branching itself is covered in a separate article on implementation patterns for not dropping stop_reason, so if you aren't even checking max_tokens yet, start there. This piece focuses on what comes after: how to design for pause_turn when you're running long tools unattended.
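
The difference in what gets appended can be sketched as a small pure function. This is an illustration under assumptions, not the article's own implementation: next_messages is a hypothetical helper, messages are plain dicts in the Messages API shape, and tool_results stands for the tool_result blocks you produced by running the tool yourself.

```python
def next_messages(messages, stop_reason, content, tool_results=None):
    """Build the message list for the follow-up request.

    messages:     the list sent in the previous request (not mutated)
    content:      the content blocks of the response just received
    tool_results: tool_result blocks, required only for tool_use
    """
    if stop_reason == "pause_turn":
        # Server-side tool: echo the assistant blocks back unchanged,
        # with no new user input, and the turn resumes.
        return messages + [{"role": "assistant", "content": content}]

    if stop_reason == "tool_use":
        # Client-side tool: echo the assistant blocks, then answer them
        # with a user message carrying the tool_result blocks.
        if not tool_results:
            raise ValueError("tool_use needs at least one tool_result block")
        return messages + [
            {"role": "assistant", "content": content},
            {"role": "user", "content": tool_results},
        ]

    # end_turn, max_tokens, refusal, and anything else are not continued here.
    raise ValueError(f"no continuation defined for stop_reason={stop_reason!r}")
```

Raising on every other value is a deliberate choice in this sketch: the caller is forced to branch before asking for a continuation, so a finished or refused turn can never be re-sent by accident.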

Reproduce It First — Which Tools Produce pause_turn

Before defending against it blindly, it helps to make your own code emit a pause_turn once. Enable a server-side tool and ask something that likely needs several searches.

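import anthropic

# Replace the placeholder with a real key, or omit api_key to read ANTHROPIC_API_KEY.
client = anthropic.Anthropic(api_key="YOUR_API_KEY")

# Enable the server-side web_search tool and ask something that likely needs several searches.
resp = client.messages.create(
    model="claude-opus-4-8",
    max_tokens=4096,
    tools=[{"type": "web_search_20260318", "name": "web_search"}],
    messages=[{
        "role": "user",
        "content": "Summarize three June 2026 Claude Developer Platform updates with sources",
    }],
)

# Inspect how the turn ended and which blocks it has already produced.
print(resp.stop_reason)        # may be pause_turn
print([b.type for b in resp.content])
# e.g. ['text', 'server_tool_use', 'web_search_tool_result', 'text']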

The thing to internalize: even on pause_turn, content already holds partial blocks — text, server_tool_use, web_search_tool_result. It is not an empty response. That's precisely why swallowing it leaves you with a half-finished body presented as final. My own first mistake was exactly this: mistaking the in-progress text for the finished product.
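
One way to keep a paused response from being saved as final is a bounded continuation loop: re-request while stop_reason is pause_turn, but stop at a continuation cap or a wall-clock budget, and sum usage across segments. The following is a sketch under assumptions. run_until_settled, RunResult, the status strings, and the default limits are all hypothetical; send is a caller-supplied function that takes a message list and returns a response exposing stop_reason, content, and usage.input_tokens / usage.output_tokens.

```python
import time
from dataclasses import dataclass


@dataclass
class RunResult:
    response: object    # last response received
    segments: int       # number of requests made, including the first
    input_tokens: int   # summed over all segments
    output_tokens: int  # summed over all segments
    status: str         # "settled", "cap_reached", or "budget_exceeded"


def run_until_settled(send, messages, max_continuations=5,
                      budget_seconds=300.0, clock=time.monotonic):
    """Re-request while the response is pause_turn, within fixed limits."""
    start = clock()
    msgs = list(messages)
    in_tok = out_tok = segments = 0
    while True:
        resp = send(msgs)
        segments += 1
        # Every segment reports its own usage, so accumulate all of them.
        in_tok += resp.usage.input_tokens
        out_tok += resp.usage.output_tokens

        if resp.stop_reason != "pause_turn":
            # Settled is not the same as done: the caller still branches
            # on end_turn, max_tokens, tool_use, and refusal.
            return RunResult(resp, segments, in_tok, out_tok, "settled")
        if segments - 1 >= max_continuations:
            return RunResult(resp, segments, in_tok, out_tok, "cap_reached")
        if clock() - start >= budget_seconds:
            return RunResult(resp, segments, in_tok, out_tok, "budget_exceeded")

        # pause_turn: append the partial assistant blocks as-is and go again.
        msgs = msgs + [{"role": "assistant", "content": resp.content}]
```

With the SDK call shown earlier, send could be a lambda that wraps client.messages.create with the same model, max_tokens, and tools, passing the list through as messages. Note that the budget is only checked between segments, so a per-request timeout is still needed separately. A result whose status is cap_reached or budget_exceeded should be recorded as incomplete, never written out as the finished article.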


WHAT YOU'LL LEARN
Branch pause_turn, tool_use, end_turn, and max_tokens in a single continuation loop so server tools never get silently truncated
Add a continuation cap, a wall-clock budget, and cross-segment usage accumulation so a runaway turn can't quietly rack up cost in an overnight batch
Take away paste-ready code for streaming event ordering and an observability log that stops you from misreading pause_turn as end_turn

Related Articles

API & SDK / 2026-04-26
Reading Claude API stop_reason Correctly: A Production Guide to end_turn, max_tokens, pause_turn, and refusal
Branching on the Claude API's stop_reason properly eliminates a surprising number of production incidents: truncated outputs, missed tool continuations, wasted retries. Here is how to tell end_turn, max_tokens, pause_turn, and refusal apart.

API & SDK / 2026-04-23
Production Prompt-Injection Defense for the Claude API: Detection, Sanitization, and Layered Guardrails
A practical, code-first design guide for defending Claude API applications against prompt injection, covering input sanitization, channel separation, output validation, and red-teaming for long-term safety.

API & SDK / 2026-06-21
Connecting Managed Agents to Services You Don't Want to Expose: MCP Tunnel Design
How to connect Claude Managed Agents to an internal MCP server that is never exposed to the public internet. We cover the MCP tunnel, self-hosted sandboxes, authorization boundaries, and graceful degradation when things break.