CLAUDE LABJP
BILLING — The Jun 15 change is now live: Agent SDK, headless runs, GitHub Actions, and third-party agents leave subscription limits for separate monthly credits ($20/$100/$200) metered at full API rates, no rolloverRETIRED — As of today, Sonnet 4 and Opus 4 are retired from the API; scripts referencing older models should switch to the latest generation such as Opus 4.8EXPORT — Claude Fable 5 and Mythos 5 are suspended for all foreign nationals under a US export-control directive (Jun 12); Anthropic calls it a misunderstanding and is working to restore accessSAFE — Only the two new Mythos-class models are affected; every other model including Opus 4.8 keeps running normallySUBAGENTS — Claude Code sub-agents can now spawn their own sub-agents (up to 5 levels), and Dynamic workflows arrived in research previewINCIDENT — A Jun 5 outage raised error rates across claude.ai, the API, Claude Code, and Cowork, a reminder to design retries and fallbacks into automated runsBILLING — The Jun 15 change is now live: Agent SDK, headless runs, GitHub Actions, and third-party agents leave subscription limits for separate monthly credits ($20/$100/$200) metered at full API rates, no rolloverRETIRED — As of today, Sonnet 4 and Opus 4 are retired from the API; scripts referencing older models should switch to the latest generation such as Opus 4.8EXPORT — Claude Fable 5 and Mythos 5 are suspended for all foreign nationals under a US export-control directive (Jun 12); Anthropic calls it a misunderstanding and is working to restore accessSAFE — Only the two new Mythos-class models are affected; every other model including Opus 4.8 keeps running normallySUBAGENTS — Claude Code sub-agents can now spawn their own sub-agents (up to 5 levels), and Dynamic workflows arrived in research previewINCIDENT — A Jun 5 outage raised error rates across claude.ai, the API, Claude Code, and Cowork, a reminder to design retries and fallbacks into automated runs
Articles/Claude.ai
Claude.ai/2026-04-03Intermediate

Claude Sonnet 4.6 — 1M Tokens, Computer Use & Extended Thinking in Production

Claude Sonnet 4.6 production guide: 1M tokens, Computer Use 72.5, Extended Thinking, Opus vs Sonnet cost comparison, and Prompt Caching optimization with code.

claude-sonnet-46claude-ai15extended-thinking6computer-use4production92cost-optimization19intermediate2

Premium Article

Setup and context — Why Developers Prefer Sonnet 4.6 Over Opus 4.5

On February 17, 2026, Anthropic launched Claude Sonnet 4.6, and the reception exceeded expectations. Developers who gained early access consistently reported preferring Sonnet 4.6 over Opus 4.5 — Anthropic's previous flagship model — for the majority of real-world tasks. This wasn't a surprise to the team; Sonnet 4.6 was engineered with a specific focus on the tasks that matter most in practice: coding, computer use, long-context reasoning, agentic planning, and knowledge work.

The signal was clear when Anthropic made Sonnet 4.6 the default model across claude.ai and Claude Cowork. This move effectively said: "For most of what you need to accomplish, Sonnet 4.6 is the right tool."

But what makes Sonnet 4.6 genuinely different, and how do you unlock its full potential in production systems? This guide answers those questions with technical depth, working code, and practical decision frameworks you can apply immediately.


Key Specifications and Performance Benchmarks

Context Window

Claude Sonnet 4.6 supports a 1,000,000-token (1M token) context window. To put this in perspective, that's approximately 750,000 words in English — equivalent to around 2,500 pages of text. This isn't just a headline number; it fundamentally changes how you can architect AI applications.

One important note: the 200K context window beta for Claude Sonnet 4.5 and Claude Sonnet 4 is being retired on April 30, 2026. Requests exceeding the standard window after that date will return errors. Now is the time to migrate to Sonnet 4.6's native 1M support.

Computer Use Performance

Sonnet 4.6 scored 72.5 on the OSWorld-Verified benchmark for computer use. For context, Sonnet 3.7 scored 28.0 on a comparable benchmark roughly a year earlier. That's a 2.5x improvement in one year — and it represents the difference between a curiosity and a genuinely useful automation tool.

A 72.5% success rate means that in roughly three out of four attempts, Sonnet 4.6 will correctly complete a computer interaction task. That level of reliability opens the door to real-world workflow automation at scale.

Extended Thinking

Sonnet 4.6 supports Extended Thinking, allowing the model to work through complex problems systematically before delivering its response. This dramatically improves accuracy on tasks involving multi-step reasoning, mathematical derivations, system design, and nuanced judgment calls.

Pricing and Rate Limits

Sonnet 4.6 maintains the same pricing as Sonnet 4.5:

  • Input tokens: $3 per 1M tokens
  • Output tokens: $15 per 1M tokens
  • Prompt Caching (read): $0.30 per 1M tokens (90% discount)
  • Prompt Caching (write): $3.75 per 1M tokens

Additionally, the Messages Batches API max_tokens cap has been raised to 300,000 for Sonnet 4.6, enabling longer outputs for long-form content, large code generation tasks, and structured data extraction at scale.


Thank you for reading this far.

Continue Reading

What follows includes implementation code, benchmarks, and practical content we hope you'll find useful. This site runs without ads — server and development costs are supported entirely by members like you. If it's been helpful, we'd be truly grateful for your support.

WHAT YOU'LL LEARN
Master a quantitative model selection framework to decide when Sonnet 4.6 beats Opus 4.6 — and save up to 80% on API costs
Get working code for 1M token context, Extended Thinking, Computer Use, streaming, and Prompt Caching in one place
Learn production-grade cost optimization combining Prompt Caching and Batch API for up to 90% cost reduction
Secure payment via Stripe · Cancel anytime

Unlock This Article

Get full access to the rest of this article. Buy once, read anytime. This site is ad-free — your support goes directly toward keeping it running.

or
Unlock all articles with Membership →
Share

Thank You for Reading

Claude Lab is ad-free, supported entirely by members like you. We publish practical guides daily with implementation code, benchmarks, and production-ready patterns. If you've found it useful, we'd love to have you on board.

  • Copy-paste ready implementation code
  • New advanced guides published daily
  • $5/mo or $10 for lifetime access
View Membership →

Related Articles

Claude.ai2026-04-29
Make Claude Your Production Debugging Companion: A Practical Design for Log Triage, Hypothesis Generation, and Repro Scripts
A field-tested blueprint for solo developers who carry their own pager. We split production debugging into three jobs Claude can actually own — log summarization, hypothesis generation, and minimal repro — with full prompts, sanitization code, and traps that cost me real downtime.
API & SDK2026-05-05
The Real Cost of Claude API Extended Thinking in Production — ROI Data by Task Type
Three months of measured cost, quality, and speed data for Extended Thinking across five task categories. Learn exactly when extended thinking is worth it—and when it's not.
Claude.ai2026-06-15
When a Long-Running Agent's Context Quietly Decays — Budgeting and Compaction
An agent that runs all night gets sloppier by morning. The cause is dilution from accumulated context. Here is how to treat context as a budget, measure its decay, and keep it healthy with compaction — with working code and field notes.
📚RECOMMENDED BOOKS
Build a Large Language Model (From Scratch)
Sebastian Raschka
LLM Dev
Prompt Engineering for LLMs
Berryman & Ziegler
Prompting
AI Engineering
Chip Huyen
AI Eng
* Contains affiliate links
See all →