CLAUDE LAB
API & SDK / 2026-07-03 / Advanced

A 40% Lower Price Doesn't Mean a 40% Lower Bill — Measuring the Opus 4.8 to Sonnet 5 Migration by Cost per Completed Task

Sonnet 5 looks 40% cheaper than Opus 4.8 at standard pricing (60% during the intro window), yet extra tool turns can flip the math. Working TypeScript for consumption vectors, a paired-run harness, and break-even turn counts.

On July 2, Claude Sonnet 5 became the default model across plans, with introductory pricing of $2 per million input tokens and $10 per million output tokens, ahead of a standard rate of $3/$15. Next to Opus 4.8 at $5/$25, that is 40% cheaper at standard rates and 60% cheaper during the intro window. That same evening I switched the overnight batches for the blogs I run, and the next morning I opened the cost ledger expecting a satisfying drop.

The drop was about 18%. On a model that costs 60% less per token.

Cross-referencing the usage logs told the story: on my tool-loop tasks, the median turn count had risen from 5 to 7, and those two extra turns inflated input tokens far more than intuition suggests. If you judge a migration by the price table alone, this effect stays invisible until the invoice arrives.

This piece builds a different yardstick — cost per completed task — as one continuous design: recording consumption vectors, running old and new models side by side, and solving for the break-even turn count. It is a small mechanism, the kind an indie developer can bolt on in an afternoon, but it changes the quality of the migration decision noticeably.
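One way to make the yardstick concrete: the sketch below assumes each logged task run records a model label, whether the task completed, its turn count, and its total spend. The `TaskRun` shape, the model labels, and the sample numbers are all hypothetical, and dividing total spend (failed runs included) by the number of completed tasks is one reasonable reading of "cost per completed task", not necessarily the exact definition used later in this piece.

```typescript
// Hypothetical log record: one entry per task run.
interface TaskRun {
  model: string;      // a label from your own logs, not an official model ID
  completed: boolean; // did the task reach a usable result?
  turns: number;      // tool-loop round trips
  costUsd: number;    // total spend for this run, billed whether or not it completed
}

function median(values: number[]): number {
  if (values.length === 0) return NaN;
  const sorted = values.slice().sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 === 1 ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2;
}

function summarize(runs: TaskRun[], model: string) {
  const own = runs.filter((r) => r.model === model);
  const completed = own.filter((r) => r.completed);
  // Failed runs still cost money, so they stay in the numerator.
  const totalCostUsd = own.reduce((sum, r) => sum + r.costUsd, 0);
  return {
    completedTasks: completed.length,
    medianTurns: median(completed.map((r) => r.turns)),
    costPerCompletedTask:
      completed.length > 0 ? totalCostUsd / completed.length : Infinity,
  };
}

// Made-up sample data, only to show the shape of the comparison.
const sampleRuns: TaskRun[] = [
  { model: "opus-4.8", completed: true, turns: 5, costUsd: 0.1 },
  { model: "opus-4.8", completed: true, turns: 5, costUsd: 0.12 },
  { model: "opus-4.8", completed: false, turns: 3, costUsd: 0.05 },
  { model: "sonnet-5", completed: true, turns: 7, costUsd: 0.08 },
  { model: "sonnet-5", completed: true, turns: 6, costUsd: 0.07 },
  { model: "sonnet-5", completed: true, turns: 8, costUsd: 0.09 },
];

console.log(summarize(sampleRuns, "opus-4.8"));
console.log(summarize(sampleRuns, "sonnet-5"));
```

Reporting the median turn count next to the cost figure keeps the cause visible: a cost change that comes with a turn-count change points at consumption, not price.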

Per-task cost is a dot product, not a price

What you pay per task is the dot product of a price vector and a consumption vector.

| Component | Price side | Consumption side |
| --- | --- | --- |
| Input | $/MTok (input) | Total input tokens sent until the task completed |
| Output | $/MTok (output) | Total generated tokens |
| Cache reads | Much cheaper read rate | Input tokens served from cache |
| Retries | | Every component spent on failed attempts |

Swapping models swaps the price vector instantly — but it changes the consumption vector too. Sonnet 5 is positioned as the most agentic Sonnet yet, with stronger planning and tool use, and in practice it does not call tools the same number of times or produce the same output length as Opus 4.8 on identical tasks. Some task families consume less, some consume more. Which means the sign of your savings cannot, even in principle, be read off the price table.
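The dot product can be written out in a few lines. This is a minimal sketch: the `PriceVector` and `ConsumptionVector` names are made up for illustration, and the cache-read rate is a placeholder assumption, since only the input and output prices are stated here. Retries are handled by summing the consumption of every attempt before pricing it.

```typescript
// Prices in USD per million tokens (MTok).
interface PriceVector {
  input: number;
  output: number;
  cacheRead: number; // placeholder; take the real rate from your price sheet
}

// Tokens consumed by a task.
interface ConsumptionVector {
  inputTokens: number;     // uncached input tokens
  outputTokens: number;    // generated tokens
  cacheReadTokens: number; // input tokens served from cache
}

// Retries: add up every attempt, failed ones included, before pricing.
function sumAttempts(attempts: ConsumptionVector[]): ConsumptionVector {
  return attempts.reduce(
    (acc, a) => ({
      inputTokens: acc.inputTokens + a.inputTokens,
      outputTokens: acc.outputTokens + a.outputTokens,
      cacheReadTokens: acc.cacheReadTokens + a.cacheReadTokens,
    }),
    { inputTokens: 0, outputTokens: 0, cacheReadTokens: 0 },
  );
}

// Per-task cost = price vector · consumption vector.
function taskCostUsd(price: PriceVector, used: ConsumptionVector): number {
  return (
    (price.input * used.inputTokens +
      price.output * used.outputTokens +
      price.cacheRead * used.cacheReadTokens) /
    1e6
  );
}

// Opus 4.8 at $5/$25; the 0.5 cache-read rate is a hypothetical stand-in.
const opusPrice: PriceVector = { input: 5, output: 25, cacheRead: 0.5 };

// One failed attempt followed by one successful attempt (made-up numbers).
const attempts: ConsumptionVector[] = [
  { inputTokens: 5000, outputTokens: 400, cacheReadTokens: 0 },
  { inputTokens: 19200, outputTokens: 1600, cacheReadTokens: 0 },
];

console.log(taskCostUsd(opusPrice, sumAttempts(attempts)).toFixed(3));
```

Swapping models means replacing `opusPrice` and, separately, re-measuring the attempts. The function makes it hard to change one without noticing the other.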

Turn count inflates input tokens quadratically

Each turn of a tool loop resends the whole conversation as input. With S for the system prompt plus initial context and d for the history added per round trip (tool_result plus the previous assistant output), the total input for an n-turn task is approximately:

total input ≈ n×S + d×(0 + 1 + ... + (n-1)) = n×S + d×n(n-1)/2

The second term grows with the square of n. Here are real dollars for a shape close to my link-checking agent — S = 3,000, d = 1,200 (an 800-token tool_result plus 400 tokens of prior output), 400 output tokens per turn:

| Model and price | 4 turns | 6 turns | vs. Opus 4.8 at 4 turns (4 turns / 6 turns) |
| --- | --- | --- | --- |
| Opus 4.8 ($5/$25) | $0.136 | $0.240 | baseline / +76% |
| Sonnet 5 intro ($2/$10) | $0.054 | $0.096 | -60% / -29% |
| Sonnet 5 standard ($3/$15) | $0.082 | $0.144 | -40% / +6% |
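The table can be reproduced from the formula in a few lines. This is a sketch under the stated model only (no caching, no retries, identical turns); `TaskShape`, `totalInputTokens`, and `loopCostUsd` are illustrative names.

```typescript
interface Price {
  input: number;  // USD per MTok
  output: number; // USD per MTok
}

interface TaskShape {
  systemTokens: number;   // S: system prompt plus initial context
  historyPerTurn: number; // d: history added per round trip
  outputPerTurn: number;  // generated tokens per turn
}

// total input ≈ n×S + d×n(n-1)/2
function totalInputTokens(shape: TaskShape, turns: number): number {
  return turns * shape.systemTokens + (shape.historyPerTurn * turns * (turns - 1)) / 2;
}

function loopCostUsd(price: Price, shape: TaskShape, turns: number): number {
  const inputTokens = totalInputTokens(shape, turns);
  const outputTokens = turns * shape.outputPerTurn;
  return (price.input * inputTokens + price.output * outputTokens) / 1e6;
}

// Shape from the text: S = 3,000, d = 1,200, 400 output tokens per turn.
const linkChecker: TaskShape = { systemTokens: 3000, historyPerTurn: 1200, outputPerTurn: 400 };

const priceSheet: Array<[string, Price]> = [
  ["Opus 4.8 ($5/$25)", { input: 5, output: 25 }],
  ["Sonnet 5 intro ($2/$10)", { input: 2, output: 10 }],
  ["Sonnet 5 standard ($3/$15)", { input: 3, output: 15 }],
];

// Baseline: Opus 4.8 at 4 turns.
const baselineUsd = loopCostUsd(priceSheet[0][1], linkChecker, 4);

for (const [name, price] of priceSheet) {
  const cells = [4, 6].map((turns) => {
    const cost = loopCostUsd(price, linkChecker, turns);
    const delta = Math.round((cost / baselineUsd - 1) * 100);
    return `${turns} turns $${cost.toFixed(3)} (${delta >= 0 ? "+" : ""}${delta}%)`;
  });
  console.log(`${name}: ${cells.join(", ")}`);
}
```

Plugging in your own `S`, `d`, and output length shows quickly whether a task family sits in the region where two extra turns erase the discount.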

At the same 4 turns, the discount tracks the price sheet exactly: 60% and 40%. Add two turns after the migration, though, and the intro-price saving shrinks to 29% — and at standard pricing, effective September 1, the task costs 6% more than it did on Opus 4.8. "We moved to the 40% cheaper model and the bill went up" is ordinary arithmetic for this task shape. Prompt caching softens the quadratic slope, but caches are scoped per model, so you cannot count on hits right after a switch — the dynamics I covered in the prompt-cache rewarm design for the Opus 4.8 to Sonnet 5 cutover.
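Solving the same cost model for the turn count gives the break-even point directly. The sketch below is one way to do it, assuming the uncached model above and treating the old model's per-task cost as a budget; `breakEvenTurns` and `maxTurnsWithinBudget` are illustrative names, not functions from any library.

```typescript
interface Price {
  input: number;  // USD per MTok
  output: number; // USD per MTok
}

interface TaskShape {
  systemTokens: number;   // S
  historyPerTurn: number; // d
  outputPerTurn: number;  // generated tokens per turn
}

function loopCostUsd(price: Price, shape: TaskShape, turns: number): number {
  const inputTokens =
    turns * shape.systemTokens + (shape.historyPerTurn * turns * (turns - 1)) / 2;
  return (price.input * inputTokens + price.output * turns * shape.outputPerTurn) / 1e6;
}

// Cost × 1e6 is a·n² + b·n, so the fractional break-even is the positive root
// of a·n² + b·n − budget×1e6 = 0.
function breakEvenTurns(price: Price, shape: TaskShape, budgetUsd: number): number {
  const a = (price.input * shape.historyPerTurn) / 2;
  const b =
    price.input * (shape.systemTokens - shape.historyPerTurn / 2) +
    price.output * shape.outputPerTurn;
  const c = -budgetUsd * 1e6;
  if (a === 0) return -c / b; // no history growth: cost is linear in n
  return (-b + Math.sqrt(b * b - 4 * a * c)) / (2 * a);
}

// Largest whole turn count whose cost stays at or below the budget.
function maxTurnsWithinBudget(
  price: Price,
  shape: TaskShape,
  budgetUsd: number,
  cap = 1000,
): number {
  let turns = 0;
  while (turns < cap && loopCostUsd(price, shape, turns + 1) <= budgetUsd) turns++;
  return turns;
}

const shape: TaskShape = { systemTokens: 3000, historyPerTurn: 1200, outputPerTurn: 400 };
const opus: Price = { input: 5, output: 25 };
const sonnetIntro: Price = { input: 2, output: 10 };
const sonnetStandard: Price = { input: 3, output: 15 };

// Budget: what the task cost on Opus 4.8 at 4 turns.
const budgetUsd = loopCostUsd(opus, shape, 4);

console.log("intro:", breakEvenTurns(sonnetIntro, shape, budgetUsd).toFixed(2),
  "max whole turns", maxTurnsWithinBudget(sonnetIntro, shape, budgetUsd));
console.log("standard:", breakEvenTurns(sonnetStandard, shape, budgetUsd).toFixed(2),
  "max whole turns", maxTurnsWithinBudget(sonnetStandard, shape, budgetUsd));
```

For this task shape, Sonnet 5 stays at or below the Opus 4.8 four-turn cost up to 7 turns at intro pricing and up to 5 turns at standard pricing. A task family whose median drifts past those counts after migration is a candidate for staying on the old model or for prompt work before switching.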

WHAT YOU'LL LEARN
You'll be able to trace 'the price table says cheaper, but the bill barely moved' back to input tokens growing with the square of turn count
You can drop in a paired-run harness that runs the same task on both models and captures per-task effective cost and consumption profiles
You'll learn how to solve for the break-even turn count from your own prices and task shape, and make migration calls per task family
