We Collected Plenty of Claude Code Usage Data — Then Budgeting Still Didn't Work. Field Notes on Cost Attribution and Idle Seats
The Claude Code Analytics API gives you data, but data alone doesn't run a budget. Here are field notes on turning raw usage logs into cost attribution, idle-seat detection, and alerts that don't cry wolf — with working code and the thresholds we actually use.
The data was there, but the meeting couldn't use it
When we rolled out the Claude Code Analytics API, the first thing I built was a dashboard of daily token consumption and cost. The numbers came out cleanly. But every monthly budget meeting stalled on the same question: "This figure — which work did it pay for?"
You can see per-seat consumption. The trouble is that seats and work aren't one-to-one. One person spans several repositories, spends one week firefighting an incident and the next writing a feature in a sprint. Per-seat numbers answer "who used it"; what budgeting actually wants to know is "what it was used on." Running several apps and blogs in parallel as an indie developer myself, I'd already learned that cost has to be tied to work, not to people, before it becomes something you can act on.
This article walks through the three stages it took to turn the API's raw data into something usable in a budget meeting — attributing cost to workstreams, detecting idle seats, and building alerts that don't go stale — with the code we actually run. The focus isn't setup; it's the part where you get stuck after the data is already flowing.
First, pin down the granularity of the raw data
The Analytics API returns metrics per day. The first job is to understand exactly what granularity comes back. Get vague here and every downstream cost attribution drifts.
```python
import os

import httpx

BASE = "https://api.anthropic.com/v1/organizations/usage_report/claude_code"


def fetch_daily(starting_at: str, ending_at: str) -> list[dict]:
    """Fetch daily Claude Code usage records.

    Note: one record = one day x one user x one model.
    """
    headers = {
        "x-api-key": os.environ["ANTHROPIC_ADMIN_API_KEY"],
        "anthropic-version": "2023-06-01",
    }
    records, page = [], None
    with httpx.Client(timeout=30) as client:
        while True:
            params = {"starting_at": starting_at, "ending_at": ending_at, "limit": 1000}
            if page:
                params["page"] = page
            r = client.get(BASE, headers=headers, params=params)
            r.raise_for_status()
            body = r.json()
            records.extend(body["data"])
            page = body.get("next_page")
            if not page:
                break
    return records
```
Two things matter here. First, each record is a "day x user x model" tuple. If the same person uses both Sonnet and Opus on the same day, their single day splits into two records. Aggregate over the wrong key and a mixed-model seat double-counts or quietly drops cost.
Second, always page all the way through. A larger limit won't return everything; you have to loop until next_page is null. On a large team near month-end, skipping this silently loses a few days of data. I missed this early on and once argued a budget off the first-of-month numbers alone — and was wrong.
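To make the keying concern above concrete, here is a minimal sketch that collapses per-model records into one row per day and user, so a mixed-model day is counted once with its tokens summed. The field names (`date`, `actor_email`, `input_tokens`, `output_tokens`) and the helper name `tokens_per_user_day` are assumptions that follow the flattened record shape used in this article; check them against the records you actually receive.

```python
from collections import defaultdict


def tokens_per_user_day(records: list[dict]) -> dict[tuple[str, str], dict[str, int]]:
    """Collapse day x user x model records into one total per (day, user).

    Assumed (hypothetical) field names: date, actor_email,
    input_tokens, output_tokens.
    """
    totals = defaultdict(
        lambda: {"input_tokens": 0, "output_tokens": 0, "records_merged": 0}
    )
    for rec in records:
        key = (rec.get("date", "unknown"), rec.get("actor_email", "unknown"))
        totals[key]["input_tokens"] += rec.get("input_tokens", 0)
        totals[key]["output_tokens"] += rec.get("output_tokens", 0)
        totals[key]["records_merged"] += 1  # how many per-model rows were folded in
    return dict(totals)


# Hypothetical sample: one person using two models on the same day.
sample = [
    {"date": "2025-01-06", "actor_email": "alice@example.com",
     "model": "claude-sonnet-4-6", "input_tokens": 1000, "output_tokens": 200},
    {"date": "2025-01-06", "actor_email": "alice@example.com",
     "model": "claude-opus-4-8", "input_tokens": 500, "output_tokens": 100},
]
per_day = tokens_per_user_day(sample)
```

This view is useful for activity counts. Cost is better computed per record before any collapsing, because the price differs by model.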
Stage 1: attribute cost to workstreams, not seats
What worked in the budget meeting was re-pointing cost at "repository groups = workstreams" rather than at people. The Analytics API alone doesn't expose fine-grained repository data, so we join it against an internal ledger of who primarily owns which project. The ledger doesn't need to be perfect. If 80% of the primary owners are right, the discussion moves.
```python
from collections import defaultdict

# Ledger: user -> workstream (their primary project)
SEAT_TO_LINE = {
    "alice@example.com": "Payments platform",
    "bob@example.com": "Mobile app",
    "carol@example.com": "Mobile app",
    "dave@example.com": "Internal tools",
}

# Approximate unit price per model (USD per 1M tokens; update from your billing actuals)
PRICE_PER_MTOK = {
    "claude-opus-4-8": {"input": 15.0, "output": 75.0},
    "claude-sonnet-4-6": {"input": 3.0, "output": 15.0},
}


def cost_of(record: dict) -> float:
    model = record.get("model", "")
    price = PRICE_PER_MTOK.get(model)
    if not price:
        return 0.0
    inp = record.get("input_tokens", 0) / 1_000_000
    out = record.get("output_tokens", 0) / 1_000_000
    return inp * price["input"] + out * price["output"]


def cost_by_line(records: list[dict]) -> dict[str, float]:
    totals = defaultdict(float)
    for rec in records:
        line = SEAT_TO_LINE.get(rec.get("actor_email"), "Unassigned")
        totals[line] += cost_of(rec)
    return dict(sorted(totals.items(), key=lambda kv: kv[1], reverse=True))
```
Once this aggregation was in place, the meeting's question shifted from "who's expensive" to "which workstream are we investing in." The latter isn't an accusation; it's a decision. Being able to say "the payments platform carried 40% of last month — is that intentional?" turns the cost conversation forward-looking.
When "Unassigned" starts growing, that's your signal to refresh the ledger. I use a loose rule: if unassigned exceeds 15% of the total, revisit it. Having a way to detect drift is easier to sustain than chasing a clean 0% unassigned.
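The 15% rule can be checked mechanically. The sketch below is one way to do it, assuming totals shaped like the output of cost_by_line; the helper name `ledger_health` and the sample figures are hypothetical, not part of the original tooling.

```python
def ledger_health(totals: dict[str, float], limit: float = 0.15) -> dict:
    """Return each workstream's share of total cost and whether the
    "Unassigned" bucket has grown past `limit` (15% by default)."""
    grand_total = sum(totals.values())
    if grand_total <= 0:
        # Nothing spent yet: no shares to report and nothing to refresh.
        return {"shares": {}, "unassigned_share": 0.0, "refresh_ledger": False}
    shares = {line: cost / grand_total for line, cost in totals.items()}
    unassigned = shares.get("Unassigned", 0.0)
    return {
        "shares": shares,
        "unassigned_share": unassigned,
        "refresh_ledger": unassigned > limit,
    }


# Hypothetical monthly totals in USD, shaped like cost_by_line's output.
example = {
    "Payments platform": 400.0,
    "Mobile app": 350.0,
    "Internal tools": 50.0,
    "Unassigned": 200.0,
}
report = ledger_health(example)
```

In this made-up example the payments platform carries 40% of spend and "Unassigned" sits at 20%, so the ledger would be due for a refresh. The same shares feed the "is that intentional?" question in the meeting.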
Stage 2: surface the seats you pay for but nobody uses
The biggest cost win wasn't trimming expensive seats — it was returning seats nobody used. Because Claude Code involves per-seat billing, an inactive seat is just fixed cost. Session counts and lines-changed from the Analytics API let you flag idle seats mechanically.
```python
from collections import defaultdict


def idle_seats(records: list[dict], roster: set[str], days: int = 14) -> list[dict]:
    """Return seats with little or no activity over the last `days`.

    Rule: zero sessions, or (sessions exist but very few lines added).
    `records` must already be limited to that window (for example via
    fetch_daily); `days` documents the window and is not used to filter.
    """
    activity = defaultdict(lambda: {"sessions": 0, "lines_added": 0})
    for rec in records:
        a = activity[rec.get("actor_email")]
        a["sessions"] += rec.get("num_sessions", 0)
        a["lines_added"] += rec.get("lines_added", 0)

    flagged = []
    for email in roster:
        # Seats on the roster with no records at all count as zero activity.
        a = activity.get(email, {"sessions": 0, "lines_added": 0})
        if a["sessions"] == 0:
            flagged.append({"email": email, "reason": "completely idle", **a})
        elif a["lines_added"] < 20:
            flagged.append({"email": email, "reason": "barely active", **a})
    return sorted(flagged, key=lambda x: (x["sessions"], x["lines_added"]))
```
The two-stage rule is deliberate, because session count alone misleads you. A seat that only logs in and does no real work looks "used" by session count. Adding lines-added as a second angle surfaces the "opened it but didn't write" seats.
One caution: don't use the idle list for performance reviews. Keep it strictly as material for reallocating seats. Parental leave, a long design phase, a review-only role — there are plenty of legitimate reasons not to write code. I always add a "one-line reason" column to this list so a human's context can override the machine's verdict. The numbers start a conversation; they don't end one.
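One possible shape for that "one-line reason" column is sketched below. It assumes the flagged list looks like the output of idle_seats; the function name `apply_context`, the `note` and `reallocate_candidate` fields, and the sample data are all hypothetical.

```python
def apply_context(flagged: list[dict], notes: dict[str, str]) -> list[dict]:
    """Attach a human-written reason to each flagged seat.

    Seats with a note stay on the list but are marked as not up for
    reallocation; seats without one remain candidates.
    """
    annotated = []
    for seat in flagged:
        note = notes.get(seat["email"], "")
        annotated.append({**seat, "note": note, "reallocate_candidate": not note})
    return annotated


# Hypothetical inputs: `flagged` mirrors the shape idle_seats returns.
flagged = [
    {"email": "dave@example.com", "reason": "completely idle",
     "sessions": 0, "lines_added": 0},
    {"email": "carol@example.com", "reason": "barely active",
     "sessions": 3, "lines_added": 5},
]
notes = {"carol@example.com": "review-only role this quarter"}
rows = apply_context(flagged, notes)
```

Keeping annotated seats visible rather than dropping them means the note itself gets reviewed the next time the list comes around.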
Stage 3: design alerts that don't go stale from false positives
My first cost alert was "notify when a day's spend crosses a threshold." Within two weeks nobody looked at it, because the natural dip on weekends and rise on weekdays set it off constantly. An alert is only worth keeping if, every time it fires, someone acts on it.
So I combined three conditions and switched to notifying only when several fire at once — not any single one.
| Condition | What it catches | Threshold we use |
| --- | --- | --- |
| vs. same weekday last week | A spike with the weekday cycle removed | +60% or more |
| Deviation from 7-day moving average | A slow upward trend | 1.5x the average or more |
| Absolute floor | Suppresses noise while amounts are small | Ignore days under $50 |
```python
def should_alert(today_cost: float, same_dow_last_week: float, ma7: float) -> bool:
    if today_cost < 50:  # stay quiet while amounts are small
        return False
    spike_vs_lastweek = same_dow_last_week > 0 and today_cost >= same_dow_last_week * 1.6
    spike_vs_trend = ma7 > 0 and today_cost >= ma7 * 1.5
    return spike_vs_lastweek and spike_vs_trend
```
The absolute floor made a real difference. Early in a rollout, or on a small team, a ratio doubles easily but the dollar amount is rounding error. With ratio conditions alone, you'd fire constantly during ramp-up and lose trust.
Joining the two ratio conditions with `and` is also deliberate. Week-over-week alone fires after a holiday; the moving average alone fires from a pre-long-weekend rush. Both fire together only for "a trend-level spike that isn't a weekday effect or a temporary wave." In practice, the fewer alerts you emit, the more each one is worth.
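The three inputs to should_alert have to come from somewhere. The sketch below derives them from a per-day cost series. It rests on assumptions the article does not state: the moving average covers the seven days before the day being judged (so that day does not inflate its own baseline), and missing days count as zero. The helper name `alert_inputs` and the sample series are hypothetical.

```python
from datetime import date, timedelta


def alert_inputs(daily_cost: dict[date, float], today: date) -> tuple[float, float, float]:
    """Return (today_cost, same_dow_last_week, ma7) for `today`.

    Assumption: ma7 is the mean of the 7 days before `today`.
    Days missing from `daily_cost` are treated as 0.
    """
    today_cost = daily_cost.get(today, 0.0)
    same_dow_last_week = daily_cost.get(today - timedelta(days=7), 0.0)
    window = [daily_cost.get(today - timedelta(days=n), 0.0) for n in range(1, 8)]
    ma7 = sum(window) / len(window)
    return today_cost, same_dow_last_week, ma7


# Hypothetical series: a flat $100 per day for a week, then a $200 day.
start = date(2025, 1, 6)
series = {start + timedelta(days=i): 100.0 for i in range(7)}
series[start + timedelta(days=7)] = 200.0
inputs = alert_inputs(series, start + timedelta(days=7))
```

For this series the inputs come out as $200 today against $100 on the same weekday last week and a $100 moving average, which clears the floor and both ratio thresholds in the table. Whether to include the current day in the average is a design choice; excluding it makes a one-day spike stand out more sharply.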
Looking back
The Analytics API returns raw numbers, and on their own they don't solve budgeting. What worked was: (1) attributing cost to workstreams instead of people to keep the discussion forward-looking, (2) surfacing unused seats weekly rather than chasing expensive ones, and (3) narrowing alerts to multiple conditions firing together to kill false positives. None of these are flashy features — they're a thin translation layer between raw data and the context on the ground.
As a first step, I'd suggest putting a single spreadsheet ledger of workstreams together and running just cost_by_line. The moment you can see "how much each kind of work costs," you'll naturally know whether you need the other two. I hope these notes help if you're stuck on cost management for a team rollout too.