When Thinking Is Always On, Prefill Quietly Stops Working — Fixing Streaming and Token Budgets for Fable 5
Fable 5 thinks by default. Prefill no longer applies, the first streamed block isn't text, and max_tokens has to leave room for reasoning. Here is how I fixed those three broken assumptions in my own automated publishing pipeline.
Using Extended Thinking with Claude Code in 2026: A
A practical guide to using Claude 4's Extended Thinking feature with the Claude Code CLI and API. Learn how to set thinking budgets, handle streaming, and use it where it actually helps in production.
When Extended Thinking 'Does Not Work': 7 Causes That Hide Behind the Same Symptom
When you turn on Extended Thinking but the response feels identical to before, the cause is usually one of seven distinct problems. This guide walks through how to diagnose each one at the API, chat UI, SDK, and model layers.
Complete Troubleshooting Guide: Claude Extended Thinking Stops, Times Out, or Loops
Is Extended Thinking stopping mid-process, hitting timeouts, or running up unexpected costs? This guide covers root causes, correct budget_tokens configuration, streaming patterns, retry handling, and cost optimization strategies.