CLAUDE LABJP
MODEL — Claude Opus 4.8 lands, improving coding, agentic, and reasoning over 4.7 at the same priceCODE — Opus 4.8's Fast mode runs at 2.5x speed and is now three times cheaper than earlier modelsCODE — Auto-mode command classification expands, with denial tracking and live bash path autocompleteENTERPRISE — Connector permissions in custom roles let admins control which tools each role can useTEAM — Tag Claude directly in Slack and hand off tasks while you focus elsewhereMCP — MCP servers now show startup auth notices, making connection status easier to trackMODEL — Claude Opus 4.8 lands, improving coding, agentic, and reasoning over 4.7 at the same priceCODE — Opus 4.8's Fast mode runs at 2.5x speed and is now three times cheaper than earlier modelsCODE — Auto-mode command classification expands, with denial tracking and live bash path autocompleteENTERPRISE — Connector permissions in custom roles let admins control which tools each role can useTEAM — Tag Claude directly in Slack and hand off tasks while you focus elsewhereMCP — MCP servers now show startup auth notices, making connection status easier to track
Articles/API & SDK
API & SDK/2026-06-29Advanced

Let Claude Actually See the Images Your Tools Return — Use Image Blocks in tool_result and Cut Tokens by Roughly 10x

Stuffing a base64 string into a tool_result makes the same image cost roughly 10–20x more tokens. Here is how to return it as an image content block instead, with SDK code, a token-cost estimate, and the gotchas I hit in production.

Claude API92tool use4vision7tool_resulttoken optimization2

Premium Article

There is a trap in tool implementations that is surprisingly easy to miss: your tool returns an image, but Claude never actually sees it. I ran into this myself when I wrote a tool that lets an agent judge wallpaper thumbnails. The response came back fine, but the judgments were oddly vague. When I dug in, Claude was reading a long base64 string as text, not looking at the picture.

The annoying part is that a tool_result accepts almost anything, so the wrong shape still runs. It works, but it costs you. This article walks through how to return images so Claude genuinely sees them, with the actual numbers attached.

When your tool returns an image but Claude isn't looking

When you answer a tool_use, most implementations put a string into the content of a tool_result. For tools that return text, that is exactly right. But when you want to return an image, it is tempting to write this:

# Anti-pattern: stuffing the image base64 in as a "string"
tool_result = {
    "type": "tool_result",
    "tool_use_id": tool_use_id,
    "content": f"image data: {base64_png}",  # treated as text
}

The API will not raise an error here. Claude receives the base64 as text, and on the surface processing continues. But Claude never looks at the pixels, so it cannot make any judgment based on the image. Worse, those tens of thousands of base64 characters are billed as input tokens.

Reports in the official SDK repositories and community threads describe exactly this: tool-result images that are not converted into native image blocks and instead get sent as text, consuming around 15,000–25,000 tokens per image. The same image attached directly as a user message costs about 1,600 tokens, so the gap is roughly 10–20x. It is the classic case of paying ten times more for something that appears to work.

The correct shape is an image content block inside tool_result

The content of a tool_result accepts not just a string but an array of content blocks. Put an image block there and Claude recognizes it as an image and reads the pixels with its vision capabilities.

# Correct shape: make content an array and include an image block
tool_result = {
    "type": "tool_result",
    "tool_use_id": tool_use_id,
    "content": [
        {
            "type": "image",
            "source": {
                "type": "base64",
                "media_type": "image/png",
                "data": base64_png,
            },
        },
        {"type": "text", "text": "Here is the current thumbnail candidate. Rate its legibility."},
    ],
}

Two things matter: make content an array, and pass the image as an {"type": "image", ...} block. If you want to add a text note, just place a text block alongside it in the same array. On Claude's side, this image is handled exactly like an image a user attached, which means it is also billed at image rates rather than as tens of thousands of text tokens.

Thank you for reading this far.

Continue Reading

What follows includes implementation code, benchmarks, and practical content we hope you'll find useful. This site runs without ads — server and development costs are supported entirely by members like you. If it's been helpful, we'd be truly grateful for your support.

WHAT YOU'LL LEARN
Why putting base64 into a tool_result as a string makes the image count as text and burns roughly 15,000–25,000 tokens per image, and how to avoid it
The real cost when you return it as an image block (around 1,600 tokens) plus a formula to predict tokens from width and height before you send
A working agent loop that shows wallpaper thumbnails and App Store screenshots to Claude, with the size limits, media types, and the not-rendered-to-users pitfall
Secure payment via Stripe · Cancel anytime

Unlock This Article

Get full access to the rest of this article. Buy once, read anytime. This site is ad-free — your support goes directly toward keeping it running.

or
Unlock all articles with Membership →
Share

Thank You for Reading

Claude Lab is ad-free, supported entirely by members like you. We publish practical guides daily with implementation code, benchmarks, and production-ready patterns. If you've found it useful, we'd love to have you on board.

  • Copy-paste ready implementation code
  • New advanced guides published daily
  • $5/mo or $10 for lifetime access
View Membership →

Related Articles

API & SDK2026-06-16
Taming Token Bloat in Long-Running Agents with Context Editing and the Memory Tool
For long-running agents whose input tokens balloon as tool results pile up, here is how to pair context editing with the memory tool and measure the savings with count_tokens, including a working backend implementation.
API & SDK2026-06-16
Trusting Claude's Structured Output in Production — Validation Gates and Repair Loops
When Claude's structured output breaks 'occasionally' in production, combine tool-use enforcement, a schema validation gate, a single repair loop, and a graceful degradation fallback to eliminate broken JSON from your operations — with working TypeScript code.
API & SDK2026-06-13
Claude Vision API in Production — Implementation Patterns for Image Analysis, PDF Processing, and OCR
Implementation patterns for taking Claude's vision capabilities to production: choosing between Base64, URL, and the Files API, native PDF processing, schema-enforced extraction with Tool Use, batch cost reduction, and error recovery — all with working code.
📚RECOMMENDED BOOKS
Build a Large Language Model (From Scratch)
Sebastian Raschka
LLM Dev
Prompt Engineering for LLMs
Berryman & Ziegler
Prompting
AI Engineering
Chip Huyen
AI Eng
* Contains affiliate links
See all →