API & SDK · 2026-03-23 · Advanced

Claude API Programmatic Tool Calling (PTC) Production Guide — 10x Faster Multi-Tool Workflows

Master Programmatic Tool Calling (PTC) in the Claude API to dramatically reduce latency and token costs in multi-tool workflows. Learn production patterns combining PTC with Tool Search and Input Examples.


What Is Programmatic Tool Calling (PTC)?

When building AI agents with the Claude API, tool call latency and token costs have always been significant challenges. Traditional tool use requires Claude to request one tool at a time, wait for the result, then request the next — creating costly round trips. A five-tool workflow means five separate model inference passes.

Programmatic Tool Calling (PTC) removes most of those round trips. Instead of requesting tools one at a time through the conversation, Claude writes Python code that runs in a sandboxed execution environment and calls multiple tools from that code. Only what the code returns enters the context window, which gives you precise control over what Claude sees.

# Traditional Tool Use: 5 round trips
# Tool 1 → result → Tool 2 → result → Tool 3 → result → ...
 
# PTC: Claude writes code to orchestrate everything.
# `tool` is this article's placeholder for however your tools are exposed to
# the generated code, and extract_key_points is an illustrative helper;
# neither is defined in this snippet.
import asyncio
 
async def research_workflow(query):
    # Claude calls tools directly within generated code
    search_results = await tool.web_search(query)
    relevant_urls = [r["url"] for r in search_results[:5]]
 
    # Parallel page fetching: all five requests are in flight at once
    pages = await asyncio.gather(*[
        tool.fetch_page(url) for url in relevant_urls
    ])
 
    # Filter in code; only the summaries enter Claude's context
    summaries = [extract_key_points(page) for page in pages]
    return summaries

The result: up to 10x latency improvement and dramatically lower token consumption for multi-tool workflows.
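
To watch the effect without an API key, the sketch below swaps the tools for local stubs and times the same workflow run sequentially and with asyncio.gather. Everything in it is an assumption made for illustration: the stub tools, the 50 ms simulated delay, and the extract_key_points helper, which simply keeps the first sentence of each page.

```python
import asyncio
import time

FETCH_DELAY = 0.05  # simulated network latency per page, in seconds (illustrative)

async def web_search(query):
    """Stub search tool: returns five fake results."""
    return [{"url": f"https://example.com/{query}/{i}"} for i in range(5)]

async def fetch_page(url):
    """Stub fetch tool: sleeps to imitate I/O, then returns a large page."""
    await asyncio.sleep(FETCH_DELAY)
    return f"Key point for {url}. " + "filler text " * 500

def extract_key_points(page):
    """Keep only the first sentence; the bulk of the page is discarded."""
    return page.split(". ")[0] + "."

async def research_sequential(query):
    results = await web_search(query)
    pages = [await fetch_page(r["url"]) for r in results]  # one fetch at a time
    return [extract_key_points(p) for p in pages]

async def research_parallel(query):
    results = await web_search(query)
    pages = await asyncio.gather(*(fetch_page(r["url"]) for r in results))
    return [extract_key_points(p) for p in pages]

def timed(workflow, query):
    """Run a workflow to completion and return (output, elapsed seconds)."""
    start = time.perf_counter()
    output = asyncio.run(workflow(query))
    return output, time.perf_counter() - start

seq_out, seq_time = timed(research_sequential, "ptc")
par_out, par_time = timed(research_parallel, "ptc")
print(f"sequential: {seq_time:.2f}s, parallel: {par_time:.2f}s")
```

Both variants return the same short summaries. Only the wall-clock time differs, because the parallel version waits for the slowest fetch rather than the sum of all five.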

Why PTC Matters — The Limits of Traditional Tool Use

The Context Window Problem

With traditional tool use, every tool call result enters Claude's context window in full. For data-heavy workflows, intermediate results can crowd out the information that actually matters.

# Traditional problem: all data enters context
# Fetch 1000 rows from database → all 1000 rows in context
# → you only needed the aggregate, but tokens are already spent
 
# PTC solution: filter in code
from collections import Counter
 
results = await tool.query_database("SELECT * FROM orders WHERE date > '2026-01-01'")
# Aggregate inside Claude's code; only the summary enters context
total_revenue = sum(r["amount"] for r in results)
summary = {
    "total_orders": len(results),
    "total_revenue": total_revenue,
    # Guard against an empty result set to avoid ZeroDivisionError
    "avg_order_value": total_revenue / len(results) if results else None,
    "top_products": Counter(r["product"] for r in results).most_common(5)
}
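
To put a rough number on the saving, the following sketch builds 1,000 synthetic order rows and compares the serialized size of the raw rows with the size of the summary. The row shape and product names are invented for the example, and character count is only a crude proxy for tokens.

```python
import json
from collections import Counter

def make_rows(n):
    """Generate n synthetic order rows (hypothetical schema)."""
    products = ["widget", "gadget", "doohickey"]
    return [
        {"id": i, "product": products[i % 3], "amount": 10.0 + (i % 7)}
        for i in range(n)
    ]

def summarize(rows):
    """Reduce the rows to the handful of numbers the model actually needs."""
    if not rows:
        return {"total_orders": 0, "total_revenue": 0.0,
                "avg_order_value": None, "top_products": []}
    revenue = sum(r["amount"] for r in rows)
    return {
        "total_orders": len(rows),
        "total_revenue": revenue,
        "avg_order_value": revenue / len(rows),
        "top_products": Counter(r["product"] for r in rows).most_common(5),
    }

rows = make_rows(1000)
summary = summarize(rows)
raw_chars = len(json.dumps(rows))
summary_chars = len(json.dumps(summary))
print(f"raw: {raw_chars} chars, summary: {summary_chars} chars")
```

The gap grows with the row count, since the summary stays roughly constant in size while the raw payload scales linearly.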

Compounding Latency

Each tool call carries model inference overhead on top of execution time. In a 5-step workflow, you're paying for 5 inference passes plus the actual tool execution. With PTC, Claude generates code once, executes tools within that code, and returns results — typically requiring only 1–2 inference passes total.
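
A back-of-the-envelope model makes the compounding visible. The sketch below uses the counting from the paragraph above (one inference pass per tool call for traditional tool use, two passes for PTC) and ignores sandbox start-up and network overhead. The 2-second inference time and 1-second tool time are placeholder values, not measurements.

```python
def traditional_latency(inference_s, tool_times_s):
    """One inference pass per tool call, with tools run one after another."""
    return len(tool_times_s) * inference_s + sum(tool_times_s)

def ptc_latency(inference_s, tool_times_s, passes=2, parallel=True):
    """A fixed number of inference passes; tools run inside generated code."""
    if parallel:
        tool_s = max(tool_times_s, default=0.0)  # limited by the slowest tool
    else:
        tool_s = sum(tool_times_s)               # dependent calls still add up
    return passes * inference_s + tool_s

tool_times = [1.0] * 5  # five tools, one second each (illustrative)
before = traditional_latency(2.0, tool_times)
after = ptc_latency(2.0, tool_times)
print(f"traditional: {before:.1f}s, PTC: {after:.1f}s, speedup: {before / after:.1f}x")
```

The parallel=False case matters in practice: when each tool call depends on the previous result, PTC still saves inference passes but cannot overlap the tool execution itself.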

Implementing PTC

Basic API Setup

To enable PTC, you define your tools as usual and opt in by specifying the PTC beta flag on the request. The example below also attaches input_examples to the first tool so that Claude sees a concrete call.

import Anthropic from "@anthropic-ai/sdk";
 
const client = new Anthropic();
 
// Tool definitions with Input Examples for better accuracy
const tools = [
  {
    name: "web_search",
    description: "Search the web for information",
    input_schema: {
      type: "object",
      properties: {
        query: { type: "string", description: "Search query" },
        max_results: { type: "number", description: "Max results to return" }
      },
      required: ["query"]
    },
    // Show Claude concrete usage patterns
    input_examples: [
      {
        query: "Claude API latest features 2026",
        max_results: 5
      }
    ]
  },
  {
    name: "fetch_page",
    description: "Fetch and extract text content from a URL",
    input_schema: {
      type: "object",
      properties: {
        url: { type: "string", description: "URL to fetch" }
      },
      required: ["url"]
    }
  }
];
 
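// Assumption: the identifiers below are the ones from PTC's initial beta release;
// confirm the current beta flag and tool version in Anthropic's documentation.
const CODE_EXECUTION = "code_execution_20250825";
 
const response = await client.beta.messages.create({
  model: "claude-sonnet-4-6",
  max_tokens: 8192,
  betas: ["advanced-tool-use-2025-11-20"],
  tools: [
    // The sandbox in which Claude runs its generated code
    { type: CODE_EXECUTION, name: "code_execution" },
    // Opt each custom tool in to being called from that code
    ...tools.map((t) => ({ ...t, allowed_callers: [CODE_EXECUTION] }))
  ],
  tool_choice: { type: "auto" },
  messages: [
    {
      role: "user",
      content: "Research the latest Claude API updates and summarize the top 3 changes"
    }
  ]
});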
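
The request only starts the exchange: when a response contains tool_use blocks, your application has to run the named tool and send back a matching tool_result. Here is a minimal dispatcher sketch in Python, working on plain dictionaries so it runs without the SDK. The block fields (id, name, input, tool_use_id, is_error) follow the Messages API; the handlers, ids, and return values are hypothetical stand-ins.

```python
import json

# Hypothetical local implementations, keyed by tool name.
def web_search(query, max_results=5):
    return [{"url": f"https://example.com/{i}", "title": f"{query} #{i}"}
            for i in range(max_results)]

def fetch_page(url):
    return f"Contents of {url}"

HANDLERS = {"web_search": web_search, "fetch_page": fetch_page}

def run_tool_calls(content_blocks):
    """Execute each tool_use block and build the matching tool_result blocks."""
    results = []
    for block in content_blocks:
        if block.get("type") != "tool_use":
            continue  # skip text and other block types
        try:
            handler = HANDLERS.get(block["name"])
            if handler is None:
                raise ValueError(f"unknown tool: {block['name']}")
            output = handler(**block["input"])
            results.append({
                "type": "tool_result",
                "tool_use_id": block["id"],
                "content": json.dumps(output),
            })
        except Exception as exc:
            # Report the failure to the model instead of crashing the loop
            results.append({
                "type": "tool_result",
                "tool_use_id": block["id"],
                "content": str(exc),
                "is_error": True,
            })
    return results

blocks = [
    {"type": "text", "text": "Searching."},
    {"type": "tool_use", "id": "toolu_01", "name": "web_search",
     "input": {"query": "PTC", "max_results": 2}},
    {"type": "tool_use", "id": "toolu_02", "name": "send_email", "input": {}},
]
tool_results = run_tool_calls(blocks)
print(json.dumps(tool_results, indent=2))
```

The returned list would go into the content of the next user message. Reporting an unknown tool as an error result, rather than raising, lets the model recover on its own.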

Combining PTC with Tool Search

When you have thousands of registered tools, loading them all into context is wasteful. Tool Search lets Claude discover only the tools it needs on demand.

// Tool Search configuration
const toolSearchConfig = {
  name: "tool_search",
  description: "Search available tools by description or capability",
  input_schema: {
    type: "object",
    properties: {
      query: { type: "string", description: "What you need the tool to do" },
      max_results: { type: "number", default: 5 }
    },
    required: ["query"]
  }
};
 
// Claude dynamically discovers and uses tools:
// 1. tool_search("database query") → discovers query_database tool
// 2. tool_search("send email") → discovers send_email tool
// 3. PTC composes these into an automated workflow
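
The configuration above only describes the tool_search interface; something still has to answer the query. The sketch below is one possible handler over an in-memory registry, ranking tools by keyword overlap. The registry contents and the scoring rule are assumptions for illustration, and a production system would more likely use embeddings or a proper search index.

```python
import re

# Hypothetical in-memory registry; a real one might hold thousands of entries.
REGISTRY = [
    {"name": "query_database",
     "description": "Execute SQL query against the application database"},
    {"name": "send_email",
     "description": "Send an email message to a recipient"},
    {"name": "web_search",
     "description": "Search the web for information"},
    {"name": "fetch_page",
     "description": "Fetch and extract text content from a URL"},
]

def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def tool_search(query, max_results=5):
    """Rank tools by how many query words appear in their name or description."""
    query_words = tokenize(query)
    scored = []
    for tool in REGISTRY:
        words = tokenize(tool["name"] + " " + tool["description"])
        score = len(query_words & words)
        if score:
            scored.append((score, tool["name"]))
    # Highest score first; ties broken alphabetically for stable output
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [name for _, name in scored[:max_results]]

print(tool_search("database query"))
print(tool_search("send email"))
```

Only the definitions of the returned tools then need to be loaded into context, which keeps the prompt small however large the registry grows.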

Production Design Patterns

Pattern 1: Data Analysis Pipeline

# Example code Claude generates inside PTC
import asyncio
import json
 
async def build_analysis():
    # Fetch from the three sources concurrently; they do not depend on each other
    sales_data, market_data, competitor_data = await asyncio.gather(
        tool.query_database(
            "SELECT product, SUM(amount) as total FROM sales GROUP BY product ORDER BY total DESC LIMIT 20"
        ),
        tool.fetch_api("/api/market-trends?period=30d"),
        tool.web_search("competitor pricing analysis Q1 2026"),
    )
 
    # Merge and process in code
    analysis = {
        "top_products": sales_data[:5],
        "market_trend": market_data["trend_direction"],
        "growth_rate": market_data["growth_percentage"],
        "competitor_insights": [
            item["snippet"] for item in competitor_data[:3]
        ]
    }
 
    # Only the filtered result enters Claude's context
    return json.dumps(analysis, ensure_ascii=False, indent=2)

Pattern 2: Multi-Step Verification Workflow

# Code review agent example
import asyncio
import json
 
async def check_file(file_path):
    """Lint and test one file, returning only the counts and flags that matter."""
    content = await tool.github_get_file_content(file_path)
    lint_result = await tool.run_linter(file_path, content)
    test_result = await tool.run_tests(file_path)
    errors = lint_result.get("errors", [])
    return {
        "file": file_path,
        "lint_issues": len(errors),
        "test_passed": test_result["status"] == "pass",
        "critical": any(e["severity"] == "critical" for e in errors)
    }
 
async def review_pr(pr_number):
    # Get changed files from PR, e.g. review_pr(1234)
    changed_files = await tool.github_get_pr_files(pr_number=pr_number)
 
    # Run checks in parallel across all files
    results = await asyncio.gather(*[check_file(f) for f in changed_files])
 
    # Only files with issues enter Claude's context
    issues = [r for r in results if r["lint_issues"] > 0 or not r["test_passed"]]
    return json.dumps({"total_files": len(changed_files), "issues": issues})
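
Running asyncio.gather over every changed file starts all checks at once, which can overwhelm a linter service or hit rate limits on a large PR. One common remedy is to cap concurrency with a semaphore, sketched here with a stub check. The limit of 4, the file names, and the stub itself are arbitrary choices for the example.

```python
import asyncio

MAX_CONCURRENT = 4  # illustrative cap; tune to your tools' rate limits

async def check_all(files, check_file, limit=MAX_CONCURRENT):
    """Run check_file over every path with at most `limit` checks in flight."""
    semaphore = asyncio.Semaphore(limit)

    async def bounded(path):
        async with semaphore:
            return await check_file(path)

    # gather preserves input order, so results line up with `files`
    return await asyncio.gather(*(bounded(p) for p in files))

stats = {"in_flight": 0, "peak": 0}

async def fake_check(path):
    """Stub check that records how many calls overlap."""
    stats["in_flight"] += 1
    stats["peak"] = max(stats["peak"], stats["in_flight"])
    await asyncio.sleep(0.01)  # stand-in for lint and test time
    stats["in_flight"] -= 1
    return {"file": path, "lint_issues": 0, "test_passed": True}

files = [f"src/module_{i}.py" for i in range(20)]
results = asyncio.run(check_all(files, fake_check))
print(f"checked {len(results)} files, peak concurrency {stats['peak']}")
```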

Pattern 3: Resilient Error Handling with Retry

# Production-grade error handling inside PTC
import asyncio
 
MAX_RETRIES = 3
 
async def resilient_tool_call(tool_func, *args, **kwargs):
    """Tool call wrapper with exponential backoff"""
    for attempt in range(MAX_RETRIES):
        try:
            result = await tool_func(*args, **kwargs)
            return result
        except Exception as e:
            if attempt == MAX_RETRIES - 1:
                return {"error": str(e), "tool": tool_func.__name__}
            await asyncio.sleep(2 ** attempt)  # Exponential backoff
 
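# Usage
search_result = await resilient_tool_call(tool.web_search, "Claude API updates")
# A successful search returns a list, so test for the error dict explicitly
if isinstance(search_result, dict) and "error" in search_result:
    # Fallback: use cached data
    search_result = await tool.get_cached_data("claude_api_updates")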
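
Retry logic is easy to get subtly wrong, so it is worth exercising before it reaches production. The sketch below applies the same retry policy but makes the sleep function injectable, so the backoff schedule can be checked without waiting. The flaky_search stub and the record_delay helper are hypothetical test fixtures.

```python
import asyncio

async def call_with_backoff(tool_func, *args, max_retries=3,
                            sleep=asyncio.sleep, **kwargs):
    """Retry with exponential backoff; `sleep` can be replaced in tests."""
    for attempt in range(max_retries):
        try:
            return await tool_func(*args, **kwargs)
        except Exception as exc:
            if attempt == max_retries - 1:
                return {"error": str(exc), "tool": tool_func.__name__}
            await sleep(2 ** attempt)  # waits 1s, then 2s, ...

calls = {"count": 0}
delays = []

async def flaky_search(query):
    """Stub tool: fails twice, then succeeds."""
    calls["count"] += 1
    if calls["count"] < 3:
        raise ConnectionError("temporary failure")
    return [{"url": "https://example.com", "query": query}]

async def record_delay(seconds):
    delays.append(seconds)  # record the requested delay instead of sleeping

result = asyncio.run(
    call_with_backoff(flaky_search, "Claude API updates", sleep=record_delay)
)
print(result, delays)
```

Here the third attempt succeeds after recorded delays of 1 and 2 seconds. A tool that never recovers yields the error dictionary instead of raising.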

Input Examples: Teaching Claude How to Use Your Tools

JSON schemas define valid structure but can't convey usage patterns — when to include optional parameters, which combinations make sense, or what conventions your API expects. The input_examples field fills this gap with concrete demonstrations.

const databaseTool = {
  name: "query_database",
  description: "Execute SQL query against the application database",
  input_schema: {
    type: "object",
    properties: {
      query: { type: "string" },
      params: {
        type: "array",
        items: { type: "string" },
        description: "Parameterized query values (prevents SQL injection)"
      },
      timeout_ms: { type: "number", default: 5000 }
    },
    required: ["query"]
  },
  // Concrete usage patterns for Claude to learn from
  input_examples: [
    {
      // Basic parameterized query
      query: "SELECT name, email FROM users WHERE status = $1",
      params: ["active"]
    },
    {
      // Aggregation with extended timeout
      query: "SELECT DATE(created_at) as date, COUNT(*) as count FROM orders WHERE created_at > $1 GROUP BY DATE(created_at) ORDER BY date DESC",
      params: ["2026-01-01"],
      timeout_ms: 10000
    },
    {
      // Complex JOIN
      query: "SELECT u.name, COUNT(o.id) as order_count, SUM(o.amount) as total FROM users u JOIN orders o ON u.id = o.user_id WHERE o.created_at BETWEEN $1 AND $2 GROUP BY u.id, u.name HAVING COUNT(o.id) > $3",
      params: ["2026-01-01", "2026-03-31", "5"],
      timeout_ms: 15000
    }
  ]
};

These examples steer Claude toward parameterized queries for injection safety, a larger timeout_ms for heavy aggregations, and JOINs structured the way your database conventions expect.
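
Examples that drift out of sync with the schema teach the wrong pattern, so it can help to check them in CI. The sketch below is a deliberately small checker for required fields, unknown fields, and top-level types. It is not a full JSON Schema validator (it does not inspect array items, for instance), and the schema is a Python transcription of the query_database definition above.

```python
JSON_TYPES = {"string": str, "array": list, "object": dict, "boolean": bool}

def matches(value, json_type):
    """Check a value against a JSON Schema primitive type name."""
    if json_type == "number":
        # bool is a subclass of int in Python, so exclude it explicitly
        return isinstance(value, (int, float)) and not isinstance(value, bool)
    expected = JSON_TYPES.get(json_type)
    return expected is None or isinstance(value, expected)

def check_example(schema, example):
    """Return a list of problems; an empty list means the example fits."""
    problems = []
    properties = schema.get("properties", {})
    for key in schema.get("required", []):
        if key not in example:
            problems.append(f"missing required field: {key}")
    for key, value in example.items():
        if key not in properties:
            problems.append(f"unknown field: {key}")
        elif not matches(value, properties[key].get("type")):
            problems.append(
                f"wrong type for {key}: expected {properties[key]['type']}"
            )
    return problems

schema = {
    "type": "object",
    "properties": {
        "query": {"type": "string"},
        "params": {"type": "array", "items": {"type": "string"}},
        "timeout_ms": {"type": "number"},
    },
    "required": ["query"],
}

good = {"query": "SELECT name FROM users WHERE status = $1", "params": ["active"]}
bad = {"params": ["active"], "timeout": "10000"}

print(check_example(schema, good))
print(check_example(schema, bad))
```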

Performance Comparison: Traditional Tool Use vs PTC

Based on real-world benchmark data:

// Benchmark: 10-URL web research task
 
// Traditional Tool Use
// Round trips: 10 (1 search + 9 page fetches)
// Inference passes: 11
// Average latency: 45 seconds
// Token consumption: ~120,000 tokens
 
// PTC
// Round trips: 2 (1 code generation + 1 result return)
// Inference passes: 2
// Average latency: 8 seconds (parallel fetching)
// Token consumption: ~25,000 tokens
 
// BrowseComp benchmark results:
// Basic Tool Use: Score 42%
// PTC enabled:    Score 71% (+29 points, a 69% relative improvement)
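
The ratios implied by these figures are easy to mix up, particularly percentage points versus relative change. This short calculation derives them from the numbers quoted above and nothing else.

```python
traditional = {"latency_s": 45, "tokens": 120_000, "browsecomp": 42}
ptc = {"latency_s": 8, "tokens": 25_000, "browsecomp": 71}

# Latency: how many times faster the PTC run completed
speedup = traditional["latency_s"] / ptc["latency_s"]

# Tokens: fraction of the traditional token budget that PTC avoided
token_reduction = 1 - ptc["tokens"] / traditional["tokens"]

# Score: absolute gain in percentage points vs. gain relative to the baseline
score_points = ptc["browsecomp"] - traditional["browsecomp"]
score_relative = ptc["browsecomp"] / traditional["browsecomp"] - 1

print(f"speedup: {speedup:.1f}x")                 # 5.6x
print(f"token reduction: {token_reduction:.0%}")  # 79%
print(f"score: +{score_points} points ({score_relative:.0%} relative)")  # +29 points (69% relative)
```

For this particular task the speedup is about 5.6x and the token reduction about 79%.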

Wrapping Up — PTC Changes How You Build AI Agents

Programmatic Tool Calling represents a fundamental shift in how you architect Claude API-powered agents.

  • Latency reduction: Up to 10x faster multi-tool workflows
  • Token savings: 70–80% cost reduction through in-code data filtering
  • Parallel execution: asyncio.gather patterns for simultaneous tool calls
  • Data processing: Shape and aggregate data before it enters context

Combined with Tool Search and Input Examples, PTC enables you to build production-grade AI agents that efficiently leverage thousands of tools and automate complex business logic.

Start by migrating one existing multi-step workflow to PTC and comparing its latency and token usage against the original. For more details, check out the [Tool Use fundamentals guide](/articles/api-sdk/tool-use-guide) and the [streaming implementation guide](/articles/api-sdk/claude-api-streaming-tool-use).
