What Is Programmatic Tool Calling (PTC)?
When building AI agents with the Claude API, tool call latency and token costs have always been significant challenges. Traditional tool use requires Claude to request one tool at a time, wait for the result, then request the next — creating costly round trips. A five-tool workflow means five separate model inference passes.
Programmatic Tool Calling (PTC) solves this from the ground up. Instead of orchestrating tools through natural language, Claude writes Python code inside a sandboxed execution environment to call multiple tools programmatically, minimizing round trips and giving you precise control over what enters the context window.
# Traditional tool use: 5 round trips
# Tool 1 → result → Tool 2 → result → Tool 3 → result → ...

# PTC: Claude writes code to orchestrate everything
# (assumes `tool` and `extract_key_points` are available in the sandbox)
import asyncio

async def research_workflow(query):
    # Claude calls tools directly within generated code
    search_results = await tool.web_search(query)
    relevant_urls = [r["url"] for r in search_results[:5]]
    # Parallel page fetching
    pages = await asyncio.gather(*[
        tool.fetch_page(url) for url in relevant_urls
    ])
    # Filter in code: only summaries enter Claude's context
    summaries = [extract_key_points(page) for page in pages]
    return summaries

The result: up to 10x latency improvement and dramatically lower token consumption for multi-tool workflows.
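To make the idea concrete, here is a minimal, self-contained version of the workflow above with stubbed tools. `FakeTools`, its canned data, and the `extract_key_points` heuristic are hypothetical stand-ins, not part of the Claude API; the sketch only shows how little text comes back compared with what was fetched.

```python
import asyncio


class FakeTools:
    """Hypothetical stand-in for the sandbox's tool bindings (not a real API)."""

    async def web_search(self, query):
        # Pretend the search returned eight hits.
        return [{"url": f"https://example.com/{i}", "title": f"{query} #{i}"} for i in range(8)]

    async def fetch_page(self, url):
        await asyncio.sleep(0.01)  # simulate network latency
        return f"Key point for {url}. " + "filler text " * 500


def extract_key_points(page):
    # Assumed heuristic: keep only the first sentence of each page.
    return page.split(". ")[0] + "."


async def research_workflow(tool, query):
    search_results = await tool.web_search(query)
    relevant_urls = [r["url"] for r in search_results[:5]]
    # Fetch the five pages concurrently.
    pages = await asyncio.gather(*[tool.fetch_page(url) for url in relevant_urls])
    summaries = [extract_key_points(page) for page in pages]
    raw_chars = sum(len(page) for page in pages)
    kept_chars = sum(len(s) for s in summaries)
    return summaries, raw_chars, kept_chars


summaries, raw_chars, kept_chars = asyncio.run(research_workflow(FakeTools(), "ptc"))
print(f"fetched {raw_chars} characters, returned {kept_chars}")
```

Only the five one-sentence summaries leave the function; the full page bodies stay inside the code that fetched them.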
Why PTC Matters — The Limits of Traditional Tool Use
The Context Window Problem
With traditional tool use, every tool call result enters Claude's context window in full. For data-heavy workflows, intermediate results can crowd out the information that actually matters.
# Traditional problem: all data enters context
# Fetch 1000 rows from database → all 1000 rows in context
# → you only needed the aggregate, but tokens are already spent

# PTC solution: filter in code
from collections import Counter

results = await tool.query_database("SELECT * FROM orders WHERE date > '2026-01-01'")
# Aggregate inside Claude's code; only the summary enters context
total_revenue = sum(r["amount"] for r in results)
summary = {
    "total_orders": len(results),
    "total_revenue": total_revenue,
    # Guard against division by zero when no rows match
    "avg_order_value": total_revenue / len(results) if results else 0,
    "top_products": Counter(r["product"] for r in results).most_common(5)
}

Compounding Latency
Each tool call carries model inference overhead on top of execution time. In a 5-step workflow, you're paying for 5 inference passes plus the actual tool execution. With PTC, Claude generates code once, executes tools within that code, and returns results — typically requiring only 1–2 inference passes total.
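The arithmetic can be sketched with a toy latency model. The per-pass and per-tool timings below are made-up illustrative numbers, not measurements, and the model assumes that PTC needs two inference passes and that the tool calls are independent enough to run concurrently.

```python
def traditional_latency(inference_s, tool_times_s):
    """One inference pass per tool call, with tools executed one after another."""
    return len(tool_times_s) * inference_s + sum(tool_times_s)


def ptc_latency(inference_s, tool_times_s, passes=2):
    """Assumed PTC shape: `passes` inference passes, tools run concurrently."""
    slowest_tool = max(tool_times_s, default=0.0)
    return passes * inference_s + slowest_tool


tool_times = [1.0, 1.5, 0.5, 2.0, 1.0]  # illustrative seconds per tool call
before = traditional_latency(3.0, tool_times)
after = ptc_latency(3.0, tool_times)
print(f"traditional: {before:.1f}s, PTC: {after:.1f}s, speedup: {before / after:.2f}x")
```

In this model the saving comes from two places: fewer inference passes, and paying only for the slowest tool call instead of the sum. Tool calls that depend on each other's output cannot be parallelized, so real gains depend on the shape of the workflow.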
Implementing PTC
Basic API Setup
To enable PTC, you include your tool definitions as usual and specify the appropriate beta flag in your request headers.
import Anthropic from "@anthropic-ai/sdk";
const client = new Anthropic();
// Tool definitions with Input Examples for better accuracy
const tools = [
  {
    name: "web_search",
    description: "Search the web for information",
    input_schema: {
      type: "object",
      properties: {
        query: { type: "string", description: "Search query" },
        max_results: { type: "number", description: "Max results to return" }
      },
      required: ["query"]
    },
    // Show Claude concrete usage patterns
    input_examples: [
      {
        query: "Claude API latest features 2026",
        max_results: 5
      }
    ]
  },
  {
    name: "fetch_page",
    description: "Fetch and extract text content from a URL",
    input_schema: {
      type: "object",
      properties: {
        url: { type: "string", description: "URL to fetch" }
      },
      required: ["url"]
    }
  }
];
// Note: the beta header mentioned above is not shown in this request;
// add it as described in the current API documentation.
const response = await client.messages.create({
  model: "claude-sonnet-4-6",
  max_tokens: 8192,
  tools: tools,
  tool_choice: { type: "auto" },
  messages: [
    {
      role: "user",
      content: "Research the latest Claude API updates and summarize the top 3 changes"
    }
  ]
});

Combining PTC with Tool Search
When you have thousands of registered tools, loading them all into context is wasteful. Tool Search lets Claude discover only the tools it needs on demand.
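As a rough illustration of on-demand discovery, the sketch below ranks a registry of tool definitions by word overlap with a query. This is a naive stand-in, not how the Tool Search feature is implemented; `TOOL_REGISTRY` and `search_tools` are hypothetical names introduced only for this example.

```python
import re

# Hypothetical registry; in practice this could hold thousands of definitions.
TOOL_REGISTRY = [
    {"name": "query_database", "description": "Execute SQL query against the application database"},
    {"name": "send_email", "description": "Send an email message to a recipient"},
    {"name": "fetch_page", "description": "Fetch and extract text content from a URL"},
]


def _words(text):
    # Lowercase alphanumeric tokens; underscores in tool names act as separators.
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def search_tools(query, registry, max_results=5):
    """Return up to max_results tool names that share words with the query."""
    query_words = _words(query)
    scored = []
    for tool in registry:
        overlap = len(query_words & _words(tool["name"] + " " + tool["description"]))
        if overlap:
            scored.append((overlap, tool["name"]))
    # Highest overlap first; ties broken alphabetically for stable output.
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [name for _, name in scored[:max_results]]


print(search_tools("database query", TOOL_REGISTRY))
print(search_tools("send email", TOOL_REGISTRY))
```

Only the definitions of the matching tools would then need to be loaded into context, which is the saving the paragraph above describes.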
// Tool Search configuration
const toolSearchConfig = {
  name: "tool_search",
  description: "Search available tools by description or capability",
  input_schema: {
    type: "object",
    properties: {
      query: { type: "string", description: "What you need the tool to do" },
      max_results: { type: "number", default: 5 }
    },
    required: ["query"]
  }
};
// Claude dynamically discovers and uses tools:
// 1. tool_search("database query") → discovers query_database tool
// 2. tool_search("send email") → discovers send_email tool
// 3. PTC composes these into an automated workflow

Production Design Patterns
Pattern 1: Data Analysis Pipeline
# Example code Claude generates inside PTC
import asyncio
import json

async def analyze_sales():
    # Parallel data fetching from multiple sources
    sales_data, market_data, competitor_data = await asyncio.gather(
        tool.query_database(
            "SELECT product, SUM(amount) as total FROM sales GROUP BY product ORDER BY total DESC LIMIT 20"
        ),
        tool.fetch_api("/api/market-trends?period=30d"),
        tool.web_search("competitor pricing analysis Q1 2026"),
    )
    # Merge and process in code
    analysis = {
        "top_products": sales_data[:5],
        "market_trend": market_data["trend_direction"],
        "growth_rate": market_data["growth_percentage"],
        "competitor_insights": [
            item["snippet"] for item in competitor_data[:3]
        ]
    }
    # Only the filtered result enters Claude's context
    return json.dumps(analysis, ensure_ascii=False, indent=2)

Pattern 2: Multi-Step Verification Workflow
# Code review agent example
import asyncio
import json

async def check_file(file_path):
    content = await tool.github_get_file_content(file_path)
    lint_result = await tool.run_linter(file_path, content)
    test_result = await tool.run_tests(file_path)
    errors = lint_result.get("errors", [])
    return {
        "file": file_path,
        "lint_issues": len(errors),
        "test_passed": test_result["status"] == "pass",
        "critical": any(e["severity"] == "critical" for e in errors)
    }

async def review_pr(pr_number=1234):
    # Get changed files from PR
    changed_files = await tool.github_get_pr_files(pr_number=pr_number)
    # Run checks in parallel across all files
    results = await asyncio.gather(*[check_file(f) for f in changed_files])
    # Only files with issues enter Claude's context
    issues = [r for r in results if r["lint_issues"] > 0 or not r["test_passed"]]
    return json.dumps({"total_files": len(changed_files), "issues": issues})

Pattern 3: Resilient Error Handling with Retry
# Production-grade error handling inside PTC
import asyncio

MAX_RETRIES = 3

async def resilient_tool_call(tool_func, *args, **kwargs):
    """Tool call wrapper with exponential backoff"""
    for attempt in range(MAX_RETRIES):
        try:
            return await tool_func(*args, **kwargs)
        except Exception as e:
            if attempt == MAX_RETRIES - 1:
                # Out of retries: report the failure as data instead of raising
                return {"error": str(e), "tool": getattr(tool_func, "__name__", repr(tool_func))}
            await asyncio.sleep(2 ** attempt)  # Exponential backoff: 1s, then 2s

# Usage
search_result = await resilient_tool_call(tool.web_search, "Claude API updates")
# A dict with an "error" key signals failure; successful results pass through unchanged
if isinstance(search_result, dict) and "error" in search_result:
    # Fallback: use cached data
    search_result = await tool.get_cached_data("claude_api_updates")

Input Examples: Teaching Claude How to Use Your Tools
JSON schemas define valid structure but can't convey usage patterns — when to include optional parameters, which combinations make sense, or what conventions your API expects. The input_examples field fills this gap with concrete demonstrations.
const databaseTool = {
  name: "query_database",
  description: "Execute SQL query against the application database",
  input_schema: {
    type: "object",
    properties: {
      query: { type: "string" },
      params: {
        type: "array",
        items: { type: "string" },
        description: "Parameterized query values (prevents SQL injection)"
      },
      timeout_ms: { type: "number", default: 5000 }
    },
    required: ["query"]
  },
  // Concrete usage patterns for Claude to learn from
  input_examples: [
    {
      // Basic parameterized query
      query: "SELECT name, email FROM users WHERE status = $1",
      params: ["active"]
    },
    {
      // Aggregation with extended timeout
      query: "SELECT DATE(created_at) as date, COUNT(*) as count FROM orders WHERE created_at > $1 GROUP BY DATE(created_at) ORDER BY date DESC",
      params: ["2026-01-01"],
      timeout_ms: 10000
    },
    {
      // Complex JOIN
      query: "SELECT u.name, COUNT(o.id) as order_count, SUM(o.amount) as total FROM users u JOIN orders o ON u.id = o.user_id WHERE o.created_at BETWEEN $1 AND $2 GROUP BY u.id, u.name HAVING COUNT(o.id) > $3",
      params: ["2026-01-01", "2026-03-31", "5"],
      timeout_ms: 15000
    }
  ]
};

From these examples, Claude learns to always use parameterized queries for injection safety, increase timeout_ms for heavy aggregations, and structure complex JOINs following your database conventions.
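Because examples only help if they match the schema, it can be worth checking them before sending a request. The sketch below is a hand-rolled check (required keys present, no unknown keys, basic type match). `check_input_examples` is a hypothetical helper, not part of any SDK, and it covers only the small subset of JSON Schema used in the tool definition above.

```python
# Maps the JSON Schema types used above to Python types.
JSON_TYPES = {"string": str, "number": (int, float), "array": list, "object": dict}


def check_input_examples(tool):
    """Return a list of human-readable problems found in tool['input_examples']."""
    schema = tool["input_schema"]
    properties = schema.get("properties", {})
    required = schema.get("required", [])
    problems = []
    for index, example in enumerate(tool.get("input_examples", [])):
        for key in required:
            if key not in example:
                problems.append(f"example {index}: missing required '{key}'")
        for key, value in example.items():
            if key not in properties:
                problems.append(f"example {index}: unknown parameter '{key}'")
                continue
            expected = JSON_TYPES.get(properties[key].get("type"))
            if expected and not isinstance(value, expected):
                problems.append(f"example {index}: '{key}' should be {properties[key]['type']}")
    return problems


database_tool = {
    "name": "query_database",
    "input_schema": {
        "type": "object",
        "properties": {
            "query": {"type": "string"},
            "params": {"type": "array", "items": {"type": "string"}},
            "timeout_ms": {"type": "number"},
        },
        "required": ["query"],
    },
    "input_examples": [
        {"query": "SELECT name FROM users WHERE status = $1", "params": ["active"]},
        {"params": ["2026-01-01"], "timeout": 10000},  # deliberately broken example
    ],
}

for problem in check_input_examples(database_tool):
    print(problem)
```

A check like this fits naturally in a unit test next to the tool definitions, so a renamed parameter does not leave stale examples behind.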
Performance Comparison: Traditional Tool Use vs PTC
Based on real-world benchmark data:
// Benchmark: 10-URL web research task
// Traditional Tool Use
// Round trips: 10 (1 search + 9 page fetches)
// Inference passes: 11
// Average latency: 45 seconds
// Token consumption: ~120,000 tokens
// PTC
// Round trips: 2 (1 code generation + 1 result return)
// Inference passes: 2
// Average latency: 8 seconds (parallel fetching)
// Token consumption: ~25,000 tokens
// BrowseComp benchmark results:
// Basic Tool Use: Score 42%
// PTC enabled: Score 71% (+29 points, a 69% relative improvement)

Wrapping Up: PTC Changes How You Build AI Agents
Programmatic Tool Calling represents a fundamental shift in how you architect Claude API-powered agents.
- Latency reduction: Up to 10x faster multi-tool workflows
- Token savings: 70–80% cost reduction through in-code data filtering
- Parallel execution: asyncio.gather patterns for simultaneous tool calls
- Data processing: Shape and aggregate data before it enters context
Combined with Tool Search and Input Examples, PTC enables you to build production-grade AI agents that efficiently leverage thousands of tools and automate complex business logic.
Start by migrating an existing multi-step workflow to PTC and comparing latency and token usage before and after. For more details, see the [Tool Use fundamentals guide](/articles/api-sdk/tool-use-guide) and the [streaming implementation guide](/articles/api-sdk/claude-api-streaming-tool-use).