●CONFERENCE — Code w/ Claude, the annual developer conference, kicked off June 22 with keynotes, sessions, and workshops●LIMITS — Claude Code rate limits doubled and Opus API limits rose, making it easier to build reliably at scale●DESIGN — Claude Design updates add design-system alignment, tighter Claude Code sync, and direct canvas editing●SANDBOX — Claude Managed Agents now run in your own sandbox and connect to private MCP servers●MODEL — Claude Fable 5 offers a 1M-token context, always-on adaptive thinking, and 128K output●LINEUP — Opus 4.8, Sonnet 4.6, and Haiku 4.5 lead the lineup; pick the right one per task
Reach a Remote MCP Server in a Single API Request: Implementing the Messages API MCP Connector
How to call a remote MCP server's tools using only the Messages API's mcp_servers and mcp_toolset—no local MCP client. Covers allowlist/denylist design, response handling, and the pitfalls to avoid before unattended production use.
All you want is for a scheduled job to use a remote MCP server, yet first you have to keep a local MCP client running, open a transport, and manage its lifecycle. That "scaffolding to call one tool" quietly weighs on headless operations. As an indie developer, I update several sites at fixed times every day, and every time I tucked an MCP client inside that runner, the logs filled up with disconnects and process cleanup.
The Messages API's MCP connector removes that scaffolding entirely. You put a server URL into the request's mcp_servers, and Anthropic's side connects to that remote MCP server, runs the tools, and folds the results into the same response. The point is simple: no client implementation required.
When it helps, and when it doesn't
Let me draw the line first. The MCP connector isn't a universal answer—its fit is clear-cut.
As the documentation states, the only part of the MCP spec currently supported is tool calls. Prompts and resources are out of scope; if you need those, you still have to run your own MCP client. Connections are limited to servers exposed over HTTPS (Streamable HTTP / SSE); local stdio servers can't be connected directly.
My rule of thumb: if you just want to call tools on a URL-reachable remote server, the connector is the shortest path; if you need local servers, prompts, resources, or fine connection control, run your own client. A related question is how to safely reach a private internal server, which I covered in MCP tunnel design for Managed Agents. That is a different answer to the same question of where you draw the boundary of your execution environment.
| What you want | The right tool |
| --- | --- |
| Just call tools on a URL-exposed remote MCP | MCP connector (mcp_servers) |
| Use a local stdio server | Your own MCP client + SDK helpers |
| Use MCP prompts / resources | Your own MCP client |
| Reach a private internal service | MCP tunnel (Managed Agents) |
The minimal call
Here's the smallest form: enabling all tools on a single server. Two things matter—connection details go in mcp_servers, and which tools to use go in tools as an mcp_toolset. And you need the beta header.
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

response = client.beta.messages.create(
    model="claude-opus-4-8",
    max_tokens=1024,
    messages=[
        {"role": "user", "content": "List the tools you have, then fetch today's events."}
    ],
    # 1) Connection definition (URL and token)
    mcp_servers=[
        {
            "type": "url",
            "url": "https://example-server.modelcontextprotocol.io/sse",
            "name": "calendar",
            "authorization_token": "YOUR_ACCESS_TOKEN",  # omit for servers that don't need it
        }
    ],
    # 2) Which tools to expose (all tools enabled by default)
    tools=[{"type": "mcp_toolset", "mcp_server_name": "calendar"}],
    # 3) Beta header (the old mcp-client-2025-04-04 is deprecated)
    betas=["mcp-client-2025-11-20"],
)

for block in response.content:
    print(block.type)
Expected output (excerpt): blocks arrive in the order text → mcp_tool_use → mcp_tool_result → text. When Claude decides it needs the events, the tool is invoked once, and a final text grounded in that result follows.
There's an early pitfall here. If you forget betas=["mcp-client-2025-11-20"], you get an ordinary response as if mcp_servers were ignored entirely. I missed this at first and spent half an hour wondering why "the tool never gets called." The version bumped and the header name changed from mcp-client-2025-04-04, so pasting code from older posts fails silently.
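One way to make the missing-header mistake harder to repeat is to build request arguments through a small guard. The sketch below is only an illustration: `with_mcp_beta` and `MCP_BETA` are hypothetical names, and the header string is the one quoted in this article. It adds the beta flag whenever `mcp_servers` is present and leaves other requests alone.

```python
MCP_BETA = "mcp-client-2025-11-20"  # header name as quoted in this article


def with_mcp_beta(request_kwargs: dict) -> dict:
    """Return a copy of request_kwargs with the MCP beta flag ensured.

    Hypothetical helper: the flag is only added when mcp_servers is set.
    """
    kwargs = dict(request_kwargs)
    if kwargs.get("mcp_servers"):
        betas = list(kwargs.get("betas", []))
        if MCP_BETA not in betas:
            betas.append(MCP_BETA)
        kwargs["betas"] = betas
    return kwargs


# Example: the flag is filled in because mcp_servers is present.
example = with_mcp_beta({"mcp_servers": [{"name": "calendar"}]})
print(example["betas"])
```

The result can then be unpacked into the create call, for example `client.beta.messages.create(**with_mcp_beta(kwargs))`.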
With the MCP connector, two unfamiliar content block types come back. Tool execution completes on Anthropic's side, so you don't send results back—you just read and use them.
def walk(response):
    for block in response.content:
        if block.type == "text":
            print("TEXT:", block.text)
        elif block.type == "mcp_tool_use":
            # A tool Claude called; server_name tells you which server
            print(f"CALL: {block.server_name}.{block.name}({block.input})")
        elif block.type == "mcp_tool_result":
            # The execution result; check is_error for failures
            status = "ERROR" if block.is_error else "OK"
            texts = [c.text for c in block.content if c.type == "text"]
            print(f"RESULT[{status}] {block.tool_use_id}: {' '.join(texts)}")

walk(response)
Always inspect is_error on mcp_tool_result. Even if a remote tool fails, the API request itself returns 200. In unattended operation, the scariest state is "the request succeeded but the tool failed," so wire is_error straight into your logs and alerts from day one—it makes root-causing far easier later.
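As a rough sketch of what "wire is_error into your logs" could look like, the helper below collects failed tool results and logs them. The function names and the logger name are hypothetical, and the block shape (`type`, `is_error`, `tool_use_id`, `content` with text parts) is assumed to match what `walk()` reads above. The sample blocks are stand-ins; real code would pass `response.content`.

```python
import logging
from types import SimpleNamespace

logger = logging.getLogger("mcp_job")  # hypothetical logger name


def failed_tool_results(content):
    """Return (tool_use_id, message) for every mcp_tool_result flagged is_error."""
    failures = []
    for block in content:
        if getattr(block, "type", None) != "mcp_tool_result":
            continue
        if not getattr(block, "is_error", False):
            continue
        parts = getattr(block, "content", None) or []
        message = " ".join(
            c.text for c in parts if getattr(c, "type", None) == "text"
        )
        failures.append((block.tool_use_id, message))
    return failures


def report_failures(content):
    """Log each failure and return the count so a scheduler can alert on it."""
    failures = failed_tool_results(content)
    for tool_use_id, message in failures:
        logger.error("MCP tool failed: %s: %s", tool_use_id, message)
    return len(failures)


# Stand-in blocks for illustration only.
sample = [
    SimpleNamespace(type="text", text="Checking the calendar."),
    SimpleNamespace(
        type="mcp_tool_result",
        tool_use_id="toolu_demo",
        is_error=True,
        content=[SimpleNamespace(type="text", text="upstream timeout")],
    ),
]
failure_count = report_failures(sample)
```

A job can treat a non-zero count as a failed run even though the HTTP status was 200.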
Allowlist / denylist design for unattended jobs
This is the part I most want to get across. If you leave every tool open through the connector, Claude can—depending on context—call write or delete tools too. An interactive chat lets you insert a confirmation step; a scheduled job has no such room.
The new toolset lets you decide tool exposure precisely by combining default_config and configs. For unattended jobs I use an allowlist: close everything by default, then open only the read-only tools.
read_only_toolset = {
    "type": "mcp_toolset",
    "mcp_server_name": "calendar",
    "default_config": {"enabled": False},  # disable all tools first
    "configs": {
        "search_events": {"enabled": True},  # open only the reads you need
        "list_events": {"enabled": True},
    },
}
Conversely, when you want "almost everything, minus the dangerous ones," use a denylist. It suits read-centric assistants, or cases where you want a human confirmation before any state change.
guarded_toolset = {
    "type": "mcp_toolset",
    "mcp_server_name": "calendar",
    # default_config omitted (= all tools enabled by default)
    "configs": {
        "delete_all_events": {"enabled": False},  # close destructive ops
        "share_calendar_publicly": {"enabled": False},
    },
}
My operating principle is plain: unattended jobs default to allowlist; only interactive tooling needs a denylist. The reasoning is the asymmetry of a forgotten entry—under an allowlist a slip means "a tool is missing" (harmless), while under a denylist it means "a dangerous tool was left open." How tightly you scope tool permissions for unattended runs pairs well with the "hold authorization in one place" idea I covered in enforcing MCP policy for unattended agents.
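If you want the "unattended means allowlist" principle enforced rather than remembered, a tiny guard in the job's setup code is one option. This is a sketch under the assumption stated earlier in the article that tools are enabled unless `default_config` says otherwise; `require_allowlist` is a hypothetical name, not part of any SDK.

```python
def require_allowlist(toolset: dict) -> dict:
    """Return the toolset unchanged, or raise if it is not closed by default."""
    default = toolset.get("default_config") or {}
    # An omitted "enabled" is treated as True, i.e. everything open.
    if default.get("enabled", True) is not False:
        name = toolset.get("mcp_server_name")
        raise ValueError(
            f"toolset for {name!r} is not an allowlist: "
            "set default_config.enabled to False"
        )
    return toolset


# Passes: closed by default, one read-only tool opened.
safe = require_allowlist(
    {
        "type": "mcp_toolset",
        "mcp_server_name": "calendar",
        "default_config": {"enabled": False},
        "configs": {"list_events": {"enabled": True}},
    }
)
```

Calling it on a denylist-style toolset raises before any request is sent, which is the failure mode you want in a scheduler.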
The merge order is worth remembering so you don't second-guess yourself. From highest to lowest priority: per-tool settings in configs, then default_config, then system defaults. For example, you can set defer_loading: true in default_config and override a single tool to defer_loading: false in configs.
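To make that precedence concrete, here is a small local model of it. It is not how the API resolves settings internally, only a sketch of the stated order. `effective_config` is a hypothetical helper, and the `SYSTEM_DEFAULTS` values are an assumption (tools enabled, loading not deferred).

```python
# Assumed system defaults, used only for this illustration.
SYSTEM_DEFAULTS = {"enabled": True, "defer_loading": False}


def effective_config(toolset: dict, tool_name: str) -> dict:
    """Resolve one tool's settings: configs entry > default_config > system defaults."""
    merged = dict(SYSTEM_DEFAULTS)
    merged.update(toolset.get("default_config") or {})
    merged.update((toolset.get("configs") or {}).get(tool_name) or {})
    return merged


toolset = {
    "type": "mcp_toolset",
    "mcp_server_name": "calendar",
    "default_config": {"defer_loading": True},
    "configs": {"search_events": {"defer_loading": False}},
}

print(effective_config(toolset, "search_events"))  # per-tool override wins
print(effective_config(toolset, "list_events"))    # falls back to default_config
```

Running a check like this over every tool name you expect is a cheap way to confirm a toolset does what you think before it goes into a scheduled job.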
Multiple servers, and defer_loading as tools grow
You can connect to several MCP servers in one request. List each in mcp_servers and place a matching mcp_toolset in tools. A validation rule to note: each defined server must be referenced by exactly one toolset, or you get an error (a safe-by-default rule that won't let you leave an unused server lying around).
response = client.beta.messages.create(
    model="claude-opus-4-8",
    max_tokens=1024,
    messages=[
        {
            "role": "user",
            "content": "Cross-check events and tasks; tell me which time blocks look overloaded today.",
        }
    ],
    mcp_servers=[
        {"type": "url", "url": "https://mcp.example1.com/sse", "name": "calendar", "authorization_token": "TOKEN1"},
        {"type": "url", "url": "https://mcp.example2.com/sse", "name": "tasks", "authorization_token": "TOKEN2"},
    ],
    tools=[
        {"type": "mcp_toolset", "mcp_server_name": "calendar"},
        {
            "type": "mcp_toolset",
            "mcp_server_name": "tasks",
            "default_config": {"defer_loading": True},  # don't send descriptions up front
        },
    ],
    betas=["mcp-client-2025-11-20"],
)
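The "exactly one toolset per server" rule can also be checked locally before the request leaves your machine, which gives a clearer message than an API error in a job log. The sketch below is a hypothetical pre-flight check. The second loop, which flags toolsets pointing at undefined servers, is my own addition rather than something the rule above states.

```python
from collections import Counter


def check_server_references(mcp_servers: list, tools: list) -> list:
    """Return human-readable problems; an empty list means the rule holds."""
    refs = Counter(
        t.get("mcp_server_name") for t in tools if t.get("type") == "mcp_toolset"
    )
    defined = [s.get("name") for s in mcp_servers]
    problems = []
    for name in defined:
        count = refs.get(name, 0)
        if count != 1:
            problems.append(
                f"server {name!r} is referenced by {count} toolsets, expected 1"
            )
    for name in refs:
        if name not in defined:
            problems.append(f"toolset references undefined server {name!r}")
    return problems


servers = [{"name": "calendar"}, {"name": "tasks"}]
tools = [{"type": "mcp_toolset", "mcp_server_name": "calendar"}]
for problem in check_server_references(servers, tools):
    print(problem)
```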
Once tools number in the dozens across servers, their descriptions alone eat context and selection accuracy drops. Setting defer_loading: true keeps tool descriptions from being sent to the model up front, and combined with the Tool search tool, only the relevant tools surface per query. In my experience, the moment the tool count reaches double digits is the cue to turn on defer_loading.
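The double-digit rule of thumb can be written down as a small builder. This is a sketch only: `build_toolsets` and `DEFER_THRESHOLD` are hypothetical names, the per-server tool counts are assumed to be known ahead of time, and the threshold is applied to the total across servers. It does not add the Tool search tool itself, which deferred tools rely on to be discovered.

```python
DEFER_THRESHOLD = 10  # "double digits", per the rule of thumb above


def build_toolsets(tool_counts: dict) -> list:
    """Build one mcp_toolset per server, deferring loading once the total is large."""
    defer = sum(tool_counts.values()) >= DEFER_THRESHOLD
    toolsets = []
    for server_name in tool_counts:
        toolset = {"type": "mcp_toolset", "mcp_server_name": server_name}
        if defer:
            toolset["default_config"] = {"defer_loading": True}
        toolsets.append(toolset)
    return toolsets


# Hypothetical counts: 4 calendar tools plus 8 task tools crosses the threshold.
toolsets = build_toolsets({"calendar": 4, "tasks": 8})
```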
Cost and caching
Tool calls through the MCP connector are priced the same as ordinary Messages API tool use. You can also include mcp_servers in Batches API requests, and MCP tool calls via batches are priced identically. I push "not urgent but high volume" work—like news gathering—into batches, and it was reassuring that mixing in a remote MCP doesn't change the billing model.
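For batch work, each entry pairs a `custom_id` with the same parameters you would send to a single Messages call, so the MCP fields ride along inside `params`. The helper below only builds that payload and does not call the API. `batch_requests` is a hypothetical name, and how the beta flag is passed to the batches endpoint is not covered in this article, so check the Batches API reference before relying on it.

```python
def batch_requests(prompts: dict, mcp_servers: list, tools: list) -> list:
    """Build one batch entry per prompt, keyed by custom_id."""
    return [
        {
            "custom_id": custom_id,
            "params": {
                "model": "claude-opus-4-8",
                "max_tokens": 1024,
                "messages": [{"role": "user", "content": prompt}],
                "mcp_servers": mcp_servers,  # same shape as a single request
                "tools": tools,
            },
        }
        for custom_id, prompt in prompts.items()
    ]


requests = batch_requests(
    {"news-001": "Summarize today's headlines.", "news-002": "List upcoming releases."},
    mcp_servers=[{"type": "url", "url": "https://mcp.example1.com/sse", "name": "news"}],
    tools=[{"type": "mcp_toolset", "mcp_server_name": "news"}],
)
```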
You can attach cache_control to a toolset to place a prompt-caching breakpoint. Tool definitions are a stable, slow-changing part, so they belong on the "unchanging side" of a split-TTL caching design. That's exactly the approach I laid out in splitting prompt caching into 5m and 1h tiers—MCP tool definitions are precisely the block you want on the 1h side.
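A minimal sketch of attaching that breakpoint, assuming the toolset accepts the same `cache_control` shape used elsewhere in prompt caching (`type: "ephemeral"` with an optional `ttl`). `with_cache_breakpoint` is a hypothetical helper.

```python
def with_cache_breakpoint(toolset: dict, ttl: str = "1h") -> dict:
    """Return a copy of the toolset with a prompt-caching breakpoint attached."""
    cached = dict(toolset)
    # Assumed shape: the standard ephemeral cache_control with a TTL.
    cached["cache_control"] = {"type": "ephemeral", "ttl": ttl}
    return cached


calendar_toolset = with_cache_breakpoint(
    {"type": "mcp_toolset", "mcp_server_name": "calendar"}
)
```

The original dict is left untouched, so the same base toolset can be reused with or without caching.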
What to verify before going live
Before integrating, here's the checklist I run every time.
First, no ZDR (Zero Data Retention). The MCP connector is excluded from zero data retention; tool definitions and execution results are kept under the standard retention policy. If you have a ZDR-based requirement, you can't use this feature.
Second, platform differences. The MCP connector works on the Claude API, the Claude Platform on AWS, and Microsoft Foundry, but is not currently available on Amazon Bedrock or Vertex AI. It matters when choosing where you deploy, so confirm it early to avoid rework.
Third, OAuth tokens. For servers requiring OAuth, the access token passed to authorization_token is obtained and refreshed by you, the caller. Token lifetime is your responsibility, so for scheduled jobs I put a refresh step that anticipates expiry at the front of the job. If you just need a token for testing, the OAuth flow in npx @modelcontextprotocol/inspector issues one quickly.
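A refresh that anticipates expiry can be as simple as a cache with a safety margin. The sketch below is one possible shape, not a prescribed one. `TokenCache` is a hypothetical class, and `fetch_token` stands in for whatever OAuth refresh call your provider needs; it is assumed to return the access token and its expiry as a Unix timestamp.

```python
import time


class TokenCache:
    """Hand out an access token, refreshing it shortly before it expires."""

    def __init__(self, fetch_token, margin_seconds=300, clock=time.time):
        # fetch_token is hypothetical: returns (access_token, expires_at_epoch).
        self._fetch_token = fetch_token
        self._margin = margin_seconds
        self._clock = clock
        self._token = None
        self._expires_at = 0.0

    def get(self) -> str:
        """Return a token that is valid for at least margin_seconds more."""
        if self._token is None or self._clock() >= self._expires_at - self._margin:
            self._token, self._expires_at = self._fetch_token()
        return self._token


# Illustration with a fake provider that issues one-hour tokens.
def fake_fetch():
    return "demo-token", time.time() + 3600


tokens = TokenCache(fake_fetch)
server = {
    "type": "url",
    "url": "https://mcp.example1.com/sse",
    "name": "calendar",
    "authorization_token": tokens.get(),
}
```

Calling `tokens.get()` at the start of each job run means a token close to expiry is replaced before the request is built, rather than failing mid-run.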
Next step
If you have a URL-reachable remote MCP server on hand, send just one request in the minimal all-tools form and watch with your own eyes how mcp_tool_use and mcp_tool_result actually line up. Once the shape of the response clicks, tightening it down with an allowlist goes surprisingly smoothly. For me, after feeling that "the scaffolding is gone" relief once, I started peeling unnecessary resident processes out of my headless jobs one by one.