API & SDK / 2026-06-21 / Advanced

Connecting Managed Agents to Services You Don't Want to Expose: MCP Tunnel Design

How to connect Claude Managed Agents to an internal MCP server that is never exposed to the public internet. We cover the MCP tunnel, self-hosted sandboxes, authorization boundaries, and graceful degradation when things break.


The moment you try to build an agent that queries your internal inventory database, you usually hit one sticking point: you do not want that database reachable from outside. Managed Agents are convenient, but as long as every tool has to point at a public endpoint, you end up fronting internal services with a VPN or a bastion host, and operations get heavy fast.

The June 2026 update lets Managed Agents connect to your own sandboxes and to private MCP servers. Self-hosted sandboxes are in public beta on the Claude Platform, and the MCP tunnel that reaches an internal MCP server is in research preview. This article walks through the connection design for letting an agent use services you would rather not publish, along with the authorization and failure-handling decisions that actually trip people up.

Where "not exposed" is actually enforced

Let me clear up one easy misconception first. The MCP tunnel is not a mechanism for publishing your internal server to the internet. The idea runs the other way: your internal side opens a single outbound connection, and the agent's tool calls travel back only along that path. The key benefit is that you never open an inbound port.

The boundaries line up like this.

Layer              | Exposure                      | What it protects
Sandbox            | Self-hosted (you manage it)   | Isolation of code execution; limited egress
MCP tunnel         | Outbound only                 | The internal server staying private
MCP server         | Reachable only via the tunnel | Tool authorization and scope
Backend (DB, etc.) | Only from the MCP server      | Access control to real data

As an indie developer, I take the same stance even when aggregating app revenue. The internal aggregation API is never public; a local job reaches it outbound. Once you decide that what you hand the agent is not a "public endpoint" but a "broker that holds an outbound path," the design stops drifting.
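To make the "outbound path" idea concrete, here is a minimal sketch of a reverse connection using only asyncio on localhost. It is not the MCP tunnel protocol, and the names internal_side, on_tunnel, and the JSON message shape are all made up for illustration. What it shows is the direction of things: only the broker listens, the internal side dials out, and the call then travels back down the socket the internal side opened.

```python
import asyncio
import json


async def internal_side(port: int) -> None:
    """Internal host: dials OUT to the broker, then serves calls on that socket."""
    reader, writer = await asyncio.open_connection("127.0.0.1", port)
    while line := await reader.readline():  # empty bytes means the broker hung up
        call = json.loads(line)
        reply = {"id": call["id"], "echo": call["args"]}  # placeholder for real tool work
        writer.write(json.dumps(reply).encode() + b"\n")
        await writer.drain()
    writer.close()


async def demo() -> dict:
    reply_future = asyncio.get_running_loop().create_future()

    async def on_tunnel(reader, writer):
        # The broker pushes a tool call down the connection the internal side opened.
        call = {"id": 1, "args": {"sku": "ABC123"}}
        writer.write(json.dumps(call).encode() + b"\n")
        await writer.drain()
        reply_future.set_result(json.loads(await reader.readline()))
        writer.close()

    # Only the broker listens. The internal side never opens a listening port.
    server = await asyncio.start_server(on_tunnel, "127.0.0.1", 0)
    port = server.sockets[0].getsockname()[1]
    internal = asyncio.create_task(internal_side(port))
    reply = await reply_future
    await internal  # finishes once the broker closes its end
    server.close()
    return reply


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

Notice that internal_side contains no listener at all. A firewall rule that blocks every inbound connection to that host would not affect this flow.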

A minimal private MCP server

Start with the MCP server you place on the internal side. Here it exposes a single tool: an inventory lookup. Remember that the tunnel does not let the outside reach this server; the server reaches out.

# inventory_mcp.py — runs only inside the internal network
from mcp.server.fastmcp import FastMCP
import os
import asyncpg
 
mcp = FastMCP("inventory")
 
# Every target is an internal address. No public IP at all.
DB_DSN = os.environ["INTERNAL_DB_DSN"]  # e.g. postgres://app@10.0.3.12:5432/inventory
 
@mcp.tool()
async def check_stock(sku: str) -> dict:
    """Return the quantity on hand and next restock date for a SKU."""
    if not sku.isalnum():
        # Always validate input on the tool side. Do not trust agent output.
        raise ValueError("sku must be alphanumeric")
 
    conn = await asyncpg.connect(DB_DSN)
    try:
        row = await conn.fetchrow(
            "SELECT quantity, restock_date FROM stock WHERE sku = $1",
            sku,
        )
    finally:
        await conn.close()
 
    if row is None:
        return {"sku": sku, "found": False}
    return {
        "sku": sku,
        "found": True,
        "quantity": row["quantity"],
        "restock_date": row["restock_date"].isoformat() if row["restock_date"] else None,
    }
 
if __name__ == "__main__":
    mcp.run()

What I do deliberately here is put input validation for check_stock on the tool side. The arguments an agent generates deserve the same suspicion as input from outside. Passing the SKU as a bound parameter ($1) rather than interpolating it into the SQL string comes from the same instinct. When you treat the MCP server as the last checkpoint rather than a convenient extension of the agent, your permission design tightens up.
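One caveat on the validation above: str.isalnum() also accepts non-ASCII letters and digits, so a value like "é1" passes. If your SKUs are plain ASCII, a stricter check is cheap. The sketch below assumes a format of 1 to 32 ASCII letters or digits; that format and the validate_sku name are my assumptions, so adjust them to your own catalogue.

```python
import re

# Assumed SKU format: 1 to 32 ASCII letters or digits. Adjust to your catalogue.
_SKU_PATTERN = re.compile(r"[A-Za-z0-9]{1,32}")


def validate_sku(sku: str) -> str:
    """Return the SKU unchanged if it matches the assumed format, else raise."""
    if not isinstance(sku, str) or _SKU_PATTERN.fullmatch(sku) is None:
        raise ValueError("sku must be 1-32 ASCII letters or digits")
    return sku
```

Calling validate_sku(sku) as the first line of check_stock would replace the isalnum() test and also put an upper bound on length.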

Open the tunnel and let the agent reach it

Next, make the server reachable to the agent. With the research-preview MCP tunnel, you start a tunnel client on the internal side that connects outbound to the Claude Platform. The server is assigned an identifier (a tunnel ID), and the agent references it as an MCP connector.

# Run on an internal host. No inbound port is ever opened.
export CLAUDE_TUNNEL_TOKEN="YOUR_TUNNEL_TOKEN"
claude-tunnel connect \
  --target stdio:./inventory_mcp.py \
  --name internal-inventory
# → prints the assigned tunnel id (e.g. tnl_internal_inventory)

On the agent side, you pass that identifier as an MCP server definition. The important part is that the agent config holds no DB credentials and no internal IPs. All the agent should know is that a tool group called internal-inventory exists.

# agent_config.py — agent side. It holds none of the internal details.
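# Assumption: `client` comes from the official anthropic Python SDK; the original
# snippet does not show how it is created.
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

agent = client.beta.agents.create(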
    model="claude-opus-4-8",
    name="inventory-assistant",
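    instructions=(
        "Use the check_stock tool for inventory questions. "
        "If the tool returns found=false, do not guess a quantity. "
        "If the tool call fails or times out, say you cannot confirm "
        "stock right now; do not guess a quantity."
    ),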
    mcp_servers=[
        {
            "type": "tunnel",
            "tunnel_id": "tnl_internal_inventory",
            "tool_allowlist": ["check_stock"],  # explicitly narrow what can be used
        }
    ],
    sandbox={"type": "self_hosted", "id": "sbx_team_default"},
)

I recommend always setting tool_allowlist. If the MCP server grows another tool later, the allowlist prevents the agent from picking it up by accident. Manage the list additively, starting from a state where nothing is usable by default.
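The deny-by-default behavior I am relying on can be modeled in a few lines. This is a sketch of the semantics as I understand them, not the platform's implementation, and visible_tools is a hypothetical name.

```python
def visible_tools(server_tools: list[str], allowlist: list[str]) -> list[str]:
    """Deny by default: a tool is usable only if it is named in the allowlist."""
    allowed = set(allowlist)
    return [name for name in server_tools if name in allowed]


# The server later grows a write tool; the agent's view does not change.
before = visible_tools(["check_stock"], ["check_stock"])
after = visible_tools(["check_stock", "adjust_stock"], ["check_stock"])
print(before == after)  # True
```

An empty allowlist yields an empty tool list, which is the starting state I would build up from.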

Think about authorization in two stages

The hardest thing in production is authorization. If you settle it in a single stage—"the tunnel is connected, so we're fine"—it will break later. In my experience, splitting it into two stages keeps it clear.

The first stage is "who may invoke this agent." That lives at the agent's entry point, controlled by API keys or user sessions. The second stage is "what range of data this tool call may touch." That is decided on the MCP server, looking at a scope tied to the call context.

# Add authorization to inventory_mcp.py
from mcp.server.fastmcp import Context
 
@mcp.tool()
async def check_stock(sku: str, ctx: Context) -> dict:
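    # Pull the tenant from metadata carried over the tunnel.
    # In the MCP Python SDK, meta may be None and is a model object rather
    # than a dict, so read the field with getattr instead of .get().
    tenant = getattr(ctx.request_context.meta, "tenant", None)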
    if tenant is None:
        raise PermissionError("Calls without tenant context are rejected")
 
    conn = await asyncpg.connect(DB_DSN)
    try:
        row = await conn.fetchrow(
            "SELECT quantity, restock_date FROM stock "
            "WHERE sku = $1 AND tenant_id = $2",  # enforce the tenant boundary in SQL
            sku, tenant,
        )
    finally:
        await conn.close()
    # the rest matches the earlier version

The point is to enforce the tenant boundary as a SQL WHERE clause, not as an instruction to the agent. I treat a note in the prompt as something that may or may not be honored. A data boundary is best closed physically, in a layer below the prompt.
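To see the boundary hold without standing up Postgres, here is a small runnable sketch that swaps in the standard library's sqlite3 with an in-memory database. The table contents and the tenant names "acme" and "globex" are invented for the demo; the WHERE clause is the same shape as the one above, with sqlite's ? placeholders in place of $1 and $2.

```python
import sqlite3
from typing import Optional


def make_demo_db() -> sqlite3.Connection:
    """Build an in-memory table with one row per tenant (demo data only)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE stock (sku TEXT, tenant_id TEXT, quantity INTEGER)")
    conn.executemany(
        "INSERT INTO stock VALUES (?, ?, ?)",
        [("ABC123", "acme", 12), ("XYZ999", "globex", 7)],
    )
    return conn


def check_stock(conn: sqlite3.Connection, sku: str, tenant: Optional[str]) -> dict:
    if tenant is None:
        raise PermissionError("Calls without tenant context are rejected")
    row = conn.execute(
        "SELECT quantity FROM stock WHERE sku = ? AND tenant_id = ?",
        (sku, tenant),  # the tenant boundary lives in the query, not the prompt
    ).fetchone()
    if row is None:
        return {"sku": sku, "found": False}
    return {"sku": sku, "found": True, "quantity": row[0]}


conn = make_demo_db()
print(check_stock(conn, "ABC123", "acme"))    # {'sku': 'ABC123', 'found': True, 'quantity': 12}
print(check_stock(conn, "ABC123", "globex"))  # {'sku': 'ABC123', 'found': False}
```

A cross-tenant lookup returns the same found=False as a SKU that does not exist, so the response does not reveal that another tenant holds the item.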

What happens when the tunnel drops

A research-preview tunnel can drop on network jitter. If you have not decided the degradation behavior in advance, the agent can confuse "I could not check stock" with "stock was zero" and return a wrong answer.

There are two safeguards. One is to treat a tool failure explicitly as "unknown" and forbid guessing. The other is to give the tunnel client automatic reconnection and a health check.

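# A thin wrapper around the tunnel client. Reconnect with exponential backoff.
# start_tunnel and TunnelDisconnected stand in for whatever your tunnel client
# exposes; they are not defined in this snippet.
import asyncio
import logging

log = logging.getLogger("tunnel")
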
async def run_tunnel_with_retry():
    backoff = 1
    while True:
        try:
            await start_tunnel(target="stdio:./inventory_mcp.py",
                               name="internal-inventory")
            backoff = 1  # reset on a successful connection
        except TunnelDisconnected as e:
            # Treat a disconnect as an expected event, not an anomaly
            wait = min(backoff, 30)
            log.warning("tunnel disconnected: %s — reconnecting in %ds", e, wait)
            await asyncio.sleep(wait)
            backoff *= 2
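It helps to see the actual waits this wrapper produces, because they tell you how long the agent may be without its tool. The generator below reproduces the same doubling-with-a-30-second-cap arithmetic in isolation; backoff_schedule is a hypothetical helper name, not part of any tunnel client.

```python
from itertools import islice
from typing import Iterator


def backoff_schedule(initial: int = 1, cap: int = 30) -> Iterator[int]:
    """Yield the wait before each reconnect attempt: doubling, capped at `cap`."""
    delay = initial
    while True:
        yield min(delay, cap)
        delay = min(delay * 2, cap)  # keep the stored value bounded too


first_seven = list(islice(backoff_schedule(), 7))
print(first_seven)       # [1, 2, 4, 8, 16, 30, 30]
print(sum(first_seven))  # 91
```

Five consecutive failures already add up to 31 seconds of waiting, and seven add up to 91, so the agent needs a defined answer for that window.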

The instruction "do not guess a quantity if the tool fails" exists precisely for this disconnect. The worst pattern is an agent that reads the room and produces a plausible number during the tens of seconds it takes to reconnect. You need a state, designed in advance, where the agent can fall silent and say "I can't confirm that right now."
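One way to design that state is to make "unknown" a first-class result, separate from both "found" and "not found". The wrapper below is a sketch under assumptions: call_tool is any coroutine of yours that invokes the tool, and the status values and the 5-second timeout are my own choices, not anything defined by the platform.

```python
import asyncio
from typing import Awaitable, Callable


async def check_stock_or_unknown(
    call_tool: Callable[[str], Awaitable[dict]],
    sku: str,
    timeout_s: float = 5.0,
) -> dict:
    """Map a tool call onto three explicit states: ok, not_found, unknown."""
    try:
        result = await asyncio.wait_for(call_tool(sku), timeout=timeout_s)
    except (asyncio.TimeoutError, OSError) as exc:
        # A failed call is never reported as quantity 0. It is reported as "unknown".
        return {"sku": sku, "status": "unknown", "reason": type(exc).__name__}
    if not result.get("found"):
        return {"sku": sku, "status": "not_found"}
    return {"sku": sku, "status": "ok", "quantity": result["quantity"]}
```

With this shape, a genuine quantity of 0 arrives as status "ok" with quantity 0, while a dropped tunnel arrives as status "unknown", so the two can no longer be mistaken for each other.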

Why pair this with a self-hosted sandbox

Finally, a word on why you would also use a self-hosted sandbox. Where the tunnel closes the "path to the internal server," the self-hosted sandbox puts the "place where the agent runs code" under your management. They protect different things.

Suppose that after fetching inventory data, the agent writes and runs code to aggregate it. If that execution happens in a sandbox you manage rather than a shared one, the internal data you fetched stays inside your management boundary, execution environment and all. The tunnel closes the data's "entrance," and the sandbox closes the data's "place of processing"—a boundary closed twice.

The data I handle as an indie developer is never large, yet I still care about staying able to explain where code runs and where data flows out. Losing sight of those paths in exchange for convenience is the thing I most want to avoid over a long-running operation.

Between being able to connect and connecting safely sit the design layers of authorization, degradation, and isolation. I would start with a single small tool like an inventory lookup and build the whole loop—tunnel, allowlist, tenant boundary, reconnection—by hand once. Once you have a minimal working setup, you can extend the same shape as more services come online.
