CLAUDE LAB
API & SDK / 2026-06-25 / Advanced

When the Previous Run Hasn't Finished and the Next One Starts: Leases and Fencing Tokens for Scheduled Agents

A scheduled agent that runs on a fixed clock can overtake itself and start twice. This article walks from the point where a naive lock breaks through leases, fencing tokens, and bounded catch-up, using the implementation I actually run.

Tags: Claude Agent SDK, Scheduled jobs, Distributed locks, Production, Design


"The previous job is still running, and the next one has already started." I noticed this one morning while watching the system that generates articles for the four Dolice Labs sites every day on a fixed schedule.

The trigger was a rate-limit increase for Claude Code. With more headroom, I tightened an interval I had kept generously wide, down to 45 minutes. Shortly afterward, one generation ran long, and while that run was still cleaning up, the next cron trigger had already spun up a fresh process. Two articles were about to be pushed, and I froze looking at the git history.

Scheduled execution looks like a simple "run when the clock says so," but it has a weak spot at the exact moment one run overtakes the run before it. Today I want to walk from where a naive lock breaks to fencing tokens, alongside the implementation I keep in place as an indie developer.

Why the next run overtakes the previous one

A cron entry, or a Cowork schedule, promises only one thing: "start at this time." Nowhere does it promise "start after the previous run finishes."

Most days, generation is comfortably faster than the interval, so the two never meet. In production, though, a transient API delay, a retry, an unusually large article, a network hiccup — any one of them lets a single run eat through its interval.

When that happens, the next trigger fires without hesitation. Both runs clone the same repository, reach for the same slug, and try to write the same file. One succeeds at git push; the other collides on rebase. On a bad day, two slightly different versions of the same article end up published.
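To make the timing concrete, here is a small sketch that replays a fixed-interval schedule and reports which triggers fire while earlier work is still in progress. The helper `findOverlaps`, its field names, and the sample durations are hypothetical and chosen only for illustration; the one assumption it encodes is the point made above, that the scheduler never waits for the previous run.

```typescript
// Illustrative only: `findOverlaps` and its field names are hypothetical.
// All times are minutes since the first trigger.
interface Overlap {
  trigger: number;   // index of the run that started while another was active
  startedAt: number; // minute at which that run was triggered
  busyUntil: number; // minute at which the earlier work actually finishes
}

function findOverlaps(intervalMin: number, durationsMin: number[]): Overlap[] {
  const overlaps: Overlap[] = [];
  let busyUntil = 0; // latest finish time among the runs started so far
  durationsMin.forEach((duration, trigger) => {
    const startedAt = trigger * intervalMin; // the clock never waits
    if (trigger > 0 && busyUntil > startedAt) {
      overlaps.push({ trigger, startedAt, busyUntil });
    }
    busyUntil = Math.max(busyUntil, startedAt + duration);
  });
  return overlaps;
}

// Three runs on a 45-minute interval; the second one takes 52 minutes.
console.log(findOverlaps(45, [20, 52, 18]));
```

With these sample numbers the third trigger fires at minute 90 while the second run is busy until minute 97, so a single overlap is reported. A seven-minute overrun is enough.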

The heart of the problem is that neither run knows the other exists. So the first thing we need is a way for them to announce themselves to each other.

Start with a naive lock (and watch it break)

The obvious move is to raise a flag at the start and lower it at the end.

// Naive version: this breaks in production
async function runOnce(store, jobId, body) {
  // If a previous run raised the flag, skip this run.
  if (await store.read(jobId)) {
    return { ran: false, reason: "locked" };
  }
  // Raise the flag. Nothing expires it if this process dies.
  await store.write(jobId, { running: true });
  try {
    await body();
  } finally {
    await store.write(jobId, null); // release
  }
  return { ran: true };
}

For short jobs this prevents most overlaps, and this is where I started too. But two holes remain.

The first is a lock that never releases. If the run crashes, or the VM itself dies, it never reaches finally. The flag stays raised, and from the next day every run is rejected as "locked." On an unattended system, you notice days later. I lived through a similar freeze on a job that fetched AdMob reports every morning.
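One common way to close this first hole is to give the flag an expiry, so that a dead holder's claim lapses on its own. The following is a minimal sketch of that idea, not the implementation this article runs: the `Lease` shape, the `CasStore` interface, the in-memory `MemoryStore`, and the injected `now` clock are all assumptions made for illustration. A real store would be asynchronous and would compare a version or ETag instead of an object reference.

```typescript
// Hypothetical lease record and store interface, for illustration only.
interface Lease {
  holder: string;    // id of the run that owns the lease
  expiresAt: number; // epoch milliseconds after which the lease is void
}

interface CasStore {
  get(key: string): Lease | null;
  // Writes `next` only if the stored value is still `expected`.
  compareAndSet(key: string, expected: Lease | null, next: Lease | null): boolean;
}

// In-memory stand-in; compares by object identity instead of a version number.
class MemoryStore implements CasStore {
  private data = new Map<string, Lease>();

  get(key: string): Lease | null {
    return this.data.get(key) ?? null;
  }

  compareAndSet(key: string, expected: Lease | null, next: Lease | null): boolean {
    const current = this.data.get(key) ?? null;
    if (current !== expected) return false; // someone else changed it first
    if (next === null) this.data.delete(key);
    else this.data.set(key, next);
    return true;
  }
}

function tryAcquire(
  store: CasStore, jobId: string, holder: string, ttlMs: number, now: number,
): Lease | null {
  const current = store.get(jobId);
  if (current !== null && current.expiresAt > now) return null; // live lease held
  const next: Lease = { holder, expiresAt: now + ttlMs };
  return store.compareAndSet(jobId, current, next) ? next : null;
}

function release(store: CasStore, jobId: string, lease: Lease): boolean {
  // Only removes the lease if it is still ours; a late release is a no-op.
  return store.compareAndSet(jobId, lease, null);
}

const HOUR = 60 * 60 * 1000;
const store = new MemoryStore();
const first = tryAcquire(store, "daily-article", "run-A", HOUR, 0);
const second = tryAcquire(store, "daily-article", "run-B", HOUR, 45 * 60 * 1000);
// run-A crashes and never releases; once the TTL has lapsed, run-C takes over.
const third = tryAcquire(store, "daily-article", "run-C", HOUR, HOUR + 1);
console.log(first !== null, second === null, third !== null); // true true true
```

The trade-off in this sketch is the TTL itself: it has to be longer than the longest run you expect (or be renewed while the run is alive), otherwise a healthy run loses its lease mid-flight.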

The second is nastier. A run that supposedly holds the lock may be hung and effectively dead while the OS still counts it as alive, or a long-paused process may wake up after everyone has given up on it and execute only its final write. A flag that merely records that someone holds the lock cannot stop this "zombie."
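The usual defence against such a zombie is a fencing token: every grant of the lock carries a strictly increasing number, and the place being written to refuses any write whose number is older than one it has already accepted. The sketch below shows only that mechanism. `TokenIssuer`, `FencedTarget`, and the path `posts/today.md` are hypothetical, and the in-memory target stands in for whatever the real write target is; how the check is enforced against a git repository is not shown in this part of the article.

```typescript
// Hypothetical issuer: each granted lease gets a strictly larger token.
class TokenIssuer {
  private last = 0;

  next(): number {
    this.last += 1;
    return this.last;
  }
}

// Hypothetical write target that remembers the highest token it has accepted.
class FencedTarget {
  private highestToken = 0;
  private files = new Map<string, string>();

  write(token: number, path: string, body: string): boolean {
    if (token < this.highestToken) return false; // stale holder: reject
    this.highestToken = token;
    this.files.set(path, body);
    return true;
  }

  read(path: string): string | undefined {
    return this.files.get(path);
  }
}

const issuer = new TokenIssuer();
const target = new FencedTarget();

const tokenA = issuer.next(); // run A acquires the lock, then stalls
const tokenB = issuer.next(); // the lock is handed over; run B acquires it

console.log(target.write(tokenB, "posts/today.md", "from B")); // true
console.log(target.write(tokenA, "posts/today.md", "from A")); // false
console.log(target.read("posts/today.md")); // "from B"
```

The important design choice is where the comparison happens. The stale run cannot be trusted to check its own token, because it does not know it is stale; the check has to sit at the write target.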

WHAT YOU'LL LEARN
A minimal TTL-based lease lock (TypeScript, any CAS store) that stops a second run when generation outlives the interval
How to stop a late, expired run from writing — by validating the fencing token at the write target, with the gotchas
A bounded catch-up policy that collapses missed slots to one, with the real overlap-skip rate (~2% of runs)
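The last item, collapsing missed slots to one, can be sketched roughly as follows. This is an assumption about what such a policy might look like, not the policy the article describes in full: `decideCatchUp`, `CatchUpDecision`, and the choice to stamp the single catch-up run with the most recent slot are all hypothetical.

```typescript
// Hypothetical policy: however many slots were missed, run at most once,
// and attribute that run to the most recent slot. Times are epoch milliseconds.
interface CatchUpDecision {
  missedSlots: number;    // whole intervals elapsed since the last completed slot
  runSlot: number | null; // slot start to run now, or null if nothing is due
}

function decideCatchUp(
  lastCompletedSlot: number, now: number, intervalMs: number,
): CatchUpDecision {
  const elapsed = now - lastCompletedSlot;
  const missedSlots = Math.max(0, Math.floor(elapsed / intervalMs));
  if (missedSlots === 0) return { missedSlots, runSlot: null };
  // Skip straight to the latest slot instead of replaying each missed one.
  return { missedSlots, runSlot: lastCompletedSlot + missedSlots * intervalMs };
}

const MINUTE = 60 * 1000;
// 200 minutes after the last completed slot on a 45-minute interval:
// four slots were missed, but only the one at minute 180 is run.
console.log(decideCatchUp(0, 200 * MINUTE, 45 * MINUTE));
```

Running once rather than four times keeps a recovering scheduler from stampeding the API with back-to-back generations after an outage.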