Cowork / 2026-07-01 / Advanced

Let the Downstream Task Verify the Upstream Actually Ran Today: A Completion Ledger and Dependency Barrier for Unattended Schedulers

Unattended schedulers have no notion of dependencies, so when a morning data-refresh task fails silently, the noon generation task keeps running on yesterday's leftovers. This is a design for recording upstream completion atomically and having downstream assert its preconditions before running, with working TypeScript and lessons from my own operations.

One morning, the 7 a.m. reference-data refresh task died partway through on a brief network blip. Nothing unusual landed in the logs; yesterday's file simply stayed in place. The real damage came next. The noon article-generation task read that reference data without complaint, produced a "new" article from the same news as the day before, and published it. No error surfaced anywhere, and increasingly stale output kept piling up. This is the quietest and most insidious shape of failure in unattended scheduling.

The root of this incident is not a bug in the code. It is a structural problem: most schedulers have no concept of "run task B only after task A finishes." Each task simply fires independently at its appointed time, and the downstream has no idea whether the upstream truly ran today. So the downstream starts on an implicit assumption that "it probably ran." This article replaces that implicit assumption with an explicit check, using two parts: a completion ledger and a dependency barrier.

Why "it probably ran" is dangerous — the overlooked third state

When we reason about dependencies by hand, we tend to think in two values: the upstream either "succeeded" or "failed." But in unattended operation, the state that really bites is a third one that most implementations drop.

State	Meaning	What downstream should do
not-run	Upstream hasn't run today, or died partway	Halt (do not produce degraded output)
ran-empty	Upstream ran fine but legitimately produced nothing today	Skip (this is not an error)
ran-produced	Upstream ran and produced the artifact downstream expects	Proceed

The trouble is that many implementations cannot tell "ran-empty" apart from "not-run." If the downstream only checks "does the reference file exist," yesterday's leftover exists, so it mistakes "not-run" for "ran-produced." Conversely, if it only checks "was the file updated today," it mistakes a day when the upstream legitimately returned nothing (say, a day with no new news) for an anomaly and fires needless alerts. Cleanly separating the three states is where every dependency barrier begins.
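To make the misclassification concrete, here is a minimal sketch. The `DiskView` model and both check functions are hypothetical names introduced only for illustration, and the sketch assumes yesterday's file is still sitting on disk. It shows that "not-run" and "ran-empty" look identical from the disk alone, so neither naive check can handle both correctly.

```typescript
// Hypothetical model of what a downstream can observe on disk alone.
interface DiskView {
  fileExists: boolean;
  fileUpdatedToday: boolean;
}

type TrueState = "not-run" | "ran-empty" | "ran-produced";
type NaiveDecision = "proceed" | "halt";

// Assumption: yesterday's file is still present in every case.
const diskViewFor: Record<TrueState, DiskView> = {
  "not-run": { fileExists: true, fileUpdatedToday: false },
  "ran-empty": { fileExists: true, fileUpdatedToday: false },
  "ran-produced": { fileExists: true, fileUpdatedToday: true },
};

// Naive check 1: "does the reference file exist?"
function existsCheck(v: DiskView): NaiveDecision {
  return v.fileExists ? "proceed" : "halt";
}

// Naive check 2: "was the file updated today?"
function updatedTodayCheck(v: DiskView): NaiveDecision {
  return v.fileUpdatedToday ? "proceed" : "halt";
}

const states: TrueState[] = ["not-run", "ran-empty", "ran-produced"];
for (const state of states) {
  const view = diskViewFor[state];
  console.log(state, existsCheck(view), updatedTodayCheck(view));
}
// not-run proceed halt      <- check 1 runs on yesterday's leftovers
// ran-empty proceed halt    <- check 2 raises an alert on a normal quiet day
// ran-produced proceed proceed
```

Because the first two rows have the same disk view, the distinction has to come from somewhere other than the artifact itself.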

Running automated publishing across several sites, I once treated a legitimate "empty" as an anomaly and halted the downstream, which left the whole pipeline spinning uselessly on what should have been a quiet holiday. The cause was that the upstream's intent — returning empty was a normal outcome — never reached the downstream. So the ledger needs vocabulary that can express an "intended empty," not just success and failure.

The ledger data structure

The heart of a dependency barrier is a small ledger where each task records the outcome of its daily run. You do not need a heavyweight job queue; a single JSON file holding an array of records, rewritten in full on each update, does the job. First, pin the vocabulary down with types.

// ledger.ts
export type RunStatus = "ok" | "empty" | "error";
 
export interface RunRecord {
  task: string;            // e.g. "daily-reference-and-ticker"
  date: string;            // JST YYYY-MM-DD (the source of truth for day boundaries)
  status: RunStatus;       // ok=produced / empty=legitimate empty / error=failed
  fingerprint: string | null; // content hash of the artifact; null for empty/error
  artifactPath: string | null;
  finishedAt: string;      // ISO8601 (for auditing)
  note?: string;           // human-readable reason for empty or failure
}

Keeping status and fingerprint as separate fields is the key. status: "empty" means "the upstream met its responsibility, but there was legitimately nothing to produce today," and it carries fingerprint: null. That structurally distinguishes "ran-empty" from "not-run" (where no record exists at all). I fix date to JST because handling it in UTC shifts the day boundary and can make yesterday's record look like today's. That is a spot I've been burned by before, so I always generate dates with an explicit timezone.
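The day-boundary drift is easy to see with the schedule from the opening incident. The sketch below uses two hypothetical helpers, `jstDateOf` and `utcDateOf`, and assumes a fixed +9 hour offset (JST has no daylight saving time). It is an illustration of the drift, not the ledger code itself.

```typescript
// JST calendar date: shift by +9h, then read the date part of the ISO string.
function jstDateOf(now: Date): string {
  return new Date(now.getTime() + 9 * 60 * 60 * 1000).toISOString().slice(0, 10);
}

// UTC calendar date, for comparison.
function utcDateOf(now: Date): string {
  return now.toISOString().slice(0, 10);
}

// 7 a.m. JST on July 1 is 22:00 UTC on June 30.
const upstreamRun = new Date("2026-06-30T22:00:00Z");
// Noon JST on July 1 is 03:00 UTC on July 1.
const downstreamRun = new Date("2026-07-01T03:00:00Z");

console.log(utcDateOf(upstreamRun), utcDateOf(downstreamRun));
// 2026-06-30 2026-07-01  <- same JST day, two different UTC dates
console.log(jstDateOf(upstreamRun), jstDateOf(downstreamRun));
// 2026-07-01 2026-07-01  <- both tasks agree on the day
```

With UTC dates, the upstream's 7 a.m. record and the downstream's noon lookup land on different days even though both ran on the same JST morning.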

Recording upstream completion atomically

Writing to the ledger has a pitfall specific to unattended operation. If the process dies mid-write, the ledger itself becomes broken JSON, and every read from the following day onward fails. To avoid the perverse outcome where the artifact meant to protect the pipeline is the one that breaks it, use an atomic write: write to a temp file, then swap it in with rename. POSIX guarantees that a rename within the same filesystem is atomic, so readers always see either the old, complete ledger or the new, complete one — never a half-written file.

// record.ts
import { createHash } from "node:crypto";
import { readFile, writeFile, rename } from "node:fs/promises";
import { dirname, join } from "node:path";
import type { RunRecord } from "./ledger";

export function fingerprintOf(content: string): string {
  return createHash("sha256").update(content).digest("hex").slice(0, 16);
}

// Generate JST YYYY-MM-DD explicitly (prevents UTC drift)
export function jstDate(now = new Date()): string {
  const jst = new Date(now.getTime() + 9 * 60 * 60 * 1000);
  return jst.toISOString().slice(0, 10);
}

export async function recordRun(
  ledgerPath: string,
  rec: RunRecord
): Promise<void> {
  let ledger: RunRecord[] = [];
  try {
    const parsed: unknown = JSON.parse(await readFile(ledgerPath, "utf8"));
    if (Array.isArray(parsed)) ledger = parsed as RunRecord[];
  } catch {
    ledger = []; // first run, or rebuild if it was corrupt
  }

  // Replace any existing task+date with the latest (idempotent on re-runs)
  const key = (r: RunRecord) => `${r.task}::${r.date}`;
  ledger = ledger.filter((r) => key(r) !== key(rec));
  ledger.push(rec);

  // Keep only the last 60 days so the ledger doesn't grow unbounded
  const cutoff = jstDate(new Date(Date.now() - 60 * 864e5));
  ledger = ledger.filter((r) => r.date >= cutoff);

  // Atomic swap: temp file -> rename. The temp file sits next to the ledger
  // because rename is only atomic within one filesystem; a temp file under
  // os.tmpdir() may live on another device, where rename fails with EXDEV.
  const tmp = join(dirname(ledgerPath), `.ledger.${process.pid}.tmp`);
  await writeFile(tmp, JSON.stringify(ledger, null, 2), "utf8");
  await rename(tmp, ledgerPath);
}

The upstream task calls recordRun at the very end of its processing, without fail. If it produced an artifact, it records status: "ok" with the content hash; if it was legitimately empty, status: "empty" with a reason; if it hit an unrecoverable failure, status: "error". The important part is recording not only success but also the "legitimate empty" explicitly. Without a record of the empty case, the downstream cannot tell it apart from "not-run."
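One way to make "at the very end, without fail" hard to forget is to route every exit path through a single recording call. The sketch below is an assumption about how that could look, not the article's upstream task: `runUpstream`, `Outcome`, and the `refresh` and `record` callbacks are hypothetical names, and `record` stands in for a call to `recordRun` with the task name, date, and timestamp filled in.

```typescript
type RunStatus = "ok" | "empty" | "error";

interface Outcome {
  status: RunStatus;
  fingerprint: string | null;
  note?: string;
}

// `refresh` does the real work and resolves to the artifact content,
// or to null when there was legitimately nothing new today.
async function runUpstream(
  refresh: () => Promise<string | null>,
  fingerprintOf: (content: string) => string,
  record: (outcome: Outcome) => Promise<void>
): Promise<Outcome> {
  let outcome: Outcome;
  try {
    const content = await refresh();
    outcome =
      content === null
        ? { status: "empty", fingerprint: null, note: "no new items today" }
        : { status: "ok", fingerprint: fingerprintOf(content) };
  } catch (e) {
    // Unrecoverable failure: still leave a record so downstream sees "error".
    const note = e instanceof Error ? e.message : String(e);
    outcome = { status: "error", fingerprint: null, note };
  }
  await record(outcome); // the single recording point for all three paths
  return outcome;
}
```

If the process is killed before `record` runs, no record is written, which is exactly the "not-run" state the downstream halts on.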

The downstream verifies its preconditions with a barrier

Once the ledger is in place, the downstream task can explicitly verify, at the top of its processing, whether the upstream it depends on is in the expected state today. That is the dependency barrier. The point is not merely to check existence, but to return halt / skip / proceed according to the three states above.

// barrier.ts
import { readFile } from "node:fs/promises";
import type { RunRecord } from "./ledger";
import { jstDate, fingerprintOf } from "./record";
 
export type Gate =
  | { decision: "proceed"; upstream: RunRecord }
  | { decision: "skip"; reason: string }
  | { decision: "halt"; reason: string };
 
export async function checkUpstream(
  ledgerPath: string,
  upstreamTask: string,
  opts: { expectArtifact?: string } = {}
): Promise<Gate> {
  const today = jstDate();
  let ledger: RunRecord[] = [];
  try {
    ledger = JSON.parse(await readFile(ledgerPath, "utf8"));
  } catch {
    return { decision: "halt", reason: "cannot read ledger (corrupt or missing)" };
  }
 
  const rec = ledger.find((r) => r.task === upstreamTask && r.date === today);
 
  // 1) not-run — halt so we don't run on yesterday's leftovers
  if (!rec) {
    return { decision: "halt", reason: `${upstreamTask} has not completed today` };
  }
  // 2) failed — the upstream artifact can't be trusted, halt
  if (rec.status === "error") {
    return { decision: "halt", reason: `${upstreamTask} failed: ${rec.note ?? ""}` };
  }
  // 3) legitimate empty — not an error, skip quietly
  if (rec.status === "empty") {
    return { decision: "skip", reason: `${upstreamTask} was legitimately empty today` };
  }
 
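  // 4) produced: verify the file downstream actually reads matches what upstream recorded
  if (opts.expectArtifact) {
    let actual: string | null = null;
    try {
      actual = fingerprintOf(await readFile(opts.expectArtifact, "utf8"));
    } catch {
      // unreadable artifact: handled by the halt below
    }
    // A missing file, or an "ok" record without a fingerprint, also halts
    if (actual === null || actual !== rec.fingerprint) {
      return {
        decision: "halt",
        reason: `artifact missing or hash mismatch (suspected stale clone or partial write)`,
      };
    }
  }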
  return { decision: "proceed", upstream: rec };
}

The fourth block in checkUpstream raises a plain "did it run today" check by a level. It matches what the upstream wrote to the ledger ("I produced a file with this hash") against the hash of the file the downstream actually reads from disk. That catches even environment drift — cases where the upstream succeeded but the downstream is looking at an old clone or a file still mid-sync. Because I operate across multiple machines and cloud sync, there really are moments when "what the upstream recorded" and "what the downstream sees on disk" diverge, and this hash match is the last safety net.
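The drift cases can be reproduced in isolation. This is a small sketch with made-up file contents; `matchesLedger` is a hypothetical helper, and the fingerprint function assumes the same truncated SHA-256 scheme as `fingerprintOf`.

```typescript
import { createHash } from "node:crypto";

// Same scheme as the ledger: first 16 hex chars of a SHA-256 digest.
function fingerprintOf(content: string): string {
  return createHash("sha256").update(content).digest("hex").slice(0, 16);
}

// True only when the local content is exactly what upstream recorded.
function matchesLedger(recorded: string | null, localContent: string): boolean {
  return recorded !== null && recorded === fingerprintOf(localContent);
}

const todaysContent = "headline A\nheadline B\n";
const recorded = fingerprintOf(todaysContent); // what upstream wrote to the ledger

console.log(matchesLedger(recorded, todaysContent)); // true
console.log(matchesLedger(recorded, "headline A\n")); // false (file still mid-sync)
console.log(matchesLedger(recorded, "yesterday's headline\n")); // false (stale clone)
```

Both sides have to hash the same representation of the file. Here that means the UTF-8 text, so an encoding or line-ending conversion between machines would also show up as a mismatch.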

Wiring it into the downstream task, and the discipline of failing loud

The entry point of the downstream task is just this. When the decision is halt, it exits without producing degraded output, leaving a clear reason in the log. In unattended operation, "loudly stopping" is safer than "quietly doing nothing."

// downstream.ts
import { checkUpstream } from "./barrier";
 
const LEDGER = "/data/pipeline/ledger.json";
 
async function main() {
  const gate = await checkUpstream(LEDGER, "daily-reference-and-ticker", {
    expectArtifact: "/data/reference/claudelab.md",
  });
 
  if (gate.decision === "halt") {
    console.error(`[BARRIER] halt: ${gate.reason}`);
    process.exit(1); // record as a failure in the scheduler log
  }
  if (gate.decision === "skip") {
    console.log(`[BARRIER] skip: ${gate.reason}`);
    return; // exit normally, produce nothing
  }
 
  // We only reach here when the upstream's freshness is guaranteed
  console.log(`[BARRIER] proceed: upstream finished ${gate.upstream.finishedAt}`);
  await generateArticleFromReference();
}
 
async function generateArticleFromReference() {
  // the actual generation
}
 
main().catch((e) => {
  console.error(`[FATAL] ${e.message}`);
  process.exit(1);
});

Returning exit(1) on halt quietly pulls its weight. It lands clearly as a "failure" in the scheduler's run log, turning a silent degradation into a failure you can notice. Meanwhile skip exits normally with exit(0) and simply produces nothing. If you get this backwards and exit(1) every time the day is empty, alerts keep firing on perfectly normal quiet days, and like the boy who cried wolf, you start missing the real anomalies. This is the same "legitimate empty" from earlier, now defended at the level of exit codes.
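If you would rather keep the exit-code policy in one testable place than spread it across `if` branches, a sketch like the following could work. `exitCodeFor` is a hypothetical helper, not part of the code above; the exhaustive `switch` means adding a fourth decision later becomes a compile error until someone chooses its exit code.

```typescript
type Decision = "proceed" | "skip" | "halt";

// Only "halt" should register as a failure in the scheduler's run log.
function exitCodeFor(decision: Decision): 0 | 1 {
  switch (decision) {
    case "halt":
      return 1; // loud: shows up as a failed run
    case "skip":
    case "proceed":
      return 0; // a quiet day is a normal day
    default: {
      // Compile-time exhaustiveness check for future decisions.
      const unhandled: never = decision;
      throw new Error(`unhandled decision: ${String(unhandled)}`);
    }
  }
}

console.log(exitCodeFor("halt")); // 1
console.log(exitCodeFor("skip")); // 0
```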

Operating decisions on a scheduler with no DAG

Everything so far assumes the job scheduler has no real dependency-graph (DAG) feature. Ideally you'd declare dependencies to a workflow engine, but as an indie developer running automated publishing across the several sites of Dolice Labs, the operational cost of a dedicated engine often isn't worth it. I've run mine as a plain collection of scheduled tasks that fire at staggered times. That is exactly why externalizing dependencies as data (the ledger) and having each task verify its preconditions autonomously has been the realistic approach.

Three things proved worthwhile in practice. First, leave a generous gap between upstream and downstream start times. The barrier is a safety net, not a mechanism that waits for the upstream to finish. Second, consolidate the ledger into a single source of truth that every task reads and writes; scattering per-site ledgers hides cross-cutting dependencies. Third, keep the habit of skimming the halt logs each day. The number of times the barrier stopped things is a good gauge of upstream instability: when halts start climbing, that's the signal to fix the upstream before touching the downstream.
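Skimming gets easier with a tally. As a rough sketch, assuming the scheduler log keeps the `[BARRIER] halt: <reason>` lines that the downstream task prints, a hypothetical `countHalts` helper can group halts by reason. The sample log lines below are made up for the example.

```typescript
// Count "[BARRIER] halt: <reason>" lines, grouped by reason.
function countHalts(logText: string): Map<string, number> {
  const counts = new Map<string, number>();
  for (const line of logText.split(/\r?\n/)) {
    const m = line.match(/\[BARRIER\] halt: (.*)$/);
    if (!m) continue; // proceed/skip lines and unrelated output are ignored
    const reason = m[1].trim();
    counts.set(reason, (counts.get(reason) ?? 0) + 1);
  }
  return counts;
}

// Made-up sample log covering several days.
const sampleLog = [
  "[BARRIER] proceed: upstream finished 2026-07-01T07:03:11+09:00",
  "[BARRIER] halt: daily-reference-and-ticker has not completed today",
  "[BARRIER] skip: daily-reference-and-ticker was legitimately empty today",
  "[BARRIER] halt: daily-reference-and-ticker has not completed today",
  "[BARRIER] halt: artifact hash mismatch (suspected stale clone or partial write)",
].join("\n");

for (const [reason, n] of countHalts(sampleLog)) {
  console.log(`${n}x ${reason}`);
}
// 2x daily-reference-and-ticker has not completed today
// 1x artifact hash mismatch (suspected stale clone or partial write)
```

Grouping by reason separates "the upstream keeps dying" from "sync keeps lagging", and those two point at different fixes.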

I think of the dependency barrier as the part that gives an unattended pipeline the ability to doubt. Not trusting the upstream looks like a cold design, but it has been the most honest way to protect output quality from quiet degradation.

Where to start

You don't need to route every task through a ledger at once. Start with the single pair where an upstream failure hurts most — say, the daily reference-data refresh and the generation task that reads it — and add recordRun and checkUpstream to just that one pair. One line at the end of the upstream, a few lines at the top of the downstream, and the quietest incident of all — running on yesterday's leftovers — turns into a failure you can notice, starting that same day. Once that earns its keep, extending the chain of dependencies onto the ledger one link at a time is plenty.

Thank you for reading to the end. I'd be glad if this gives a starting point to anyone else wrestling with the quiet failures of unattended operation.
