Cowork / 2026-06-27 / Advanced

Logged as success, but it produced nothing — stopping silent failures in Cowork scheduled tasks with end-of-run assertions

A Cowork scheduled task exits 0, yet not a single artifact was produced. Trusting the exit code alone hides this silent failure. Here is how to turn your definition of done into end-of-run assertions that fail loudly with an evidence log.

A task you believe is running on schedule keeps logging "SUCCESS", but you open the output folder and not a single file has appeared since last week. I'm an indie developer running several sites under Dolice, and if you run a handful of unattended Cowork scheduled tasks, you will hit this exact accident at least once. I did, one morning when I sat down to review a batch of logs and realized that a recurring job marked "success" for three days straight had not actually written a single line. My stomach dropped.

The cruel part is that nothing crashed. No exception was raised. The exit code was 0. The scheduler history was green. And yet the deliverable count was zero. This article is about catching that silent success: not by trusting the exit code, but by turning the definition of done itself into assertions that fail loudly.

An exit code does not promise that work happened

We unconsciously read "exit 0" as "it worked." But all an exit code guarantees is that the last command that ran returned 0 — not that the job you actually wanted got done. Those are two completely different propositions.

Silent failure tends to arrive by one of three routes.

Route	What happens	Why it stays exit 0
Wrong write target	You think you generated a file, but it landed in a stale temp path or a directory nothing downstream reads	`cat > file` itself succeeds. The contents just aren't where you meant
Empty input	You mistype the path of an input file, and processing proceeds on an empty string	`cat wrong-path` does fail, but inside a pipeline without `pipefail`, or in an unchecked command substitution, its status is discarded. Downstream "succeeds" on nothing
Commit that never landed	An unset git identity means `git commit` creates nothing, the unchecked script carries on, and the push goes green with "up to date"	There's nothing new to push, so the push itself counts as a success

What they share is that the status the scheduler finally sees is an honest 0. No command is lying about its own result; any failure along the way was simply never checked. And yet the whole thing failed. That is precisely why watching exit codes from above will never reveal it.
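To make the "empty input" route concrete, here is a minimal sketch in Python that drives a POSIX shell. It assumes `sh`, `cat`, and `wc` are available, and the mistyped file name is hypothetical. A pipeline reports the status of its last command unless `pipefail` is set, so the failed read disappears from view.

```python
import shlex
import subprocess
import tempfile
from pathlib import Path


def run_pipeline(input_path: Path) -> subprocess.CompletedProcess:
    """Run `cat <input> | wc -c` in a POSIX shell, without pipefail."""
    script = f"cat {shlex.quote(str(input_path))} | wc -c"
    return subprocess.run(["sh", "-c", script], capture_output=True, text=True)


with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical mistyped input path: the file was never created.
    missing = Path(tmp) / "typo-in-the-name.txt"
    result = run_pipeline(missing)

exit_code = result.returncode                  # status of wc, the last command
bytes_seen = int(result.stdout.strip())        # what downstream actually received
cat_complained = result.stderr.strip() != ""   # cat did report the missing file

print(f"exit={exit_code} bytes={bytes_seen} cat_complained={cat_complained}")
```

The complaint from `cat` goes to stderr, which an unattended job rarely surfaces, while the scheduler records only the final status.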

The first time this bit me was a freshly cloned repo where I had forgotten to set the git identity. git commit printed a warning and effectively did nothing; git push came back with "Everything up-to-date." The scheduler history stayed a clean green for three days while the remote gained not one line.

Write your definition of done out loud

The first step toward killing silent failure is not writing code. It is stating, as concrete observable facts, what it means for this job to have succeeded. If that stays vague, you have no way to decide what to assert.

For a recurring job that "reflects generated output into a repo," the definition of done decomposes like this:

The output file actually exists and its size is at least a floor
The expected count matches the real file count (for a JA/EN pair, both sides have the same number of files)
The local commit SHA has changed from its value before the job ran
The remote SHA and the local SHA match

The point is that each of these is a fact you can check from the outside, not a "should be done." Not "I committed" but "the SHA changed." Not "I wrote the file" but "there is a file of at least the floor size at that path." Once you can make that translation, the assertions write themselves.
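One way to hold that translation in your hands before writing any harness is to express each fact as a named check that returns both a verdict and the value it observed. This is a sketch only: `Check`, `file_at_least`, `counts_equal`, and `evaluate` are names made up for illustration, and the floor and counts are placeholder values.

```python
import tempfile
from dataclasses import dataclass
from pathlib import Path
from typing import Callable, List, Tuple

Observation = Tuple[bool, str]  # (did the condition hold?, observed value as text)


@dataclass
class Check:
    label: str
    observe: Callable[[], Observation]


def file_at_least(path: Path, floor: int) -> Observation:
    """Fact: there is a file of at least `floor` bytes at `path`."""
    if not path.is_file():
        return False, f"missing: {path}"
    size = path.stat().st_size
    return size >= floor, f"{path.name} = {size}B (floor {floor}B)"


def counts_equal(left: int, right: int) -> Observation:
    """Fact: two counts match, e.g. the JA and EN file counts."""
    return left == right, f"left={left} right={right}"


def evaluate(checks: List[Check]) -> List[Tuple[str, bool, str]]:
    """Run every check, never stopping at the first failure, so the evidence is complete."""
    return [(c.label, *c.observe()) for c in checks]


with tempfile.TemporaryDirectory() as tmp:
    out = Path(tmp) / "example.mdx"
    out.write_text("x" * 50, encoding="utf-8")  # deliberately below the floor
    results = evaluate([
        Check("file_min_size", lambda: file_at_least(out, 800)),
        Check("count_match:ja_en", lambda: counts_equal(3, 3)),
    ])

for label, ok, observed in results:
    print("[ OK ]" if ok else "[FAIL]", label, "|", observed)
```

Returning the observed value alongside the verdict is the design choice that matters: a failed check then explains itself without a rerun.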

Insert an end-of-run assertion harness

Once the definition of done is settled, place a single gate at the very end of the job that checks it mechanically. If even one condition is broken, refuse to write the success log and exit non-zero. That alone drags silent failure into the open.

#!/usr/bin/env bash
# verify_done.sh — verify a job's definition of done before it exits
# Usage: source at the top of the job, call assert_* after the work, close with finish_run
set -uo pipefail
 
EVIDENCE_LOG="${EVIDENCE_LOG:-/tmp/run_evidence_$(date +%s).log}"
FAILED=0
 
# Record failures instead of swallowing them. $1=condition, $2=observed value
fail() {
  FAILED=1
  printf '[FAIL] %s | %s\n' "$1" "$2" | tee -a "$EVIDENCE_LOG" >&2
}
pass() {
  printf '[ OK ] %s | %s\n' "$1" "$2" | tee -a "$EVIDENCE_LOG"
}
 
# Condition 1: file exists and is at least a byte floor
assert_file_min() {
  local path="$1" min="${2:-200}"
  if [ ! -f "$path" ]; then
    fail "file_exists" "missing: $path"; return
  fi
  # BSD/macOS wc pads its output with spaces; strip them so the log stays clean
  local size; size=$(wc -c < "$path" | tr -d '[:space:]')
  if [ "$size" -lt "$min" ]; then
    fail "file_min_size" "$path = ${size}B (< ${min}B)"; return
  fi
  pass "file_min_size" "$path = ${size}B"
}
 
# Condition 2: two counts match (e.g. a JA/EN pair)
assert_count_match() {
  local label="$1" a="$2" b="$3"
  if [ "$a" != "$b" ]; then
    fail "count_match:$label" "left=$a right=$b"; return
  fi
  pass "count_match:$label" "$a == $b"
}
 
# Condition 3: commit SHA advanced from its prior value
assert_sha_changed() {
  local before="$1" after="$2"
  if [ "$before" = "$after" ]; then
    fail "sha_changed" "commit did not advance: $after"; return
  fi
  pass "sha_changed" "$before -> $after"
}
 
# Close: if any FAIL occurred, exit non-zero
finish_run() {
  if [ "$FAILED" -ne 0 ]; then
    printf '\n🛑 Definition of done not met. Refusing to log SUCCESS.\n' >&2
    printf '   Evidence log: %s\n' "$EVIDENCE_LOG" >&2
    exit 1
  fi
  printf '\n✅ All conditions met.\n'
}

The caller looks like this. Capturing SHA_BEFORE before the commit is essential; without it you cannot decide afterward whether anything changed.

source verify_done.sh
 
OUT="content/articles/en/cowork/example.mdx"
SHA_BEFORE=$(git rev-parse HEAD)
 
# ... generate, commit, push here ...
 
SHA_AFTER=$(git rev-parse HEAD)
REMOTE_SHA=$(git rev-parse '@{u}')
 
assert_file_min "$OUT" 800
assert_count_match "ja_en" \
  "$(find content/articles/ja -name '*.mdx' | wc -l)" \
  "$(find content/articles/en -name '*.mdx' | wc -l)"
assert_sha_changed "$SHA_BEFORE" "$SHA_AFTER"
assert_count_match "local_remote_sha" "$SHA_AFTER" "$REMOTE_SHA"
 
finish_run   # success or failure is only decided here

What I like about this shape is that finish_run holds the right to write the success log. If anything upstream broke, execution simply never reaches the logging step. The worst path — the log marching forward on a job that only thinks it finished — is physically sealed off.

Why "don't swallow it" is the whole point

You might think set -e (exit on error) would handle this. But most silent failures never surface as a failing command, so set -e won't catch them. A pipeline whose first stage read nothing, a push with nothing to send: both exit 0.

set -e protects you only when a command explicitly fails; it sails straight past "the command succeeded but produced nothing." So the direction of the fix cannot be "propagate the error." It has to be "actively confirm that the work was produced." It's a shift in where the burden of proof sits: you make the job prove it succeeded.
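The same gap exists outside the shell. In Python, `subprocess.run(..., check=True)` plays roughly the role of set -e: it raises only on a non-zero exit. The sketch below pairs it with an active confirmation. `run_and_confirm`, `NothingProduced`, and the do-nothing generator step are hypothetical names for illustration.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


class NothingProduced(RuntimeError):
    """Raised when a step exited 0 but left no artifact behind."""


def run_and_confirm(cmd: list, artifact: Path, floor: int = 1) -> int:
    # Error propagation: raises CalledProcessError only on a non-zero exit.
    subprocess.run(cmd, check=True)
    # Active confirmation: the step has to prove it produced something.
    size = artifact.stat().st_size if artifact.is_file() else 0
    if size < floor:
        raise NothingProduced(f"{artifact} = {size}B (floor {floor}B)")
    return size


# Hypothetical generator: loops over an empty work list, writes nothing, exits 0.
quiet_step = [sys.executable, "-c", "for item in []: open(item, 'w')"]

with tempfile.TemporaryDirectory() as tmp:
    artifact = Path(tmp) / "out.txt"
    try:
        run_and_confirm(quiet_step, artifact)
        outcome = "confirmed"
    except NothingProduced as exc:
        outcome = f"caught: {exc}"

print(outcome)
```

The `check=True` line passes without complaint here; only the size check turns the empty run into a failure.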

This echoes the verification-before-completion discipline in Anthropic's internal skills: before you declare you're done, produce evidence that you're done. The reason we always keep an evidence log is so a human can later trace why it went red. Silent failures are hard to reproduce, so capturing the observed values at the moment of failure is the most valuable asset you can leave yourself.

Separate empty, partial, and double production

If you take silent failure seriously, two neighboring states also need attention. End-of-run assertions catch "zero output," but operations need a finer distinction.

State	Symptom	Remedy
Empty success	exit 0 but no artifact	End-of-run assertions (this article)
Partial success	only part generated, then ran out of steam	Count-match assertion + cleanup of half-built artifacts
Double production	a retry creates the same artifact twice	An idempotency key that checks "did I already do this?" first
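For the partial-success row, one possible shape for "cleanup of half-built artifacts" is to build into a staging directory and promote the files only when the count is complete. This is a swapped-in technique, not something the harness above does, and `build_then_promote` and its parameters are hypothetical.

```python
import shutil
import tempfile
from pathlib import Path
from typing import Callable


def build_then_promote(build: Callable[[Path], None], expected: int, final_dir: Path) -> bool:
    """Build into a staging directory; promote only a complete set, else clean up.

    Assumes final_dir.parent already exists.
    """
    staging = Path(tempfile.mkdtemp(prefix="staging_", dir=final_dir.parent))
    try:
        build(staging)
        produced = sorted(p for p in staging.iterdir() if p.is_file())
        if len(produced) != expected:
            return False  # partial run: nothing reaches final_dir
        final_dir.mkdir(parents=True, exist_ok=True)
        for p in produced:
            shutil.move(str(p), str(final_dir / p.name))
        return True
    finally:
        # Runs on success, on a short count, and on exceptions alike.
        shutil.rmtree(staging, ignore_errors=True)


def partial_build(d: Path) -> None:
    (d / "ja.mdx").write_text("ja", encoding="utf-8")  # EN side never gets written


def full_build(d: Path) -> None:
    (d / "ja.mdx").write_text("ja", encoding="utf-8")
    (d / "en.mdx").write_text("en", encoding="utf-8")


with tempfile.TemporaryDirectory() as tmp:
    final_dir = Path(tmp) / "articles"
    partial_ok = build_then_promote(partial_build, 2, final_dir)
    exists_after_partial = final_dir.exists()
    full_ok = build_then_promote(full_build, 2, final_dir)
    promoted = sorted(p.name for p in final_dir.iterdir())
    leftovers = [p.name for p in Path(tmp).iterdir() if p.name.startswith("staging_")]

print(partial_ok, exists_after_partial, full_ok, promoted, leftovers)
```

A retry after a partial run then starts from a clean slate instead of inheriting half a JA/EN pair.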

Double production is the trap that pairs badly with retries. Once you design things to go red on a failed assertion and re-run, you create duplicates in the case where "last time was actually half a success." To avoid that, put an idempotency check at the very start of the job.

import hashlib
import os
 
def idempotency_key(*parts: str) -> str:
    """Build a stable key from the input combination. Same input, same key."""
    raw = "\x1f".join(parts).encode("utf-8")
    return hashlib.sha256(raw).hexdigest()[:16]
 
def already_done(key: str, ledger: str = ".run_ledger") -> bool:
    """True if this input was completed before. One key per line in the ledger."""
    if not os.path.exists(ledger):
        return False
    with open(ledger, encoding="utf-8") as f:
        return any(line.strip() == key for line in f)
 
def mark_done(key: str, ledger: str = ".run_ledger") -> None:
    with open(ledger, "a", encoding="utf-8") as f:
        f.write(key + "\n")
 
# Usage: derive the key from inputs that uniquely identify the target
key = idempotency_key("cowork", "2026-06-27", "example-topic")
if already_done(key):
    print(f"⏭ Already completed: {key} — skipping")
else:
    # ... generation ...
    mark_done(key)   # write to the ledger only AFTER the conditions are met
    print(f"✅ Recorded completion: {key}")

Where you call mark_done matters. Call it only after the end-of-run assertions pass, never mid-generation. That way a run that went red never lands in the ledger, so the next retry correctly redoes it. Write to the ledger at the start of generation instead, and a job that dies midway gets falsely recorded as "done" and is skipped forever, a brand new silent failure of its own making.
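That ordering can be sketched as a single wrapper. `run_once` is a hypothetical helper, and `verify` stands in for whatever end-of-run assertions the job uses; the ledger format (one key per line) follows the listing above.

```python
import tempfile
from pathlib import Path
from typing import Callable


def run_once(key: str, generate: Callable[[], object],
             verify: Callable[[], bool], ledger: Path) -> str:
    """Skip if the ledger already holds key; record it only after verify() passes."""
    done = ledger.read_text(encoding="utf-8").split() if ledger.exists() else []
    if key in done:
        return "skipped"
    generate()
    if not verify():
        return "failed"  # nothing is written to the ledger, so a retry redoes the work
    with ledger.open("a", encoding="utf-8") as f:
        f.write(key + "\n")
    return "done"


with tempfile.TemporaryDirectory() as tmp:
    ledger = Path(tmp) / ".run_ledger"
    out = Path(tmp) / "out.txt"

    def verify() -> bool:
        return out.is_file() and out.stat().st_size > 0

    # First run produces nothing, the retry produces the file, the third is a duplicate.
    first = run_once("k1", lambda: None, verify, ledger)
    second = run_once("k1", lambda: out.write_text("ok", encoding="utf-8"), verify, ledger)
    third = run_once("k1", lambda: out.write_text("again", encoding="utf-8"), verify, ledger)
    final_text = out.read_text(encoding="utf-8")

print(first, second, third, final_text)
```

The failed first run leaves no trace in the ledger, so the retry does the work, and the third call is skipped before it can overwrite anything.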

Don't trust the log itself too much

One last small operational habit that punches above its weight: always attach observed values like counts and SHAs to the success log. A log that says only "SUCCESS" gives a human no way to spot a silent failure after the fact.

[13:02 JST] cowork-example
  Status: SUCCESS
  Files: ja=684 en=684 (match)
  Commit: a1b2c3d -> e4f5a6b
  Remote: e4f5a6b (in sync)

With a log like this, even if some unknown route slips past the assertions, a human can notice the anomaly in the numbers. If Files: ja=684 en=684 shows the same figure for days on end, that itself is a sign of silent failure. A machine gate plus a human glance, two layers, feels very different in practice.
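The human glance can be backed by a small check of its own. The sketch below assumes the job is expected to add at least one file within the window and that the counts have already been pulled out of the success logs. `stalled` and the sample histories are hypothetical.

```python
from typing import Sequence


def stalled(counts: Sequence[int], window: int = 3) -> bool:
    """True if the last `window` observed counts are all identical."""
    if len(counts) < window:
        return False  # not enough history to judge
    recent = counts[-window:]
    return len(set(recent)) == 1


# Hypothetical daily file counts taken from the "Files:" line of each success log.
moving = [681, 682, 683, 684]
frozen = [681, 682, 684, 684, 684]

moving_is_stalled = stalled(moving)
frozen_is_stalled = stalled(frozen)
print(moving_is_stalled, frozen_is_stalled)
```

A job that legitimately produces nothing on some days would need a longer window, so treat the default of three as a starting point rather than a rule.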

After switching to this design, the meaning of my morning habit of skimming the logs changed entirely. I used to glance and think "green, so we're fine." Now I look at whether the numbers are moving. Being green and the work actually progressing are, I have learned in my bones, two separate things.

Pick one of your own scheduled jobs and write out three observable facts that define "this has succeeded." Turning them into assertions can wait — that part comes easily afterward.
