When Claude Declines a Request on Safety Grounds, What Should an Unattended Pipeline Return?

Reviewing the logs of a nightly job the next morning, I found one run whose ending looked like neither the usual error nor the usual artifact. No exception was raised. The HTTP status was 200. And yet the body held not the work I had asked for, but a polite refusal: "I can't help with that request." The pipeline read a returned response as success and wrote the refusal itself out as the deliverable.

Not an error, and not the completion I wanted. This safety-grounded decline is the third ending that unattended pipelines most easily drop. What a human beside the machine would spot at a glance slips silently past automation that only ever asks: success or failure?

In July 2026, Claude Fable 5 resumed worldwide availability, and with it came a new cybersecurity classifier meant to curb misuse of the top-tier model. Now that individuals routinely reach for frontier models, it has become genuinely more likely that legitimate work occasionally brushes against a safety judgment. That is exactly why it is worth treating a decline not as an exception, but as one of the normal branches your design should anticipate.

Two branches are not enough

Most pipelines treat the ending of a single call as a binary — success or failure. In reality you need to distinguish at least four states.

Ending	Typical signs	What to do
Normal completion	HTTP 200 / stop_reason is end_turn / body has the expected shape	Accept as the deliverable
Infrastructure failure	HTTP 429, 529, 5xx / timeout	Retry with exponential backoff
Safety decline	HTTP 200 but the body is a refusal instead of the requested work / stop_reason is refusal	Do not retry; move to a human review queue
Degraded / empty	HTTP 200 but the body is empty or cut off mid-way	Fail loudly via a done-condition assertion

Of these four, the infrastructure and degraded cases were covered in "Reading Claude API stop_reason Correctly: A Production Guide to end_turn, max_tokens, pause_turn, and refusal" and "Logged as success, but it produced nothing: stopping silent failures in Cowork scheduled tasks with end-of-run assertions." What I want to go one level deeper on here is the third one: the safety decline.

Don't conflate a decline with an infrastructure error

The awkward thing about a decline is that at the HTTP layer it looks like success. As long as you watch only the status code, a decline hides inside a "normal 200." To tell them apart you have to read three things together: the status, the stop_reason, and the body.

from __future__ import annotations  # lets "int | None" parse on Python < 3.10

from dataclasses import dataclass
from enum import Enum


class Outcome(Enum):
    OK = "ok"                    # a normal completion you may accept
    INFRA_ERROR = "infra_error"  # a transient failure you may retry
    DECLINED = "declined"        # declined on safety grounds
    DEGRADED = "degraded"        # empty or truncated


@dataclass
class RunResult:
    http_status: int | None  # None means no response arrived (timeout, connection error)
    stop_reason: str | None
    text: str


def classify(result: RunResult) -> Outcome:
    # 1. Infrastructure first: no response at all, rate limiting (429), or any 5xx such as 529.
    #    Other 4xx codes are not transient, so they are deliberately not treated as retryable.
    status = result.http_status
    if status is None or status == 429 or status >= 500:
        return Outcome.INFRA_ERROR

    # 2. The model returned an explicit refusal
    if result.stop_reason == "refusal":
        return Outcome.DECLINED

    # 3. Cut off at the token limit, empty, or extremely short = degraded
    if result.stop_reason == "max_tokens":
        return Outcome.DEGRADED
    if len(result.text.strip()) < 40:
        return Outcome.DEGRADED

    # 4. Everything else is a normal completion
    return Outcome.OK

The key move is to catch a refusal before the shape check and never drop it into the same bucket as an infrastructure error. Mix the two, and a machine will retry, over and over, something that actually needs human judgment. Retry is the remedy for a transient hiccup; it does nothing for a decline. Resend the same request under the same conditions and you get the same refusal back.
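As a minimal sketch of that separation, the loop below retries only a transient failure and returns at once on anything else, a decline included. The names run_with_backoff and send are hypothetical, and the plain string labels stand in for the Outcome values above so the block runs on its own.

```python
import random
import time
from typing import Callable

# Hypothetical labels standing in for the Outcome enum, so this sketch is self-contained.
INFRA_ERROR = "infra_error"
DECLINED = "declined"


def run_with_backoff(
    send: Callable[[], str],
    max_attempts: int = 4,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> tuple[str, int]:
    """Call send() until it returns something other than INFRA_ERROR.

    Returns (final_outcome, attempts_used). A decline comes back on the
    attempt that produced it and is never resent.
    """
    outcome = INFRA_ERROR
    for attempt in range(1, max_attempts + 1):
        outcome = send()
        if outcome != INFRA_ERROR:
            return outcome, attempt  # ok, declined, degraded: do not retry
        if attempt < max_attempts:
            # Exponential backoff with jitter: roughly 1s, 2s, 4s, ...
            delay = base_delay * 2 ** (attempt - 1)
            sleep(delay + random.uniform(0, delay / 2))
    return outcome, max_attempts  # still failing after the last attempt
```

Passing sleep as a parameter is only a convenience for testing the loop without waiting; in a real job the default time.sleep would be used.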

A decline that is not surfaced as an explicit stop_reason == refusal — a polite demurral inside the body — does happen in practice. If you want to catch it reliably while running unattended, verify a structural done-condition the deliverable must satisfy (for example, "contains at least two headings" or "has the required JSON keys") in addition to stop_reason, and hold any 200 that fails it as DEGRADED. Rather than having a machine adjudicate whether a body "means" a refusal, you stop on the observable fact that it "does not have the expected shape." That is a deliberate simplification, and a safe one.
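The two example conditions can be sketched as small shape checks. This is an illustration under assumptions, not a fixed recipe: the function names are hypothetical, and the heading check assumes the deliverable uses Markdown-style "#" headings.

```python
import json


def has_min_headings(text: str, minimum: int = 2) -> bool:
    """True if the text has at least `minimum` lines that start with '#'.

    Assumes Markdown-style headings; adjust for your own deliverable format.
    """
    headings = [line for line in text.splitlines() if line.lstrip().startswith("#")]
    return len(headings) >= minimum


def has_required_keys(text: str, required: set[str]) -> bool:
    """True if the text parses as a JSON object containing every required key."""
    try:
        data = json.loads(text)
    except json.JSONDecodeError:
        return False  # a prose refusal is not valid JSON, so it fails here
    return isinstance(data, dict) and required.issubset(data.keys())
```

A polite refusal in the body has no headings and does not parse as JSON, so it fails either check without the code ever having to interpret what the text says.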

The anti-pattern: auto-rewording a declined request to push it through

This is the single point I most want to land. When you receive a decline, it is tempting to wire in a remedy that mechanically rewords the prompt and resends until it goes through. Don't.

There are two reasons. First, if the decline was well-founded, engineering around it with automation runs against the intent of the safety design itself. What should be declined should stay declined. Second, even if it was a false positive on legitimate work, only a person who knows the context of that work can judge whether a rewording is warranted. Hand an unattended loop the job of "getting it through," and the most important step — confirming legitimacy — is exactly the one that goes missing.

Legitimate work does occasionally brush a judgment. The right response in that case is not to have a machine reword it, but for a person to re-issue the request with the concrete purpose and background attached. Return the decision to a human. The job of an unattended pipeline is not to push through, but to stop and hand off.

Move declines to a review queue

So how do you hand off after stopping? On a decline, move that run's input and output — in full — to a place a person can review later. Not accepted as a deliverable, not counted as success, but not lost either. This holding place is the review queue.

import hashlib
import json
from datetime import datetime, timezone
from pathlib import Path

REVIEW_DIR = Path("review_queue")


class TransientError(Exception):
    """A retryable infrastructure failure."""


class DefinitionOfDoneError(Exception):
    """The response did not meet the done-condition."""


def enqueue_for_review(task_id: str, prompt: str, result: RunResult) -> Path:
    REVIEW_DIR.mkdir(parents=True, exist_ok=True)
    # An idempotency key that folds repeated declines of the same input into one
    key = hashlib.sha256(f"{task_id}\n{prompt}".encode("utf-8")).hexdigest()[:16]
    path = REVIEW_DIR / f"{key}.json"

    record = {
        "task_id": task_id,
        "queued_at": datetime.now(timezone.utc).isoformat(),
        "outcome": "declined",
        "prompt": prompt,
        "response_text": result.text,
        "stop_reason": result.stop_reason,
    }
    # ensure_ascii=False emits raw non-ASCII text, so write the file as UTF-8 explicitly
    path.write_text(json.dumps(record, ensure_ascii=False, indent=2), encoding="utf-8")
    return path


def handle(task_id: str, prompt: str, result: RunResult) -> str:
    outcome = classify(result)
    if outcome is Outcome.DECLINED:
        enqueue_for_review(task_id, prompt, result)
        return "held"          # neither success nor failure, but "held"
    if outcome is Outcome.INFRA_ERROR:
        raise TransientError(result.http_status)  # send to retry
    if outcome is Outcome.DEGRADED:
        raise DefinitionOfDoneError(task_id)      # fail on done-condition
    return "ok"

Folding repeated declines of the same input into one record with an idempotency key keeps the queue from bloating night after night with identical entries. Review the queue once a week and you begin to see patterns in what gets declined. If a particular topic keeps brushing the judgment, that is a signal to handle that area by hand rather than unattended.
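For the weekly review, a small script that counts queued records per task can make those patterns visible. The sketch below assumes the JSON record layout written by enqueue_for_review above; summarize_queue is a hypothetical name.

```python
import json
from collections import Counter
from pathlib import Path


def summarize_queue(review_dir: Path) -> Counter:
    """Count held records per task_id across all JSON files in review_dir."""
    counts: Counter = Counter()
    for path in sorted(review_dir.glob("*.json")):
        try:
            record = json.loads(path.read_text(encoding="utf-8"))
        except (OSError, json.JSONDecodeError):
            continue  # skip unreadable or partially written files
        if isinstance(record, dict):
            counts[record.get("task_id", "unknown")] += 1
    return counts


if __name__ == "__main__":
    # Print the tasks that were declined most often first.
    for task_id, count in summarize_queue(Path("review_queue")).most_common():
        print(f"{count:4d}  {task_id}")
```

Because the idempotency key covers both the task and the prompt, a task that shows up many times here was declined on many distinct inputs, which is a stronger signal than one prompt failing every night.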

Another benefit of this design is that giving the pipeline a third return value, "held", makes its logs honest. Don't count declines as successes, and your success rate reflects reality; don't count them as failures, and you stop burning retries and alerts on them needlessly. The instinct to give a long unattended run a ceiling and a place to set things aside is the same one behind "Putting a Ceiling on the pause_turn Loop: Running Long Server Tools Safely Unattended."
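A morning summary along these lines might look like the sketch below. It assumes each run has been reduced to one status string; "ok" and "held" match the return values of handle, while "failed" is a hypothetical label for runs that ended in an exception, and summarize_run is a hypothetical name.

```python
from collections import Counter


def summarize_run(statuses: list[str]) -> dict[str, float]:
    """Tally per-task statuses ("ok", "held", "failed") and compute the success rate."""
    counts = Counter(statuses)
    total = len(statuses)
    return {
        "ok": counts["ok"],
        "held": counts["held"],
        "failed": counts["failed"],
        # Only genuine completions count as success; held runs are reported on their own line.
        "success_rate": counts["ok"] / total if total else 0.0,
    }
```

Reporting "held" as its own number keeps a night with one decline from looking either perfect or broken.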

Wrapping up, and a next step

A safety decline is not a fault for an unattended pipeline; it is one of the normal branches to plan for. Add a third state — "held" — to your two-way success/failure check, classify declines apart from infrastructure errors, never auto-reword to push through, and hand off to a review queue that returns the decision to a person. Wire in just these four things, and an ending that used to slip by silently will be plainly in front of you the next morning.

Start by picking the single scheduled job that matters most to you, and slip a classify-style check into one call. If it catches even one decline as "held," the mechanism is worth spreading to your other tasks. Running several sites unattended as an indie developer under Dolice, I found that once this holding place was set up, I could hand the nightly work over and rest easier. Thank you for reading.