I avoided LLM-based automated code review for a long time. The pattern I'd seen play out too many times: set up some AI reviewer, get 40 comments on a 20-line PR, spend more time dismissing noise than writing code.
Claude Code's GitHub PR trigger changed that experience enough to be worth writing about. The short version: CLAUDE.md lets you define what to look at and what to ignore, and that single mechanism makes the difference between useful and annoying. Here's the setup and what two weeks of real usage taught me.
What the PR Trigger Actually Does
The PR trigger runs Claude Code automatically when GitHub pull request events occur. Claude Code receives the diff, applies your review policy from CLAUDE.md, and posts review comments directly on the PR.
Supported trigger events:
pull_request— PR opened, synchronized (new push), or reopenedpull_request_review_comment— reply to an existing review commentissue_comment— any PR comment containing@claude
The @claude mention trigger is the one I use most. "Automatic comments on every PR" is noisier than it sounds in practice; "summon a review when I actually want one" fits real workflow better.
Setup
Prerequisites
- Owner or admin permissions on the target repository
ANTHROPIC_API_KEYfrom the Anthropic console- GitHub Actions enabled (the default for most repos)
Step 1: Add the API key as a secret
In your repository: Settings → Secrets and variables → Actions → New repository secret
Name it ANTHROPIC_API_KEY. If you use the GitHub CLI:
gh secret set ANTHROPIC_API_KEY --body "sk-ant-..."Step 2: Create the workflow file
Create .github/workflows/claude-review.yml:
name: Claude Code Review
on:
pull_request:
types: [opened, synchronize, reopened]
issue_comment:
types: [created]
jobs:
claude-review:
# Restrict to @claude mentions if you want on-demand only
if: |
github.event_name == 'pull_request' ||
(github.event_name == 'issue_comment' &&
contains(github.event.comment.body, '@claude') &&
github.event.issue.pull_request)
runs-on: ubuntu-latest
permissions:
contents: read
pull-requests: write
issues: write
steps:
- name: Checkout
uses: actions/checkout@v4
with:
fetch-depth: 0 # Required to access the full diff
- name: Run Claude Code Review
uses: anthropics/claude-code-action@v1
with:
anthropic_api_key: ${{ secrets.ANTHROPIC_API_KEY }}
review_style: "constructive" # constructive | strict | minimal
max_comments: "5" # Omit for unlimitedThe max_comments parameter is a useful safety valve when you're starting out. Five comments per PR keeps the review focused without overwhelming the diff view.
Step 3: Write your CLAUDE.md review policy
This is the most important step — and the one most guides skip. A CLAUDE.md at the repository root tells Claude Code what to focus on and what to skip. Without it, you'll get comprehensive but exhausting reviews that flag formatting choices alongside actual bugs.
# CLAUDE.md — Code Review Policy
## Focus on these
- **Type safety**: Are TypeScript types defined correctly? Any implicit `any`?
- **Error handling**: Are exceptions caught and handled, not silently swallowed?
- **Performance**: N+1 queries, unnecessary re-renders, unoptimized loops
- **Security**: User input that reaches a database or external API without validation
## Skip these
- Comment style and JSDoc formatting (handled by our style guide)
- Indentation and whitespace (Prettier runs on every commit)
- Personal naming preferences (as long as names are clear, they're fine)
## Auto-approve without comment
- `package.json` version bumps only
- `.gitignore` additions
- Typo fixes in documentationThe two-column structure — "focus on this" and "skip this" — is the key pattern. I've found that the "skip" list matters at least as much as the "focus" list for keeping false positives low.
What the First Real Review Looked Like
After setup, I opened a PR adding a new Next.js API Route. Claude Code automatically reviewed it and left three comments:
Comment 1:
The
errorobject is logged directly to console, which may expose stack traces in production. Considererror instanceof Error ? error.message : 'Unknown error'before passing to the response.
Comment 2:
An async function is called without
awaiton line 34. If this is intentional fire-and-forget, a comment explaining the intent would prevent future confusion.
Comment 3:
The return type is typed as
any. Adding a specific type parameter or using zod validation here would make the API contract explicit.
All three were legitimate. None were style preferences or nitpicks. The CLAUDE.md exclusions had filtered out the noise I expected, and what remained was the category of feedback I actually want from code review — the things I should have caught before opening the PR.
Using @claude Mentions
Once the workflow is configured, you can summon a review at any point by mentioning @claude in a PR comment:
@claude Can you analyze the performance implications of this change?
I'm specifically wondering about the caching strategy interaction.
@claude Between approach A (lines 45-60) and approach B (lines 62-75),
which would you recommend and why?
This on-demand pattern feels more natural than automatic reviews for all PRs. Even on a solo project — where you're both the PR author and the reviewer — being deliberate about when you ask for a second opinion keeps the feedback loop tight.
Things I Learned After Two Weeks
Cost
API cost per review depends on diff size. Small PRs (under 200 lines changed) cost roughly $0.01–$0.03. For 50 PRs per month on a personal project, that's well under a few dollars. Worth tracking if you have a large team with high PR volume.
Review latency
Large diffs can take 30–60 seconds to process. Not a problem for code quality reviews (the latency is less than making coffee), but worth knowing if you're using this in a fast-moving CI pipeline where job times are tracked.
Tuning the false positive rate
In the first week, I got roughly two "this is intentional but Claude flagged it" comments per day. Each time, I added an entry to the CLAUDE.md "skip these" section. After about 10 additions over two weeks, the false positive rate dropped to near zero. It's an iterative calibration process — expect to spend 15 minutes in the first week updating the policy, then almost nothing after that.
Branch protections pair well with this
If you set up a branch protection rule that requires the Claude review job to pass before merging, you effectively make code review non-optional without adding reviewer overhead. For a solo project where "review" otherwise means "merge it and see," this creates a useful forcing function.
For more advanced patterns combining HTTP hooks with CI triggers, the Claude Code HTTP Hooks and CI/CD integration guide covers multi-step orchestration in detail.
Starting Small
The least-friction way to try this: set up the @claude mention trigger only (remove the pull_request event from the workflow), merge the workflow file, then open any PR and comment @claude anything here worth a second look?
You'll have a working automated reviewer running in about 10 minutes — and the on-demand format means you're in control of when it runs until you decide you want fully automatic reviews.
A Note on What Automated Review Can and Can't Replace
After two weeks, my honest assessment: Claude Code's PR trigger is a useful first-pass filter, not a replacement for human review on anything that matters.
Where it earns its keep is catching the category of mistakes that are obvious in retrospect but easy to miss when you're heads-down writing code. The async-without-await, the unguarded error propagation, the quietly widened type — these are the things a second pair of eyes would catch immediately, but a tired solo developer misses at 11pm. For that specific use case, having an automatic reviewer that runs silently and posts a few targeted comments is genuinely valuable.
What it doesn't replace: architectural judgment, knowledge of your specific codebase history, understanding of the tradeoffs that led to a particular design decision, and the kind of "wait, have we considered X?" questions that come from a human who's been living with the system. For those conversations, you still need people.
The most productive framing I've settled on: Claude Code's PR review is like a spell-checker for logic errors. It catches things that are clearly wrong by the rules of the language and your stated policy. It doesn't catch whether your approach is right for the problem.
That's enough to make it worth setting up — especially for solo projects where the alternative is no review at all.