Claude Code / 2026-05-05 / Advanced

BDD with Claude Code in Production: From Gherkin Scenario Generation to Cross-Team Test Culture

A production-ready guide to Behavior-Driven Development with Claude Code. Learn how to auto-generate Gherkin scenarios, implement step definitions, integrate with Playwright/Cucumber, and build a cross-team test culture — all with working code examples.

Have you ever tried introducing BDD (Behavior-Driven Development), only to hit a wall — scenarios you couldn't write, an explosion of step definitions to maintain, and a framework that only engineers ended up touching?

I've been there. Across multiple projects, I've attempted and then abandoned BDD. Learning Gherkin syntax was manageable, but writing scenarios that genuinely reflected business value from scratch was harder than expected, and the maintenance cost never felt worth it.

That changed when I started using Claude Code seriously. When you delegate scenario generation, step definitions, and test code automation to Claude Code, BDD transforms from something you write to something you cultivate. This guide walks you through that implementation with working, production-tested code.

What BDD Is — and Why Claude Code Changes Everything

BDD (Behavior-Driven Development) is a development methodology that describes application behavior in natural language, then uses that language as the specification for test code. Where TDD verifies code correctness, BDD documents business intent.

Using a DSL called Gherkin, scenarios look like this:

Feature: User Login
  Value: Only authenticated users can access the dashboard
 
  Scenario: Login succeeds with valid credentials
    Given the user has a registered account
    When they enter email "test@example.com" and password "SecurePass123"
    And they click the Login button
    Then they are redirected to the dashboard
    And the message "Welcome, Test User" is displayed
 
  Scenario: Login fails with incorrect password
    Given the user has a registered account
    When they enter email "test@example.com" and the wrong password "WrongPass"
    And they click the Login button
    Then the error message "Incorrect email or password" is displayed
    And they are not redirected to the dashboard

Writing these scenarios by hand becomes impractical as features grow in complexity. With Claude Code, you can auto-generate scenarios from requirements documents or user stories, and then automate the step definitions as well.
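To make the link between a Gherkin line and executable code concrete, here is a minimal sketch of how a runner can bind step text to a function. This is not Cucumber's actual implementation: `defineStep` and `runStep` are hypothetical names, and plain regular expressions stand in for Cucumber expressions.

```typescript
// Simplified illustration of step matching; not Cucumber's real internals.
type StepHandler = (...args: string[]) => string;

interface StepDefinition {
  pattern: RegExp;
  handler: StepHandler;
}

const registry: StepDefinition[] = [];

// Register a pattern and the function to run when a step matches it.
function defineStep(pattern: RegExp, handler: StepHandler): void {
  registry.push({ pattern, handler });
}

// Find the first definition whose pattern matches and call it with the captures.
function runStep(stepText: string): string {
  for (const def of registry) {
    const match = def.pattern.exec(stepText);
    if (match) {
      return def.handler(...match.slice(1));
    }
  }
  throw new Error(`Undefined step: ${stepText}`);
}

// The quoted values in the scenario become function arguments.
defineStep(
  /^they enter email "([^"]*)" and password "([^"]*)"$/,
  (email, password) => `fill email=${email} password=${password}`
);
```

A step with no matching definition throws, which mirrors the "undefined step" feedback a real runner gives you when a scenario is ahead of its implementation.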

Project Setup

Installing Required Packages

Set up a BDD environment in your Next.js project:

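# Build a BDD environment with Playwright + Cucumber.js
# (ts-node and typescript are required by the ts-node/register hook used in the Cucumber config)
npm install --save-dev \
  @cucumber/cucumber \
  @playwright/test \
  playwright \
  ts-node \
  typescript \
  @types/node
 
# Install Playwright browsers
npx playwright install chromium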

Organize your project structure like this:

project-root/
├── features/                    # Gherkin scenario files
│   ├── auth/
│   │   └── login.feature
│   ├── dashboard/
│   │   └── overview.feature
│   └── support/
│       └── world.ts             # Cucumber World setup
├── steps/                       # Step definitions
│   ├── auth/
│   │   └── login.steps.ts
│   └── common/
│       └── navigation.steps.ts
└── cucumber.js                  # Cucumber configuration

Basic cucumber.js configuration:

// cucumber.js
module.exports = {
  default: {
    requireModule: ['ts-node/register'],
    require: ['steps/**/*.ts', 'features/support/**/*.ts'],
    format: [
      'progress-bar',
      'json:reports/cucumber-report.json',
      'html:reports/cucumber-report.html'
    ],
    formatOptions: { snippetInterface: 'async-await' },
    worldParameters: {
      baseUrl: process.env.BASE_URL || 'http://localhost:3000',
    },
    // Step timeouts are not a config key: call setDefaultTimeout(30000)
    // from '@cucumber/cucumber' in a support file instead.
    retry: 1,
  },
};

Auto-Generating Gherkin Scenarios with Claude Code

This is where Claude Code really earns its keep. Pass requirements documents or user stories to Claude Code, and it generates BDD-appropriate scenarios automatically.

Writing BDD Guidelines in CLAUDE.md

<!-- .claude/CLAUDE.md -->
## BDD Scenario Generation Guidelines
 
When implementing new features or modifying existing ones,
always create BDD scenarios first using the following process.
 
### Gherkin Writing Rules
- Feature descriptions must include a one-sentence "underlying value"
- Scenario titles should make the condition and outcome immediately clear
- Each Given/When/Then step should describe only one action per line
- Use data tables and Examples to keep scenarios concise
- Use business terminology — never include technical implementation details
 
### Generation Commands
- `bdd generate <feature-name>` to generate scenarios
- `bdd steps <feature-file>` to generate step definitions
- `bdd review` for quality review of existing scenarios

A sample Claude Code session for scenario generation:

# Example Claude Code interaction
You: Generate BDD scenarios for a shopping cart feature.
     Requirements: add/remove items, change quantity, calculate total price.

Claude Code: [Generates features/cart/cart.feature]

Feature: Shopping Cart
  Value: Users can temporarily store desired products and review the total price

  Background:
    Given the user is logged in
    And product "Wireless Earbuds Pro" is in stock with quantity 10 at price $128.00
    And product "USB Cable" is in stock with quantity 50 at price $8.80

  Scenario: Adding a product to the cart
    When the user opens the product page for "Wireless Earbuds Pro"
    And they click the "Add to Cart" button
    Then the cart icon shows the number "1"
    And opening the cart page shows 1 "Wireless Earbuds Pro"
    And the cart total shows $128.00

  Scenario: Changing the quantity of a cart item
    Given "Wireless Earbuds Pro" is in the cart with quantity 1
    When they change the quantity to "3"
    Then the cart total shows $384.00

  Scenario Outline: Adding multiple products to the cart
    Given the cart is empty
    When they add the following products to the cart:
      | Product Name            | Qty    |
      | Wireless Earbuds Pro    | <qty1> |
      | USB Cable               | <qty2> |
    Then the cart total shows $<expected>

    Examples:
      | qty1 | qty2 | expected |
      | 1    | 1    | 136.80   |
      | 2    | 3    | 282.40   |
      | 0    | 5    | 44.00    |

  Scenario: Error is shown when quantity exceeds stock
    Given "Wireless Earbuds Pro" is in the cart with quantity 9
    When they try to change the quantity to "11"
    Then the error message "Cannot exceed available stock (Available: 10)" is displayed
    And the cart quantity remains at 9

Writing scenarios this thorough by hand is tedious, and edge cases such as insufficient stock or zero quantities are easy to miss. When you delegate generation to Claude Code, those cases tend to show up in the first draft.
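Generated Examples tables are still worth a quick arithmetic check before you trust them. The sketch below recomputes the expected totals from the Background prices. It works in integer cents to sidestep floating-point rounding; `PRICES_IN_CENTS` and `expectedTotal` are hypothetical names for illustration only.

```typescript
// Unit prices from the Background section, stored as integer cents
// to avoid floating-point rounding (e.g. 0.1 + 0.2 !== 0.3).
const PRICES_IN_CENTS: Record<string, number> = {
  'Wireless Earbuds Pro': 12800,
  'USB Cable': 880,
};

interface CartLine {
  product: string;
  qty: number;
}

// Returns the cart total formatted the way the scenarios write it, e.g. "136.80".
function expectedTotal(lines: CartLine[]): string {
  let cents = 0;
  for (const line of lines) {
    const price = PRICES_IN_CENTS[line.product];
    if (price === undefined) {
      throw new Error(`Unknown product: ${line.product}`);
    }
    cents += price * line.qty;
  }
  return (cents / 100).toFixed(2);
}

// One call per row of the Examples table above.
const rowTotals = [
  expectedTotal([{ product: 'Wireless Earbuds Pro', qty: 1 }, { product: 'USB Cable', qty: 1 }]),
  expectedTotal([{ product: 'Wireless Earbuds Pro', qty: 2 }, { product: 'USB Cable', qty: 3 }]),
  expectedTotal([{ product: 'Wireless Earbuds Pro', qty: 0 }, { product: 'USB Cable', qty: 5 }]),
];
```

For the three rows shown, `rowTotals` comes out as 136.80, 282.40, and 44.00, matching the `expected` column.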

Auto-Generating Step Definitions

Once scenarios are in place, step definitions come next. Pass the .feature file to Claude Code and it generates TypeScript step definitions automatically.

// steps/cart/cart.steps.ts
// Auto-generated step definitions by Claude Code (with error handling)

import { Given, When, Then, DataTable } from '@cucumber/cucumber';
import { expect } from '@playwright/test';
import type { CustomWorld } from '../../features/support/world';

Given('the cart is empty', async function(this: CustomWorld) {
  // Clear the cart via API
  const response = await this.page.request.post('/api/cart/clear', {
    headers: { Authorization: `Bearer ${this.authToken}` }
  });

  if (!response.ok()) {
    throw new Error(`Failed to clear cart: ${response.status()} ${await response.text()}`);
  }

  // Navigate to cart page and verify it's empty
  await this.page.goto('/cart');
  await expect(this.page.getByTestId('cart-empty-message')).toBeVisible();
});

When('the user opens the product page for {string}', async function(this: CustomWorld, productName: string) {
  await this.page.goto('/products');

  const productLink = this.page.getByRole('link', { name: productName });
  await expect(productLink).toBeVisible({ timeout: 5000 });
  await productLink.click();

  await this.page.waitForLoadState('networkidle');
  this.currentProductName = productName;
});

When('they click the {string} button', async function(this: CustomWorld, buttonText: string) {
  const button = this.page.getByRole('button', { name: buttonText });

  await expect(button).toBeEnabled({ timeout: 3000 });

  // Start listening before the click so a fast response is not missed.
  // Note: this step assumes the button triggers a /api/cart request.
  await Promise.all([
    this.page.waitForResponse(
      response => response.url().includes('/api/cart') && response.status() === 200,
      { timeout: 5000 }
    ),
    button.click(),
  ]);
});

Then('the cart icon shows the number {string}', async function(this: CustomWorld, count: string) {
  const cartBadge = this.page.getByTestId('cart-badge');
  await expect(cartBadge).toBeVisible();
  await expect(cartBadge).toHaveText(count);
});

When('they add the following products to the cart:', async function(this: CustomWorld, dataTable: DataTable) {
  const items = dataTable.hashes();

  for (const item of items) {
    const qty = parseInt(item['Qty'], 10);
    if (qty === 0) continue;

    await this.page.goto('/products');
    const productLink = this.page.getByRole('link', { name: item['Product Name'] });
    await productLink.click();
    await this.page.waitForLoadState('networkidle');

    if (qty > 1) {
      const qtyInput = this.page.getByRole('spinbutton', { name: 'Quantity' });
      await qtyInput.fill(String(qty));
    }

    // Register the response wait before clicking to avoid a race
    await Promise.all([
      this.page.waitForResponse(
        response => response.url().includes('/api/cart') && response.status() === 200,
        { timeout: 5000 }
      ),
      this.page.getByRole('button', { name: 'Add to Cart' }).click(),
    ]);
  }
});

// {float} (not {int}) so that totals such as $128.00 and $136.80 match
Then('the cart total shows ${float}', async function(this: CustomWorld, expectedPrice: number) {
  await this.page.goto('/cart');

  const totalElement = this.page.getByTestId('cart-total-price');
  await expect(totalElement).toBeVisible();

  const priceText = await totalElement.textContent();
  const actualPrice = parseFloat(priceText?.replace(/[^0-9.]/g, '') || '0');

  expect(actualPrice).toBeCloseTo(expectedPrice, 2);
});

Error handling is easy to overlook in step definitions. The generated code above includes waitForResponse, explicit timeouts, and descriptive error messages, which are exactly the parts I tend to skip when writing by hand.
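One spot where the cart total step is still lenient is price parsing: a missing or non-numeric total silently becomes 0. A stricter helper could look like the sketch below. `parsePrice` is a hypothetical name, and the code assumes US-style formatting (comma as thousands separator, dot as decimal point).

```typescript
// Parse a displayed price such as "$1,280.00" into a number.
// Throws instead of silently returning 0 so a missing total fails loudly.
function parsePrice(text: string | null): number {
  if (text === null || text.trim() === '') {
    throw new Error('Price element has no text');
  }
  // Keep digits and the decimal point; drops "$", "," and whitespace.
  const cleaned = text.replace(/[^0-9.]/g, '');
  const value = Number(cleaned);
  if (cleaned === '' || isNaN(value)) {
    throw new Error(`Could not parse price from "${text}"`);
  }
  return value;
}
```

In the step definition, `parsePrice(await totalElement.textContent())` would replace the inline `parseFloat` expression, so a broken total surfaces as a parsing error rather than a confusing "expected 128 but got 0".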

Designing the World Object

Cucumber's World is the mechanism for sharing state between steps. Designing it well dramatically improves test readability and reusability.

// features/support/world.ts
import {
  World,
  IWorldOptions,
  setWorldConstructor,
  Before,
  After,
  BeforeAll,
  AfterAll,
  Status,
} from '@cucumber/cucumber';
import { Browser, BrowserContext, Page, chromium } from '@playwright/test';

export interface CustomWorld extends World {
  browser: Browser;
  context: BrowserContext;
  page: Page;
  authToken: string;
  currentProductName?: string;
  testData: Record<string, any>;
}

class PlaywrightWorld extends World implements CustomWorld {
  browser!: Browser;
  context!: BrowserContext;
  page!: Page;
  authToken: string = '';
  currentProductName?: string;
  testData: Record<string, any> = {};

  constructor(options: IWorldOptions) {
    super(options);
  }
}

// One browser process is shared by all scenarios; each scenario gets its own context
let sharedBrowser: Browser;

BeforeAll(async function() {
  sharedBrowser = await chromium.launch({
    headless: process.env.HEADED !== 'true',
    slowMo: process.env.SLOW_MO ? parseInt(process.env.SLOW_MO, 10) : 0,
  });
});

AfterAll(async function() {
  await sharedBrowser?.close();
});

Before(async function(this: CustomWorld) {
  // Create an isolated context per test (cookies and storage are separate)
  this.context = await sharedBrowser.newContext({
    baseURL: process.env.BASE_URL || 'http://localhost:3000',
    viewport: { width: 1280, height: 720 },
    locale: 'en-US',
  });
  this.page = await this.context.newPage();

  // Log browser console errors
  this.page.on('console', (msg) => {
    if (msg.type() === 'error') {
      console.error(`Browser Console Error: ${msg.text()}`);
    }
  });
});

After(async function(this: CustomWorld, scenario) {
  // On failure, save one screenshot to disk and attach the same image to the report
  if (scenario.result?.status === Status.FAILED) {
    const screenshotPath = `reports/screenshots/${scenario.pickle.name.replace(/\s/g, '_')}.png`;
    const screenshot = await this.page.screenshot({ path: screenshotPath, fullPage: true });
    this.attach(screenshot, 'image/png');
  }

  await this.context?.close();
});

setWorldConstructor(PlaywrightWorld);
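The After hook builds the screenshot path by replacing whitespace in the scenario name, which leaves characters such as quotes or slashes in the file name. A slightly more defensive version might look like the sketch below; `screenshotFileName` is a hypothetical helper, not part of Cucumber or Playwright.

```typescript
// Turn a scenario name into a filesystem-safe file name.
function screenshotFileName(scenarioName: string): string {
  const slug = scenarioName
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, '_') // collapse anything outside a-z and 0-9 into "_"
    .replace(/^_+|_+$/g, '');    // trim leading and trailing "_"
  // Fall back to a generic name if nothing usable is left.
  return `${slug || 'scenario'}.png`;
}
```

Rows of a Scenario Outline may share the same scenario name, so in a real setup you would likely append something unique (a timestamp, for example) to keep screenshots from overwriting each other.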

Integrating with CI/CD Pipelines

Here's a GitHub Actions setup for running BDD tests automatically:

# .github/workflows/bdd-tests.yml
name: BDD Tests
 
on:
  push:
    branches: [main, develop]
  pull_request:
    branches: [main]
 
jobs:
  bdd-test:
    runs-on: ubuntu-latest
    
    services:
      postgres:
        image: postgres:16
        env:
          POSTGRES_PASSWORD: testpassword
          POSTGRES_DB: test_db
        # Map the port so steps running on the runner can reach localhost:5432
        ports:
          - 5432:5432
        options: >-
          --health-cmd pg_isready
          --health-interval 10s
          --health-timeout 5s
          --health-retries 5
 
    steps:
      - uses: actions/checkout@v4
      
      - name: Setup Node.js
        uses: actions/setup-node@v4
        with:
          node-version: '22'
          cache: 'npm'
      
      - name: Install dependencies
        run: npm ci
      
      - name: Install Playwright browsers
        run: npx playwright install --with-deps chromium
      
      - name: Build application
        run: npm run build
        env:
          DATABASE_URL: postgresql://postgres:testpassword@localhost:5432/test_db
      
      - name: Run database migrations
        run: npm run db:migrate
        env:
          DATABASE_URL: postgresql://postgres:testpassword@localhost:5432/test_db
      
      - name: Start application
        run: npm start &
        env:
          DATABASE_URL: postgresql://postgres:testpassword@localhost:5432/test_db
          PORT: 3000
      
      - name: Wait for application startup
        run: npx wait-on http://localhost:3000 --timeout 30000
      
      - name: Run BDD tests
        run: npx cucumber-js
        env:
          BASE_URL: http://localhost:3000
      
      - name: Upload test reports
        uses: actions/upload-artifact@v4
        if: always()
        with:
          name: cucumber-reports
          path: reports/
          retention-days: 14
      
      - name: Upload failure screenshots
        uses: actions/upload-artifact@v4
        if: failure()
        with:
          name: failure-screenshots
          path: reports/screenshots/

Common Pitfalls and How to Handle Them

Here are four places where BDD adoption typically goes wrong.

1. Scenarios become too technical

# ❌ Too technical (avoid this)
When the user sends a POST request to /api/auth/login
And the response returns 200 OK with a JWT token
 
# ✅ Business-oriented scenario
When the user enters valid credentials and clicks Login
Then the dashboard is displayed

Claude Code can generate implementation-heavy scenarios if you don't explicitly tell it to use a business perspective. Make your writing rules clear in CLAUDE.md.
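A lightweight automated check can back up those writing rules. The sketch below flags step lines containing terms that usually signal implementation detail. The pattern list is my own assumption and will need tuning per project; `findTechnicalSteps` is a hypothetical helper.

```typescript
// Hypothetical list of terms that usually signal implementation detail.
const TECHNICAL_PATTERNS: RegExp[] = [
  /\b(GET|POST|PUT|PATCH|DELETE)\b/, // HTTP verbs (case-sensitive on purpose)
  /\/api\//i,
  /\bJWT\b/i,
  /\b[1-5]\d{2} (OK|Created|Not Found)\b/i,
  /\b(SQL|JSON|endpoint|database)\b/i,
];

interface Finding {
  line: number; // 1-based line number in the feature text
  text: string;
}

function findTechnicalSteps(featureText: string): Finding[] {
  const findings: Finding[] = [];
  featureText.split('\n').forEach((raw, index) => {
    const text = raw.trim();
    // Only inspect step lines, not titles, comments, or tables.
    if (!/^(Given|When|Then|And|But)\b/.test(text)) return;
    if (TECHNICAL_PATTERNS.some((pattern) => pattern.test(text))) {
      findings.push({ line: index + 1, text });
    }
  });
  return findings;
}
```

Run against the two examples above, this flags the POST request and JWT lines and leaves the business-oriented steps alone.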

2. Background sections become bloated

When the Background section contains 10+ lines of preconditions, readability drops sharply. Asking Claude Code to "refactor this Background to three lines or fewer" typically produces a proposal to extract test data into fixtures.

3. Inconsistent step granularity

Writing "Login" as one step in some scenarios and three steps in others makes maintenance painful. Organizing shared steps in a common/ folder and periodically asking Claude Code to "identify what can be extracted as shared steps" is an effective practice.
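Before asking Claude Code for extraction candidates, a rough first pass can be scripted. The sketch below lists step texts that appear in more than one feature file. It takes file contents as an in-memory map to stay self-contained; reading the files from disk and the name `findSharedStepCandidates` are left as assumptions.

```typescript
// Map of file path -> feature file contents (read from disk in real use).
type FeatureFiles = Record<string, string>;

// Return step texts (keyword stripped) that appear in more than one file.
function findSharedStepCandidates(files: FeatureFiles): string[] {
  // Null-prototype object so step texts never collide with built-in keys.
  const filesByStep: Record<string, string[]> = Object.create(null);

  for (const path of Object.keys(files)) {
    for (const raw of files[path].split('\n')) {
      const match = /^\s*(?:Given|When|Then|And|But)\s+(.+?)\s*$/.exec(raw);
      if (!match) continue;
      const step = match[1];
      const seen = filesByStep[step] || (filesByStep[step] = []);
      if (seen.indexOf(path) === -1) seen.push(path);
    }
  }

  return Object.keys(filesByStep)
    .filter((step) => filesByStep[step].length > 1)
    .sort();
}
```

This only catches exact duplicates. Near-duplicates such as "the user logs in" versus "the user is logged in" are where a review by Claude Code adds more value.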

4. Dealing with flaky tests

In modern web apps with heavy async processing, timing-dependent test failures are common. Here's the pattern Claude Code recommends:

// ❌ Prone to flakiness
await this.page.click('[data-testid="submit-button"]');
await this.page.waitForTimeout(1000); // Fixed waits are fragile
 
// ✅ Event-driven waiting
await Promise.all([
  this.page.waitForResponse(
    response => response.url().includes('/api/') && response.status() < 400
  ),
  this.page.click('[data-testid="submit-button"]')
]);
 
// ⚠️ Network idle as a fallback (Playwright's docs discourage relying on it; prefer the response wait above)
await this.page.click('[data-testid="submit-button"]');
await this.page.waitForLoadState('networkidle', { timeout: 10000 });

Enabling Non-Engineers to Participate

The original purpose of BDD is for engineers and business stakeholders to share the same scenarios as a common language. Here's a workflow that uses Claude Code to bring non-engineers into that process.

Scenario Creation Flow for POs and Product Teams

1. Have them write requirements in bullet points (natural language is fine)
2. Use Claude Code to convert to Gherkin
3. Have the PO review and revise (they shouldn't need to touch the technical parts)
4. Engineers implement the step definitions

# Requirements written by PO (natural language)
"Users should be able to change their profile photo.
Only JPEG and PNG formats. Under 5MB. Changes reflected immediately."

# Gherkin generated by Claude Code
Feature: Profile Photo Update

  Scenario: Profile photo changes successfully with a valid image
    Given the user is on the profile settings page
    When they upload a 2MB JPEG file
    Then the profile photo updates to the new image
    And the change is immediately reflected in the header icon

  Scenario: Images over 5MB cannot be uploaded
    Given the user is on the profile settings page
    When they try to upload a 6MB PNG file
    Then the error message "File size must be 5MB or less" is displayed
    And the profile photo is not changed

  Scenario: Unsupported file formats cannot be uploaded
    Given the user is on the profile settings page
    When they try to upload a GIF file
    Then the error message "Please select a JPEG or PNG file" is displayed

The key here is maintaining a state where POs can read and modify scenarios. Keeping implementation details out of scenarios — which you can explicitly instruct Claude Code to do — makes it much easier for non-engineers to understand what's being tested.
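To show what these three scenarios pin down on the implementation side, here is a hypothetical validation function that would satisfy them. The function name, the MIME types, and the reading of "5MB" as 5 MiB are all assumptions for illustration, not the article's production code.

```typescript
const MAX_BYTES = 5 * 1024 * 1024; // assumes "5MB" means 5 MiB
const ALLOWED_TYPES = ['image/jpeg', 'image/png'];

interface UploadCandidate {
  mimeType: string;
  sizeBytes: number;
}

// Returns an error message, or null when the file is acceptable.
function validateProfilePhoto(file: UploadCandidate): string | null {
  if (ALLOWED_TYPES.indexOf(file.mimeType) === -1) {
    return 'Please select a JPEG or PNG file';
  }
  // The PO wrote "under 5MB" but the error text says "5MB or less";
  // this sketch follows the error text and accepts exactly 5MB.
  if (file.sizeBytes > MAX_BYTES) {
    return 'File size must be 5MB or less';
  }
  return null;
}
```

Writing the check out exposes a gap: the requirement and the error message disagree about a file of exactly 5MB, and none of the scenarios cover that boundary. That is the kind of question worth taking back to the PO as an extra scenario.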

Periodic Quality Reviews with Claude Code

Running periodic quality checks on existing scenarios prevents drift and bloat:

# Example Claude Code review command
# .claude/commands/bdd-review.md
 
## BDD Scenario Quality Review
 
Please review the entire features/ directory with the following criteria:
 
1. Do any scenarios contain technical implementation details?
2. Are any Background sections overloaded with preconditions?
3. Are there repeated steps across multiple scenarios? (consolidation opportunity)
4. Are there missing edge cases? (boundary values, error cases)
5. Are scenario names written in the "condition → outcome" format?
6. Is any single scenario testing more than one behavior? (single responsibility)
 
Present improvement suggestions with rewritten scenario examples.

Long-Term Tips for Sustaining BDD

A few things that help BDD stick over time.

Manage test execution time: As scenarios multiply, execution time grows. Tagging scenarios with @smoke, @regression, and @critical — then running only smoke tests per PR and full regression on a schedule — is a practical approach.

@smoke @auth
Scenario: Login succeeds with valid credentials
  ...
 
@regression @cart
Scenario: Error is shown when quantity exceeds stock
  ...
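One way to wire tags into CI is to derive the cucumber-js arguments from the event that triggered the run. The sketch below is a minimal version of that idea. `--tags` is a real cucumber-js option, but the `Trigger` type, the function name, and the choice of tags for plain pushes are my own assumptions.

```typescript
type Trigger = 'pull_request' | 'schedule' | 'push';

// Build cucumber-js CLI arguments for a given CI trigger.
function cucumberArgsFor(trigger: Trigger): string[] {
  switch (trigger) {
    case 'pull_request':
      // Fast feedback: smoke scenarios only.
      return ['--tags', '@smoke'];
    case 'schedule':
      // Full regression: no tag filter, run every scenario.
      return [];
    default:
      // Assumed policy for direct pushes: smoke plus critical paths.
      return ['--tags', '@smoke or @critical'];
  }
}
```

Passing the result as an argument array (for example to Node's `child_process.spawn`) avoids shell-quoting problems with tag expressions that contain spaces.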

Don't let scenario debt accumulate: Deferring scenario updates during feature changes quickly creates drift from reality. Including "update related scenarios" as a PR merge requirement keeps scenarios accurate.

Set performance targets: In my projects, I aim for smoke tests to finish within 3 minutes and full regression within 20 minutes. When those thresholds look threatened, it's the signal to revisit parallel execution settings.

Your First Step

BDD isn't about "building the framework and calling it done" — it's something a team grows together over time. The best starting point is writing a single scenario for the feature you're currently working on. Just ask Claude Code "please write a Gherkin scenario for this feature" and you'll get a solid result to start from.

Step definitions, CI integration — all of that can come later. The first step is simply getting a scenario to exist. Five minutes is enough to get started. Give it a try in your current project today.
