
@tank/bdd-e2e-testing

1.1.0

BDD end-to-end testing against real systems. Covers web apps (Playwright), libraries (pytest-bdd + Docker), APIs, CLIs, message queues. Gherkin writing, step definitions, Page Objects, Screenplay, 3-layer architecture, CI/CD, multi-language (TypeScript, Python, Java, .NET). Triggers: BDD test, Gherkin, Cucumber, feature file, Given When Then, playwright-bdd, pytest-bdd, Behave, Cucumber-JVM, Serenity BDD, Reqnroll, Example Mapping, Three Amigos, living documentation, BDD setup, BDD architecture.



BDD E2E Testing

Hard Rules

These are non-negotiable. Violating any of these means the work is wrong.

  1. ALL tests go in .bdd/ at project root. Never tests/, e2e/, __tests__/, or any other location. The .bdd/ directory is the single source of truth for all BDD test artifacts.

  2. ZERO mocks. ZERO stubs. ZERO fakes. Every scenario runs against the REAL system with REAL dependencies. For web apps: real browser, real backend, real DB — never page.route(), MSW, or nock. For libraries: real infrastructure in Docker (real RabbitMQ, real Redis, real DB) — never mock the transport or protocol layer. The only exception: third-party services you cannot control (payment gateways, SMS providers).

  3. Code is guilty until proven innocent. When a test fails, the APPLICATION code is wrong, not the test. Fix the application, document the fix in .bdd/qa/resolutions/, then re-verify. Never weaken a test to make it pass.

  4. Document everything. Every test run produces findings. Every fix produces a resolution. The .bdd/qa/ directory is the audit trail.
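
Rule 2 can be partly checked mechanically. The following is a minimal sketch, not an exhaustive detector: `findMockUsage` and the `FORBIDDEN` pattern list are hypothetical names introduced here, and the patterns only cover the three tools named in the rule (page.route(), MSW, nock).

```typescript
// Hypothetical guard for hard rule 2: flag mocking calls in step or support code.
// The pattern list is an assumption for illustration, not an exhaustive detector.
interface MockHit {
  line: number;    // 1-based line number in the scanned source
  pattern: string; // label of the forbidden pattern that matched
  text: string;    // trimmed source line
}

const FORBIDDEN: Array<{ label: string; regex: RegExp }> = [
  { label: "page.route()", regex: /\bpage\.route\s*\(/ },
  { label: "MSW import", regex: /from\s+["']msw(\/[^"']*)?["']/ },
  { label: "nock import", regex: /from\s+["']nock["']|require\(\s*["']nock["']\s*\)/ },
];

// Scans one source file's text and returns every line that matches a forbidden pattern.
function findMockUsage(source: string): MockHit[] {
  const hits: MockHit[] = [];
  source.split(/\r?\n/).forEach((text, index) => {
    for (const { label, regex } of FORBIDDEN) {
      if (regex.test(text)) {
        hits.push({ line: index + 1, pattern: label, text: text.trim() });
      }
    }
  });
  return hits;
}
```

A check like this could run over every file under .bdd/steps/ and .bdd/support/ before a test run. Any hit would still need human review, because the one permitted exception (third-party services you cannot control) is not something a regex can recognize.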

Mandatory Directory Structure

.bdd/
  features/              # Gherkin specs (what SHOULD work)
    auth/
      login.feature
  steps/                 # Step definitions (by domain, NOT by feature)
    auth.steps.ts
  interactions/          # Interaction layer (adapts by archetype)
  support/               # Fixtures, hooks, config
    fixtures.ts
    hooks.ts
  qa/
    findings/            # What happened when tests ran
    resolutions/         # What was changed to fix failures

The interactions/ directory adapts to what you're testing:

| Archetype | interactions/ contains | Examples |
|---|---|---|
| Web app | Page Objects | login.page.ts, checkout.page.ts |
| Library/package | Test helpers + fixtures | pika_app.py, docker_fixtures.py |
| REST API | API clients | orders.client.ts, auth.client.ts |
| CLI tool | Command runners | cli.runner.ts, output.parser.ts |
| Message queue | Broker helpers | rabbit.helper.py, test_consumers.py |

Create this structure FIRST before writing any test code.
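
The structure above can be created with a few lines of Node. This is a sketch under stated assumptions: `scaffoldBdd` and `BDD_DIRS` are hypothetical names, the auth/ folder is only the example domain from the tree, and the demo deliberately targets a temporary directory rather than a real project root.

```typescript
import * as fs from "node:fs";
import * as os from "node:os";
import * as path from "node:path";

// Directories from the mandatory structure; auth/ is the example domain shown in the tree.
const BDD_DIRS = [
  "features/auth",
  "steps",
  "interactions",
  "support",
  "qa/findings",
  "qa/resolutions",
];

// Creates the .bdd/ tree under projectRoot and returns the path to .bdd/.
// Safe to re-run: recursive mkdir leaves existing directories untouched.
function scaffoldBdd(projectRoot: string): string {
  const bddRoot = path.join(projectRoot, ".bdd");
  for (const dir of BDD_DIRS) {
    fs.mkdirSync(path.join(bddRoot, dir), { recursive: true });
  }
  return bddRoot;
}

// Demo against a throwaway directory so nothing in a real project is modified.
const demoRoot = fs.mkdtempSync(path.join(os.tmpdir(), "bdd-scaffold-"));
const demoBddRoot = scaffoldBdd(demoRoot);
console.log(fs.readdirSync(demoBddRoot).sort().join(", "));
// prints "features, interactions, qa, steps, support"
```

In a real project the call would be `scaffoldBdd(process.cwd())` from the repository root. The files inside each directory (fixtures.ts, hooks.ts, feature files) are left to the framework setup steps.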

Core Philosophy

  1. BDD is collaboration first, automation second — Feature files are shared artifacts born from Discovery workshops, not test scripts written by developers alone.

  2. Real E2E means zero mocks — Every scenario runs against the actual system with real dependencies. The "end user" varies by archetype: a person in a browser, a developer consuming a library API, an operator running CLI commands. If any dependency is faked, it is not E2E.

  3. Declarative over imperative — Scenarios describe WHAT the system does, not HOW the user clicks. "Given Emma has items in her cart" beats "Given the user clicks the add button 3 times".

Testing Archetypes

The "end user" is whoever consumes your software. BDD describes behavior from THEIR perspective against REAL infrastructure.

| What you test | End user | BDD tool | "Real E2E" means |
|---|---|---|---|
| Web app | Person in a browser | playwright-bdd / Playwright | Real browser, real backend, real DB |
| Library/package | Developer consuming the API | pytest-bdd / Behave | Real deps in Docker, zero mocks |
| REST API | API consumer / frontend | Playwright API / Supertest | Real HTTP, real DB |
| CLI tool | Person running commands | subprocess / pytest-bdd | Real filesystem, real processes |
| Message queue | Service developer | pytest-bdd + real broker | Real RabbitMQ/Kafka in Docker |
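
For the CLI archetype, "real processes" means the interaction layer spawns the actual binary and reads its actual output. Below is a minimal sketch of what a command runner such as interactions/cli.runner.ts might contain. The names `runCli` and `CliResult` are hypothetical, and the demo uses the current Node binary as a stand-in for the CLI under test.

```typescript
import { spawnSync } from "node:child_process";

interface CliResult {
  exitCode: number | null; // null when the process was killed by a signal
  stdout: string;
  stderr: string;
}

// Hypothetical command runner: spawns a real process (no shell, no stubbed output)
// and captures what an operator would see on the terminal.
function runCli(command: string, args: string[], cwd?: string): CliResult {
  const result = spawnSync(command, args, { cwd, encoding: "utf8" });
  if (result.error) {
    throw result.error; // e.g. the binary was not found
  }
  return { exitCode: result.status, stdout: result.stdout, stderr: result.stderr };
}

// Demo: the current Node binary stands in for the CLI under test.
const cliDemo = runCli(process.execPath, [
  "-e",
  "console.log('ready'); process.exitCode = 3",
]);
console.log(cliDemo.exitCode, cliDemo.stdout.trim());
```

Step definitions would then assert on `exitCode` and `stdout` instead of calling child_process directly, which keeps process details out of the steps.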

Verification Workflow

This is the mandatory sequence. Do not skip steps.

| Step | Action | Output |
|---|---|---|
| 1. Write features | Describe expected behavior in Gherkin | .bdd/features/*.feature |
| 2. Implement steps | Wire Given/When/Then to real system interactions | .bdd/steps/*.steps.{ts,py} |
| 3. Run against real system | Execute with real dependencies, zero mocks | Test results |
| 4. Document findings | Record what passed, what failed, evidence | .bdd/qa/findings/*.md |
| 5. Fix the code | Change APPLICATION code, never weaken tests | Source code changes |
| 6. Document resolution | Record what changed, why, verification result | .bdd/qa/resolutions/*.md |
| 7. Re-run and verify | Confirm the fix works, update findings | Updated findings |

See references/qa-workflow.md for findings/resolution format and examples.
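
To make step 4 concrete, here is one way a finding could be rendered as Markdown. This is only a sketch: the authoritative format is the one in references/qa-workflow.md, and the `Finding` fields, the file naming scheme, and the sample values below are all assumptions made for illustration.

```typescript
// Hypothetical shape of a finding; the real format lives in references/qa-workflow.md.
interface Finding {
  id: string;
  feature: string;              // path relative to .bdd/features/
  scenario: string;
  outcome: "passed" | "failed";
  evidence: string;             // what was observed against the real system
}

// Target path under .bdd/qa/findings/ (the naming scheme is an assumption).
function findingPath(finding: Finding, date: string): string {
  return `.bdd/qa/findings/${date}-${finding.id}.md`;
}

// Renders the finding as a small Markdown document.
function renderFinding(finding: Finding, date: string): string {
  return [
    `# Finding ${finding.id}`,
    "",
    `- Date: ${date}`,
    `- Feature: ${finding.feature}`,
    `- Scenario: ${finding.scenario}`,
    `- Outcome: ${finding.outcome}`,
    "",
    "## Evidence",
    "",
    finding.evidence,
    "",
  ].join("\n");
}

// Illustrative values only; not taken from a real test run.
const sampleFinding: Finding = {
  id: "F-001",
  feature: "auth/login.feature",
  scenario: "Login with valid credentials",
  outcome: "failed",
  evidence: "Expected redirect to dashboard; the real backend returned HTTP 500.",
};
console.log(findingPath(sampleFinding, "2024-01-15"));
// prints ".bdd/qa/findings/2024-01-15-F-001.md"
```

A resolution record for step 6 could follow the same pattern under .bdd/qa/resolutions/, linking back to the finding id it resolves.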

Quick-Start

"I need to set up BDD E2E tests for a web app"

| Step | Action |
|---|---|
| 1. Create .bdd/ | Create the mandatory directory structure above |
| 2. Choose framework | New project: playwright-bdd. Existing Cucumber: @cucumber/cucumber + Playwright. See references/playwright-bdd-setup.md |
| 3. Write first feature | Start with a smoke test. Declarative style. See references/gherkin-writing.md |
| 4. Implement steps | Wire to Playwright via Page Objects. See references/step-definitions.md |
| 5. Run and document | Run tests, document findings. See references/qa-workflow.md |
| 6. Set up CI | GitHub Actions pipeline. See references/test-architecture.md |

"I need to test a library, CLI tool, or service (no browser)"

| Step | Action |
|---|---|
| 1. Create .bdd/ | Same structure, but interactions/ holds test helpers and Docker fixtures instead of Page Objects |
| 2. Choose framework | Python: pytest-bdd. TypeScript: playwright-bdd with request fixture (no browser). See references/multi-language-frameworks.md |
| 3. Write first feature | Describe behavior from the developer/operator perspective. See references/gherkin-writing.md |
| 4. Implement steps | Wire to real infrastructure via interaction layer. See references/step-definitions.md |
| 5. Run and document | Run against real deps (Docker), document findings. See references/qa-workflow.md |

"My BDD tests are brittle and hard to maintain"

| Symptom | Fix | Reference |
|---|---|---|
| Steps break when UI/API changes | Move details into interaction layer (Page Objects, helpers) | references/step-definitions.md |
| Feature files read like code | Rewrite declaratively, raise abstraction | references/gherkin-writing.md |
| Steps can't be reused | Organize by domain, use shared step libraries | references/step-definitions.md |
| Scenarios take too long | Bypass UI for setup (API seeding), parallelize | references/test-architecture.md |
| Agent mocks dependencies | HARD RULE VIOLATION. Remove all mocks, use real services/infra | references/step-definitions.md |

Decision Trees

Which framework?

| Signal | Recommendation |
|---|---|
| TypeScript web app | playwright-bdd: native Playwright runner, fixtures, parallel |
| TypeScript API-only (no browser) | playwright-bdd with request fixture: same DX, no browser overhead |
| Existing Cucumber.js ecosystem | @cucumber/cucumber + Playwright: keep existing features/steps |
| Python web app | pytest-bdd + Playwright |
| Python library/service/CLI | pytest-bdd + Docker fixtures (Testcontainers or docker-compose) |
| Java project | Cucumber-JVM or Serenity BDD |
| .NET project | Reqnroll (SpecFlow successor) |
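
The framework table can also be read as a small decision function. The sketch below encodes it under two assumptions that the table does not state: an existing Cucumber.js suite takes precedence over the other TypeScript rows, and "has a browser UI" is the signal that separates web apps from API-only or library projects. `recommendFramework` and `ProjectSignals` are hypothetical names.

```typescript
type ProjectLanguage = "typescript" | "python" | "java" | "dotnet";

interface ProjectSignals {
  language: ProjectLanguage;
  hasBrowserUi: boolean;
  hasExistingCucumberJs?: boolean; // only meaningful for TypeScript projects
}

// Maps project signals to the recommendation in the framework table.
// Precedence of the Cucumber.js row is an assumption, not stated by the table.
function recommendFramework(signals: ProjectSignals): string {
  switch (signals.language) {
    case "typescript":
      if (signals.hasExistingCucumberJs) {
        return "@cucumber/cucumber + Playwright";
      }
      return signals.hasBrowserUi
        ? "playwright-bdd"
        : "playwright-bdd with request fixture";
    case "python":
      return signals.hasBrowserUi
        ? "pytest-bdd + Playwright"
        : "pytest-bdd + Docker fixtures";
    case "java":
      return "Cucumber-JVM or Serenity BDD";
    case "dotnet":
      return "Reqnroll";
  }
}

console.log(recommendFramework({ language: "python", hasBrowserUi: false }));
// prints "pytest-bdd + Docker fixtures"
```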

When does BDD add value?

| Signal | BDD worth it? |
|---|---|
| Non-technical stakeholders review scenarios | Yes: shared language pays off |
| Three Amigos workshops happen regularly | Yes: Discovery drives quality |
| Solo developer, no collaboration | No: overhead without collaboration benefit |
| Rapidly changing UI, stable business rules | Yes: declarative scenarios survive UI rewrites |
| Library/API with complex behavior contracts | Yes: Gherkin documents the contract from the consumer's perspective |

Reference Files

| File | Contents |
|---|---|
| references/qa-workflow.md | Verification workflow, findings format, resolution format, re-verification loop, examples |
| references/playwright-bdd-setup.md | playwright-bdd setup with .bdd/ structure, createBdd(), defineBddConfig, bddgen, fixtures, hooks, complete working example |
| references/cucumber-playwright-traditional.md | Traditional @cucumber/cucumber + Playwright setup with .bdd/ structure, Custom World, hooks, formatters, complete example |
| references/gherkin-writing.md | Declarative Gherkin craft, good/bad pairs, Scenario Outline, Background, tags, Data Tables, anti-patterns |
| references/step-definitions.md | Step patterns, Page Object Model, Screenplay, action classes, no-mocking rules, assertion patterns |
| references/bdd-collaboration.md | Three Amigos, Example Mapping, Feature Mapping, living documentation |
| references/test-architecture.md | 3-layer architecture, .bdd/ project structure, CI/CD pipeline, reporting, parallel execution |
| references/multi-language-frameworks.md | TypeScript, Python, Java, .NET framework selection and migration |
