TDD as Pipeline Architecture

Knights write failing tests. Warriors write minimal code to pass them. The pipeline enforces the split structurally.

What it is

The Knight writes failing tests (RED). The Warrior writes the minimal code to pass them (GREEN). The pipeline enforces this split — not as a convention, but as a structural constraint.

Where it lives

File	Lines	Purpose
`workflow/standard.py`	37-89	The canonical `standard_build()` factory
`agents/knight/`	—	"You define what 'correct' means before anyone writes a line of implementation."
`agents/warrior/`	—	"You receive a contract and make it pass with the minimum correct implementation."
`engine/gates.py`	—	`test_pass` gate runs pytest, returns pass/fail

How the pipeline enforces it

Knight stage runs before Warrior stage (depends_on=["knight"] in standard_build())
Knight's gate is completion — wrote some tests, they exist on disk
Warrior's gate is test_pass — Knight's tests run green
Warrior has max_iterations=3 — three attempts to turn RED into GREEN before the pipeline escalates
The Warrior never modifies test files. This is a structural constraint enforced by the agent's toolset and its structural prompt — not a convention.

Framework	TDD stance	Enforcement
LangGraph	Recommends TDD	Nothing enforces it. Code-first developers skip it silently.
CrewAI	No TDD stance	N/A
Bonfire	TDD is architecture	Pipeline physically cannot advance Knight → Warrior without tests on disk. Cannot advance Warrior → Prover without `test_pass` green.

The Knight-Warrior split is a protocol, not a practice. Warriors see Knight's test file contents as read-only context; modifying test files is a tool-surface-level denial.

The bounce-back loops

Two structural loops enforce quality after the Knight-Warrior handoff:

Prover → Warrior — The Prover re-runs the tests independently. If a regression appears, the Warrior retries. The Prover is an independent auditor; the Warrior cannot skip it.
Wizard → Warrior — The Wizard reviews the PR. On rejection (quality, style, architecture), the Warrior reworks. The PR earns "Wizard-approved" status only when the Wizard passes it.

Both loops have a max iteration count. After exhaustion, the pipeline escalates rather than looping forever.

TDD as Pipeline Architecture

What it is

Where it lives

How the pipeline enforces it

Why this is structurally different from "we recommend TDD"

The bounce-back loops

On this page