Architecture
TDD as Pipeline Architecture
Knights write failing tests. Warriors write minimal code to pass them. The pipeline enforces the split structurally.
What it is
The Knight writes failing tests (RED). The Warrior writes the minimal code to pass them (GREEN). The pipeline enforces this split — not as a convention, but as a structural constraint.
Where it lives
| File | Lines | Purpose |
|---|---|---|
| workflow/standard.py | 37-89 | The canonical standard_build() factory |
| agents/knight/ | — | "You define what 'correct' means before anyone writes a line of implementation." |
| agents/warrior/ | — | "You receive a contract and make it pass with the minimum correct implementation." |
| engine/gates.py | — | test_pass gate runs pytest, returns pass/fail |
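The table says the test_pass gate runs pytest and returns pass/fail. A minimal sketch of that idea follows; it is not the contents of engine/gates.py. The names `GateResult` and `run_test_pass_gate`, and the injectable `run` parameter, are assumptions made for illustration.

```python
import subprocess
import sys
from dataclasses import dataclass


@dataclass(frozen=True)
class GateResult:
    """Hypothetical result type: a boolean verdict plus the tail of the output."""
    passed: bool
    detail: str


def run_test_pass_gate(test_dir: str, run=subprocess.run) -> GateResult:
    """Run pytest against test_dir and reduce the outcome to pass/fail.

    `run` is injectable so the gate can be exercised without spawning pytest.
    """
    proc = run(
        [sys.executable, "-m", "pytest", "-q", test_dir],
        capture_output=True,
        text=True,
    )
    # pytest exits with 0 only when tests were collected and all of them passed.
    return GateResult(passed=proc.returncode == 0, detail=proc.stdout[-2000:])
```

Reducing the result to the exit code keeps the gate binary: a failing test, a collection error, or an empty test directory all count as "not green" in this sketch.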
How the pipeline enforces it
- Knight stage runs before Warrior stage (`depends_on=["knight"]` in `standard_build()`)
- Knight's gate is `completion`: wrote some tests, they exist on disk
- Warrior's gate is `test_pass`: Knight's tests run green
- Warrior has `max_iterations=3`: three attempts to turn RED into GREEN before the pipeline escalates
- The Warrior never modifies test files. This is a structural constraint enforced by the agent's toolset and its structural prompt, not a convention.
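The ordering and gate rules above can be sketched as plain data. This is an illustration under stated assumptions, not the real standard_build() in workflow/standard.py: the `Stage` dataclass, `standard_build_sketch`, and `runnable_stages` are hypothetical names. Only the stage names, gate names, `depends_on`, and `max_iterations=3` come from the text.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Stage:
    """Hypothetical stage record; the field names mirror the terms used above."""
    name: str
    gate: str
    depends_on: tuple[str, ...] = ()
    max_iterations: int = 1


def standard_build_sketch() -> list[Stage]:
    """Illustrative stand-in for the Knight and Warrior part of the pipeline."""
    return [
        Stage(name="knight", gate="completion"),
        Stage(
            name="warrior",
            gate="test_pass",
            depends_on=("knight",),
            max_iterations=3,
        ),
    ]


def runnable_stages(stages: list[Stage], passed: set[str]) -> list[str]:
    """Names of stages not yet passed whose dependencies have all cleared their gates."""
    return [
        s.name
        for s in stages
        if s.name not in passed and all(dep in passed for dep in s.depends_on)
    ]
```

With nothing passed, only the Knight is runnable; the Warrior becomes runnable once "knight" is in the passed set. In this sketch the ordering falls out of the data, so there is no code path that starts the Warrior first.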
Why this is structurally different from "we recommend TDD"
| Framework | TDD stance | Enforcement |
|---|---|---|
| LangGraph | Recommends TDD | Nothing enforces it. Code-first developers skip it silently. |
| CrewAI | No TDD stance | N/A |
| Bonfire | TDD is architecture | Pipeline physically cannot advance Knight → Warrior without tests on disk. Cannot advance Warrior → Prover without test_pass green. |
The Knight-Warrior split is a protocol, not a practice. Warriors receive the Knight's test files as read-only context; any attempt to modify a test file is denied at the tool-surface level.
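One way to picture a tool-surface-level denial is a file tool that can read anything but refuses writes to test paths. The sketch below is hypothetical: `WarriorFileTools`, `ToolDenied`, and the path rules in `is_test_file` are assumptions for illustration and do not describe Bonfire's actual toolset.

```python
from __future__ import annotations

from pathlib import PurePosixPath


class ToolDenied(Exception):
    """Raised when a tool call falls outside the agent's permitted surface."""


def is_test_file(path: str) -> bool:
    """Assumed convention: anything under tests/ or named like a pytest module."""
    p = PurePosixPath(path)
    return (
        "tests" in p.parts
        or p.name.startswith("test_")
        or p.name.endswith("_test.py")
    )


class WarriorFileTools:
    """In-memory stand-in for a file toolset: full read access, guarded writes."""

    def __init__(self, files: dict[str, str]) -> None:
        self._files = dict(files)

    def read(self, path: str) -> str:
        return self._files[path]

    def write(self, path: str, content: str) -> None:
        if is_test_file(path):
            # The denial happens in the tool itself, not in the prompt.
            raise ToolDenied(f"write to test file denied: {path}")
        self._files[path] = content
```

Because the check lives inside `write`, the agent cannot reach a code path that edits a test, whatever its prompt says.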
The bounce-back loops
Two structural loops enforce quality after the Knight-Warrior handoff:
- Prover → Warrior — The Prover re-runs the tests independently. If a regression appears, the Warrior retries. The Prover is an independent auditor; the Warrior cannot skip it.
- Wizard → Warrior — The Wizard reviews the PR. On rejection (quality, style, architecture), the Warrior reworks. The PR earns "Wizard-approved" status only when the Wizard passes it.
Both loops have a max iteration count. After exhaustion, the pipeline escalates rather than looping forever.
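Both loops share one shape: review, rework on rejection, escalate when the budget runs out. A minimal sketch, assuming `max_iterations` counts rework attempts (the text does not say how the count is interpreted); `bounce_back` and `Escalation` are hypothetical names.

```python
from typing import Callable


class Escalation(Exception):
    """Raised when the rework budget is spent without an approval."""


def bounce_back(
    review: Callable[[str], bool],
    rework: Callable[[str], str],
    work: str,
    max_iterations: int,
) -> str:
    """Return the approved work, or raise Escalation after max_iterations reworks."""
    if review(work):
        return work
    for _ in range(max_iterations):
        work = rework(work)  # the Warrior's retry
        if review(work):  # the Prover or Wizard checks again
            return work
    raise Escalation(f"no approval after {max_iterations} rework attempts")
```

In this sketch the same function could model either loop: `review` stands in for the Prover's test re-run or the Wizard's PR review, and `rework` stands in for the Warrior's retry.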