TDD as Pipeline Architecture

Knights write failing tests. Warriors write minimal code to pass them. The pipeline enforces the split structurally.

What it is

The Knight writes failing tests (RED). The Warrior writes the minimal code to pass them (GREEN). The pipeline enforces this split — not as a convention, but as a structural constraint.

Where it lives

| File | Lines | Purpose |
|---|---|---|
| workflow/standard.py | 37-89 | The canonical standard_build() factory |
| agents/knight/ | | "You define what 'correct' means before anyone writes a line of implementation." |
| agents/warrior/ | | "You receive a contract and make it pass with the minimum correct implementation." |
| engine/gates.py | | test_pass gate runs pytest, returns pass/fail |
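
As a rough illustration of what a gate that "runs pytest, returns pass/fail" can look like, here is a minimal sketch. The function name `gate_test_pass`, its parameters, and the injectable `runner` are assumptions made for this example, not the actual contents of engine/gates.py. The only external fact relied on is that pytest exits with code 0 when every collected test passes and with a non-zero code otherwise (including code 5 when no tests are collected).

```python
import subprocess
import sys
from typing import Callable, Sequence

# Hypothetical type: a runner takes a command line and returns its exit code.
Runner = Callable[[Sequence[str]], int]


def _run_command(cmd: Sequence[str]) -> int:
    """Run the command in a subprocess and return its exit code."""
    return subprocess.run(cmd, check=False).returncode


def gate_test_pass(test_path: str, runner: Runner = _run_command) -> bool:
    """Return True only when pytest exits with code 0.

    Any non-zero exit code (failed tests, collection errors, or no tests
    collected) is treated as a failed gate.
    """
    cmd = [sys.executable, "-m", "pytest", "-q", test_path]
    return runner(cmd) == 0
```

Passing the runner in as a parameter keeps the gate itself testable without a real test suite on disk; a pipeline would call it with the default runner.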

How the pipeline enforces it

  • Knight stage runs before Warrior stage (depends_on=["knight"] in standard_build())
  • The Knight's gate is completion: the tests have been written and exist on disk
  • The Warrior's gate is test_pass: the Knight's tests run green
  • Warrior has max_iterations=3 — three attempts to turn RED into GREEN before the pipeline escalates
  • The Warrior never modifies test files. This is a structural constraint enforced by the agent's toolset and its structural prompt — not a convention.
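
The ordering rules above can be sketched as plain data plus one dependency check. This is a simplified illustration, not the code in workflow/standard.py: the `Stage` dataclass and the `can_start` helper are hypothetical, and only the values `depends_on=["knight"]`, the gate names, and `max_iterations=3` come from the text.

```python
from dataclasses import dataclass, field


@dataclass
class Stage:
    """Hypothetical stage record; the real Bonfire type may differ."""
    name: str
    gate: str
    depends_on: list[str] = field(default_factory=list)
    max_iterations: int = 1


def standard_build() -> list[Stage]:
    """Simplified stand-in for the factory: Knight first, then Warrior."""
    return [
        Stage(name="knight", gate="completion"),
        Stage(
            name="warrior",
            gate="test_pass",
            depends_on=["knight"],
            max_iterations=3,
        ),
    ]


def can_start(stage: Stage, passed: set[str]) -> bool:
    """A stage may start only once every dependency has passed its gate."""
    return all(dep in passed for dep in stage.depends_on)
```

Under these assumptions, `can_start` returns False for the Warrior until "knight" is in the set of passed stages, which is the structural ordering the list describes.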

Why this is structurally different from "we recommend TDD"

| Framework | TDD stance | Enforcement |
|---|---|---|
| LangGraph | Recommends TDD | Nothing enforces it. Code-first developers skip it silently. |
| CrewAI | No TDD stance | N/A |
| Bonfire | TDD is architecture | Pipeline physically cannot advance Knight → Warrior without tests on disk. Cannot advance Warrior → Prover without test_pass green. |

The Knight-Warrior split is a protocol, not a practice. Warriors receive the contents of the Knight's test files as read-only context; any attempt to modify a test file is denied at the tool level, before it reaches the disk.

The bounce-back loops

Two structural loops enforce quality after the Knight-Warrior handoff:

  1. Prover → Warrior — The Prover re-runs the tests independently. If a regression appears, the Warrior retries. The Prover is an independent auditor; the Warrior cannot skip it.
  2. Wizard → Warrior — The Wizard reviews the PR. On rejection (quality, style, architecture), the Warrior reworks. The PR earns "Wizard-approved" status only when the Wizard passes it.

Both loops have a maximum iteration count. Once it is exhausted, the pipeline escalates rather than looping forever.
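
A bounded bounce-back loop of this kind can be sketched generically. The names `bounce_back` and `Escalation` are hypothetical, as are the `review` and `rework` callables, which stand in for the Prover or Wizard and for the Warrior respectively; how Bonfire actually counts iterations is not specified here, so this version counts rework attempts.

```python
from typing import Callable


class Escalation(Exception):
    """Raised when the loop runs out of iterations without approval."""


def bounce_back(
    work: str,
    review: Callable[[str], bool],
    rework: Callable[[str], str],
    max_iterations: int,
) -> str:
    """Return approved work, or escalate after max_iterations reworks."""
    if review(work):
        return work
    for _ in range(max_iterations):
        # The reviewer rejected the work, so it bounces back for rework.
        work = rework(work)
        if review(work):
            return work
    raise Escalation(f"not approved after {max_iterations} rework attempts")
```

Raising an exception on exhaustion, rather than returning the last attempt, makes it impossible for unapproved work to flow downstream by accident.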
