Prompt Compilation

Priority-based context management with U-shape ordering. The one structural differentiator no competitor ships.

What it is

Every piece of context the framework hands to an agent is wrapped in a PromptBlock with an explicit priority level. When the compiled prompt would exceed the model's context window, the compiler drops the lowest-priority blocks first. The highest-priority block is never dropped.

Where it lives

File	Lines	Purpose
`prompt/compiler.py`	516	Block compiler + layout engine
`prompt/truncation.py`	161	Truncation algorithm
`prompt/axiom_meta.py`	—	Axiom metadata attached to each block
`prompt/templates/`	—	Jinja2 prompt templates per agent role

The algorithm

From truncation.py:truncate_blocks:

Sort blocks by priority ascending (lowest first = drop candidates).
While total estimated tokens exceed budget, drop from the front of the sorted list.
If the last surviving block still exceeds budget, character-slice its content.
Return survivors in their original insertion order.

U-shape ordering

From truncation.py:order_by_position:

The U-shape algorithm exploits the well-documented primacy/recency bias in transformer attention: models attend most to the beginning and end of context, with a "lost in the middle" valley. By placing the highest-priority block first and second-highest last, Bonfire maximizes attention on what matters.

Why no competitor ships this

As of April 2026:

Framework	Prompt management	Priority semantics
CrewAI	Raw templated context	None
LangGraph	`ChatPromptTemplate`, position-based	None
AutoGen	"Summarize if over context" fallback	None (lossy, unpredictable)
Semantic Kernel	`PromptTemplateConfig`, position-based	None
Bonfire	`PromptBlock` with explicit priority	Yes — enforced at block-construction seam

The gap is not "nobody thought of it." The gap is "nobody built the infrastructure to make priority semantics meaningful." Priority truncation only works if every consumer of PromptBlock tags priority at write time. Bonfire's axiom metadata system enforces this at the block-construction seam.

Known limitation

The estimate_tokens() function uses a chars/4 heuristic, not a real tokenizer. A 15% margin in effective_budget() compensates, but the sign of the error is not derivable from the heuristic alone — chars/4 over-estimates for code-heavy prompts nearly as often as it under-estimates.

When cost receipts are toggled on, this limitation is labeled honestly inside the receipt surface itself: tokens: chars/4 heuristic +15% margin. A future upgrade to real tokenizer counts is planned.