Bonfire
Architecture

Prompt Compilation

Priority-based context management with U-shape ordering. The one structural differentiator no competitor ships.

What it is

Every piece of context the framework hands to an agent is wrapped in a PromptBlock with an explicit priority level. When the compiled prompt would exceed the model's context window, the compiler drops the lowest-priority blocks first. The highest-priority block is never dropped.

Where it lives

FileLinesPurpose
prompt/compiler.py516Block compiler + layout engine
prompt/truncation.py161Truncation algorithm
prompt/axiom_meta.pyAxiom metadata attached to each block
prompt/templates/Jinja2 prompt templates per agent role

The algorithm

From truncation.py:truncate_blocks:

  1. Sort blocks by priority ascending (lowest first = drop candidates).
  2. While total estimated tokens exceed budget, drop from the front of the sorted list.
  3. If the last surviving block still exceeds budget, character-slice its content.
  4. Return survivors in their original insertion order.

U-shape ordering

From truncation.py:order_by_position:

The U-shape algorithm exploits the well-documented primacy/recency bias in transformer attention: models attend most to the beginning and end of context, with a "lost in the middle" valley. By placing the highest-priority block first and second-highest last, Bonfire maximizes attention on what matters.

Why no competitor ships this

As of April 2026:

FrameworkPrompt managementPriority semantics
CrewAIRaw templated contextNone
LangGraphChatPromptTemplate, position-basedNone
AutoGen"Summarize if over context" fallbackNone (lossy, unpredictable)
Semantic KernelPromptTemplateConfig, position-basedNone
BonfirePromptBlock with explicit priorityYes — enforced at block-construction seam

The gap is not "nobody thought of it." The gap is "nobody built the infrastructure to make priority semantics meaningful." Priority truncation only works if every consumer of PromptBlock tags priority at write time. Bonfire's axiom metadata system enforces this at the block-construction seam.

Known limitation

The estimate_tokens() function uses a chars/4 heuristic, not a real tokenizer. A 15% margin in effective_budget() compensates, but the sign of the error is not derivable from the heuristic alone — chars/4 over-estimates for code-heavy prompts nearly as often as it under-estimates.

When cost receipts are toggled on, this limitation is labeled honestly inside the receipt surface itself: tokens: chars/4 heuristic +15% margin. A future upgrade to real tokenizer counts is planned.

On this page