Context Pruning

Dynamic context pruning removes redundant tool results before each inference call, keeping the context window within model limits without losing critical history.

Context pruning is enabled by default: it stays on when the [options.pruning] table, or the enabled key within it, is omitted. Set enabled = false to disable pruning while keeping the rest of the pruning settings in place for later.

[options.pruning]
enabled = true
dedup_by_signature = true
supersede_writes = true
purge_errors = true
purge_errors_turns = 2
turn_protection_turns = 3
protected_tools = ["view"]
protected_file_patterns = ["*.md"]
| Option | Default | Description |
| --- | --- | --- |
| enabled | true | Enable context pruning; explicit false disables |
| dedup_by_signature | true | Keep only the latest result for each tool+input combination; explicit false disables |
| supersede_writes | true | Drop write-tool input after a later read of the same file; explicit false disables |
| purge_errors | true | Drop input for failed tool calls after purge_errors_turns turns; explicit false disables |
| purge_errors_turns | 5 | Number of turns after which failed tool input is dropped; 0 uses the default |
| turn_protection_turns | 3 | Never prune results from the N most recent turns; 0 uses the default |
| protected_tools | [] | Tool names never pruned regardless of other rules |
| protected_file_patterns | [] | File path globs never pruned |
| stale_file_eviction.enabled | true | Replace stale file-read results with compact stubs under context pressure; explicit false disables |
| ast_compress.enabled | true | Replace semi-stale file-read results with signatures-only compressed forms under context pressure; explicit false disables |
| type_inject_for_read | false | Append type-inject block to view/read responses when available |
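The defaulting rules in the option table (an omitted key falls back to its default, an explicit false disables, and 0 means "use the default" for the two turn counts) can be sketched as follows. This is a minimal illustration and not BMO's actual loader: `resolve_pruning_options`, `DEFAULTS`, and `ZERO_MEANS_DEFAULT` are hypothetical names, the input is assumed to be an already-parsed [options.pruning] table, and the nested stale_file_eviction and ast_compress keys are left out for brevity.

```python
from __future__ import annotations

import copy
from typing import Any

# Default values copied from the option table; the resolver is hypothetical.
DEFAULTS: dict[str, Any] = {
    "enabled": True,
    "dedup_by_signature": True,
    "supersede_writes": True,
    "purge_errors": True,
    "purge_errors_turns": 5,
    "turn_protection_turns": 3,
    "protected_tools": [],
    "protected_file_patterns": [],
}

# Integer options documented as "0 uses the default".
ZERO_MEANS_DEFAULT = ("purge_errors_turns", "turn_protection_turns")


def resolve_pruning_options(raw: dict[str, Any] | None) -> dict[str, Any]:
    """Merge a parsed [options.pruning] table (None if omitted) with defaults."""
    resolved = copy.deepcopy(DEFAULTS)
    resolved.update(raw or {})
    for key in ZERO_MEANS_DEFAULT:
        if resolved[key] == 0:
            resolved[key] = DEFAULTS[key]
    return resolved
```

Under these assumptions, `resolve_pruning_options(None)` yields the fully enabled defaults, while `resolve_pruning_options({"enabled": False})` turns pruning off and leaves every other setting at its default.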

Pruning runs immediately before each inference call. It does not modify the stored message history; it only affects what is included in the context window for the current turn. As a result, pruned messages can re-enter context in future turns if the pruning rules change.
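To illustrate pruning as a projection over stored history, here is a minimal sketch of signature-based deduplication combined with turn protection and protected tools. It is not BMO's implementation: `ToolResult`, `signature`, and `project_context` are hypothetical names, the signature is assumed to be the tool name plus its JSON-serialized input, and the "N most recent turns" are assumed to include the current turn.

```python
from __future__ import annotations

import json
from dataclasses import dataclass


@dataclass
class ToolResult:
    turn: int         # turn in which the tool call ran
    tool: str         # tool name, e.g. "view"
    tool_input: dict  # arguments passed to the tool
    output: str       # tool result text


def signature(result: ToolResult) -> str:
    """Assumed signature: tool name plus canonical JSON of its input."""
    return result.tool + ":" + json.dumps(result.tool_input, sort_keys=True)


def project_context(
    history: list[ToolResult],
    current_turn: int,
    turn_protection_turns: int = 3,
    protected_tools: tuple[str, ...] = (),
) -> list[ToolResult]:
    """Return the results to send this turn; `history` is never mutated."""
    first_protected_turn = current_turn - turn_protection_turns + 1
    # Index of the newest result for each tool+input signature.
    latest = {signature(r): i for i, r in enumerate(history)}
    kept = []
    for i, result in enumerate(history):
        is_recent = result.turn >= first_protected_turn
        is_protected = result.tool in protected_tools
        is_latest = latest[signature(result)] == i
        if is_recent or is_protected or is_latest:
            kept.append(result)
    return kept
```

Because the function returns a new list and leaves `history` untouched, calling it again with different settings (for example, adding a tool to `protected_tools`) brings previously dropped results back into the projection.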

After pruning and prompt assembly, BMO runs a final provider-budget preflight that accounts for the system prompt, retained messages, tool schemas, max-output headroom, model limit, and safety buffer. If the envelope is still too large, BMO compacts the session once, rebuilds the prompt, and retries. If recovery is exhausted, the run reports prompt_budget_exceeded as context pressure rather than as a quota or generic provider failure.
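The preflight-and-retry flow can be sketched roughly as below. This is an assumption-laden illustration rather than BMO's code: `preflight`, `PromptBudgetExceeded`, and the two callbacks are hypothetical, token counts are plain integers, and the budget is assumed to be the model limit minus max-output headroom minus the safety buffer.

```python
from __future__ import annotations

from typing import Callable


class PromptBudgetExceeded(Exception):
    """Hypothetical error carrying the prompt_budget_exceeded code."""

    code = "prompt_budget_exceeded"


def preflight(
    build_prompt: Callable[[], int],
    compact_session: Callable[[], None],
    model_limit: int,
    max_output_tokens: int,
    safety_buffer: int,
    max_compactions: int = 1,
) -> int:
    """Return the prompt token count if it fits, compacting at most once.

    build_prompt() is assumed to return the token count of the system
    prompt, retained messages, and tool schemas for the current session.
    """
    budget = model_limit - max_output_tokens - safety_buffer
    for attempt in range(max_compactions + 1):
        prompt_tokens = build_prompt()
        if prompt_tokens <= budget:
            return prompt_tokens
        if attempt < max_compactions:
            compact_session()  # shrink the session, then rebuild and retry
    raise PromptBudgetExceeded(
        f"prompt needs {prompt_tokens} tokens but only {budget} are available"
    )
```

Raising a dedicated error type keeps this condition distinguishable from quota or transport failures, which matches the reporting behavior described above.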

How pruning is applied
1. User prompt enters BMO. The user asks: "Summarize the current auth changes and suggest the safest next patch." BMO starts from the stored transcript, tool events, workspace signals, and current model budget.
2. Prompt context is shaped. Recent turns stay protected. Duplicate reads collapse to the newest matching result. Superseded write inputs can be removed after later reads confirm file state. Protected tools and protected file patterns stay included.
3. Submitted prompt is bounded. The final model request contains the system prompt, current user prompt, retained recent turns, retained tool evidence, and any adaptive context appendages. Pruned history remains stored, but is not sent in this inference window.
User input → Transcript snapshot → Pruning rules → Budget check → Provider prompt
Pruning is a per-inference projection. It changes the submitted prompt shape, not the durable session history.
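The superseded-write rule from step 2 can be sketched as follows. This is a simplified illustration under stated assumptions: `ToolEvent`, `STUB`, and `supersede_writes` are hypothetical names, a write counts as superseded when any later read targets the same path, the dropped input is replaced by a short stub, and turn protection is ignored to keep the example small.

```python
from __future__ import annotations

from dataclasses import dataclass, replace


@dataclass(frozen=True)
class ToolEvent:
    kind: str     # "read" or "write"
    path: str     # file the tool touched
    payload: str  # write input, or read result


# Hypothetical placeholder left where a write input was dropped.
STUB = "[write input pruned: superseded by later read]"


def supersede_writes(events: list[ToolEvent]) -> list[ToolEvent]:
    """Return a projection where superseded write inputs become stubs."""
    # Index of the last read seen for each path.
    last_read = {e.path: i for i, e in enumerate(events) if e.kind == "read"}
    projected = []
    for i, event in enumerate(events):
        if event.kind == "write" and last_read.get(event.path, -1) > i:
            projected.append(replace(event, payload=STUB))
        else:
            projected.append(event)
    return projected
```

A write with no later read of the same file keeps its full input, since nothing else in the prompt yet reflects the resulting file state.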

For most long-running sessions, a configuration like this is a reasonable starting point:

[options.pruning]
enabled = true
dedup_by_signature = true
supersede_writes = true
turn_protection_turns = 3

This removes duplicate reads and superseded writes while always preserving the 3 most recent turns in full.

Adaptive Context selects which enhancer content (git, recent files, memory, etc.) to include each turn using run-outcome feedback; it complements pruning by focusing on the prepended context block rather than conversation history.

Context window strip
Older duplicate read: Pruned when a newer matching read is present.
Superseded write: Input can be removed after a later read confirms the file state.
Protected recent turns: Always preserved inside the configured turn-protection window.
Protected tools/files: Never pruned when allowlisted as critical context.
Pruning changes what is sent to the model for the current turn; it does not delete stored history.
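As a rough sketch of how the protected tools and protected file patterns allowlists might be checked, the helper below uses shell-style globbing from Python's standard library. The exact glob semantics BMO applies are not stated here, so the matching rules (case-sensitive, tested against both the full path and the file name) are assumptions, and `is_protected` is a hypothetical name.

```python
from __future__ import annotations

from fnmatch import fnmatchcase
from pathlib import PurePosixPath


def is_protected(
    tool: str,
    path: str | None,
    protected_tools: list[str],
    protected_file_patterns: list[str],
) -> bool:
    """Return True if a tool result should never be pruned."""
    if tool in protected_tools:
        return True
    if path is None:
        return False  # no file involved, so file patterns cannot apply
    name = PurePosixPath(path).name
    return any(
        fnmatchcase(path, pattern) or fnmatchcase(name, pattern)
        for pattern in protected_file_patterns
    )
```

With protected_tools = ["view"] and protected_file_patterns = ["*.md"], as in the first configuration example, a view call is always kept, and so is any read of a Markdown file such as docs/README.md.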