
Active compression

Active compression is the set of features that shrink content as it enters your context window, rather than after the fact. This page is the map; each feature has its own page with the full detail.

The setup audit reduces the overhead you start a session with. Active compression reduces the overhead you accumulate during the session: redundant file re-reads, verbose command output, and large tool results that survive into compaction. Each feature targets one source of mid-session bloat and each is on by default where it is safe to be.

Three of these features are documented in depth on their own pages. This page exists so you can see them together, understand their defaults at a glance, and learn the single command that manages the whole suite.

| Feature | Default | Targets | Typical saving | Risk | Page |
| --- | --- | --- | --- | --- | --- |
| Read cache (delta mode) | On | Re-reading an unchanged or barely-changed file | A 2,000-token re-read becomes a 50-token diff | Low | Read cache |
| Read cache (structure map) | On (soft-block) | Re-reading a large code file | A 720KB file becomes a 250-token skeleton | Low | Read cache |
| Bash output compression | On | Verbose read-only CLI output | A 564-token pytest run becomes 115 tokens | Low | Bash compression |
| Quality nudges | On | A sudden quality drop you should act on | A timely compact instead of a degraded context | None | Quality signals |
| Loop detection | On | The agent stuck retrying the same thing | Caught before it burns more turns | None | Quality signals |
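To make the delta-mode row concrete, here is a minimal sketch of the general idea: on a re-read, send a diff against the cached copy when the diff is shorter than the file. This is an illustration built on Python's standard `difflib`, not the tool's actual implementation; the function name `reread_payload` and the "unchanged" marker text are assumptions made for the example.

```python
import difflib


def reread_payload(cached: str, current: str, path: str = "file") -> str:
    """Return the cheapest faithful payload for a re-read (hypothetical helper)."""
    if cached == current:
        # Nothing changed: a one-line marker stands in for the whole file.
        return f"[{path} unchanged since last read]"
    diff = "".join(
        difflib.unified_diff(
            cached.splitlines(keepends=True),
            current.splitlines(keepends=True),
            fromfile=f"{path} (cached)",
            tofile=f"{path} (current)",
            n=1,  # one line of context around each change
        )
    )
    # Serve the diff only when it is actually smaller than the file.
    return diff if len(diff) < len(current) else current
```

A one-line change in a long file yields a payload of a few lines, while a file that changed completely is served in full because the diff would be larger than the content.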

One command controls the runtime compression features as a group. The CLI verb is v5, a legacy command name kept for compatibility. It is the verb you type; it is not a version label for the feature.

cd ~/.claude/skills/token-optimizer/scripts
# Show the status of every active compression feature
python3 measure.py v5 status
# Turn one feature on or off
python3 measure.py v5 disable delta_mode
python3 measure.py v5 enable delta_mode
# Full detail for one feature
python3 measure.py v5 info delta_mode
# First-run welcome and summary
python3 measure.py v5 welcome

Feature names accepted by enable and disable: bash_compress, delta_mode, structure_map_beta, quality_nudges, loop_detection. Toggle state persists to config.json.
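If you toggle features from a script, validating the feature name before shelling out catches typos early. The sketch below only builds the argument list for the commands shown above; the helper name `v5_command` and the `SCRIPTS_DIR` constant are assumptions for the example, and the feature names are copied from the list above.

```python
from pathlib import Path

# Feature names accepted by enable and disable, as listed on this page.
FEATURES = (
    "bash_compress",
    "delta_mode",
    "structure_map_beta",
    "quality_nudges",
    "loop_detection",
)

# Directory used in the commands on this page (hypothetical constant name).
SCRIPTS_DIR = Path.home() / ".claude" / "skills" / "token-optimizer" / "scripts"


def v5_command(action: str, feature: str) -> list[str]:
    """Build the argv for `python3 measure.py v5 <action> <feature>`."""
    if action not in ("enable", "disable"):
        raise ValueError(f"unsupported action: {action!r}")
    if feature not in FEATURES:
        raise ValueError(f"unknown feature: {feature!r}")
    return ["python3", "measure.py", "v5", action, feature]
```

To run the result from the same directory as the manual commands, pass it to `subprocess.run(v5_command("disable", "delta_mode"), cwd=SCRIPTS_DIR, check=True)`.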

Every active compression feature is on by default on Claude Code, where the PreToolUse and PostToolUse hooks can intercept tool calls. On platforms with reduced hook support, some features are approximated or unavailable; the capability matrix lists the per-platform reality.

To turn a feature off entirely, use v5 disable as shown above, or set its environment variable for a one-off. The read cache has two variables: TOKEN_OPTIMIZER_READ_CACHE turns off both read-cache behaviors, and TOKEN_OPTIMIZER_READ_CACHE_DELTA turns off delta mode only. Bash compression uses TOKEN_OPTIMIZER_BASH_COMPRESS. Quality nudges and loop detection use TOKEN_OPTIMIZER_QUALITY_NUDGES and TOKEN_OPTIMIZER_LOOP_DETECTION. All are defined in the configuration reference.

Risk is low overall. Each compression feature fails open: when a substitution might lose information the model needs, the feature serves the full content instead. The two warning features (quality nudges and loop detection) only add a note for the model to read, so their risk is none. Per-feature failure modes are documented on each feature's page.
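The fail-open behavior can be pictured as a thin wrapper around any compressor. The sketch below is a generic illustration of the pattern, not the project's code: `fail_open` and `summarize_test_run` are hypothetical names, and the "decline by returning `None`" convention is an assumption made for the example.

```python
from typing import Callable, Optional


def fail_open(compress: Callable[[str], Optional[str]], content: str) -> str:
    """Return the compressed form only when it is safe and smaller."""
    try:
        compact = compress(content)
    except Exception:
        # Any compressor error falls back to the full content.
        return content
    if compact is None or len(compact) >= len(content):
        # The compressor declined, or compression did not help.
        return content
    return compact


def summarize_test_run(output: str) -> Optional[str]:
    """Toy compressor: keep failing lines plus the final summary line."""
    lines = output.splitlines()
    if not lines or "passed" not in lines[-1]:
        return None  # unrecognized format, so decline rather than guess
    kept = [line for line in lines if "FAILED" in line]
    return "\n".join(kept + [lines[-1]])
```

With this shape, the worst case for a compressor that misjudges its input is that the model sees the original, uncompressed content.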

Environment variables: TOKEN_OPTIMIZER_BASH_COMPRESS, TOKEN_OPTIMIZER_READ_CACHE, TOKEN_OPTIMIZER_READ_CACHE_DELTA, TOKEN_OPTIMIZER_STRUCTURE_MAP, TOKEN_OPTIMIZER_QUALITY_NUDGES, TOKEN_OPTIMIZER_LOOP_DETECTION. All are defined in the configuration reference.

Platform support: the full suite runs on Claude Code CLI and VS Code. Support is approximated or partial on Codex, Copilot, Hermes, OpenClaw, and OpenCode, depending on each platform’s hook capabilities. See the capability matrix.