Configuration

Token Optimizer reads configuration from two places: environment variables and a config.json file. Environment variables take precedence: when both define the same setting, the environment variable wins for that run, and config.json holds the persistent default.

config.json lives at ~/.claude/token-optimizer/config.json. It stores feature flags, consent status, the pricing tier, and timestamps. The v5 command and the various setup-* and consent commands write to it. Most users never edit it by hand; the commands keep it consistent.

The five active compression features each have a config.json key and an environment override. Setting the environment variable to 0 disables the feature for that run regardless of the persisted value.

| Variable | Default | Effect |
| --- | --- | --- |
| TOKEN_OPTIMIZER_BASH_COMPRESS | enabled | Set to 0 to disable Bash output compression for eligible read-only commands |
| TOKEN_OPTIMIZER_READ_CACHE | enabled | Set to 0 to disable the read cache entirely (no delta, no skeleton substitution) |
| TOKEN_OPTIMIZER_READ_CACHE_DELTA | enabled | Set to 0 to disable delta mode only (re-reads return full content instead of a diff) |
| TOKEN_OPTIMIZER_READ_CACHE_MODE | soft_block | Select the read-cache mode: soft_block, warn, shadow, or block. See the modes table below. |
| TOKEN_OPTIMIZER_STRUCTURE_MAP | soft-block only | Set to beta to enable structure-map compression event logging. This is a separate measurement toggle, not the mode selector above. |
| TOKEN_OPTIMIZER_QUALITY_NUDGES | enabled | Set to 0 to disable the quality-drop alerter |
| TOKEN_OPTIMIZER_LOOP_DETECTION | enabled | Set to 0 to disable the retry-loop detector |
| TOKEN_OPTIMIZER_FIRST_READ_ACTIVE | enabled | Set to 0 to disable active first-read skeleton serving for promoted cohorts |

The matching config.json keys are: v5_bash_compress, read_cache_enabled, v5_delta_mode, v5_structure_map_beta, v5_quality_nudges, and v5_loop_detection. Manage them with measure.py v5 enable|disable <feature> rather than editing the file.
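
As a rough illustration of how an environment override and a persisted key could combine, the sketch below resolves a single feature flag. It is not the engine's code. The pairing of each variable with its key is inferred from the names listed above, the helper names (load_config, feature_enabled) are hypothetical, and treating any value other than 0 as "defer to config.json" is an assumption. TOKEN_OPTIMIZER_STRUCTURE_MAP is left out because it takes beta rather than 0.

```python
import json
import os
from pathlib import Path

CONFIG_PATH = Path.home() / ".claude" / "token-optimizer" / "config.json"

# Environment variable -> config.json key (pairing inferred from the names).
FEATURE_KEYS = {
    "TOKEN_OPTIMIZER_BASH_COMPRESS": "v5_bash_compress",
    "TOKEN_OPTIMIZER_READ_CACHE": "read_cache_enabled",
    "TOKEN_OPTIMIZER_READ_CACHE_DELTA": "v5_delta_mode",
    "TOKEN_OPTIMIZER_QUALITY_NUDGES": "v5_quality_nudges",
    "TOKEN_OPTIMIZER_LOOP_DETECTION": "v5_loop_detection",
}


def load_config(path=CONFIG_PATH):
    """Return the persisted settings, or {} if the file is missing or invalid."""
    try:
        data = json.loads(Path(path).read_text(encoding="utf-8"))
    except (OSError, ValueError):
        return {}
    return data if isinstance(data, dict) else {}


def feature_enabled(env_name, config=None, environ=None):
    """Resolve one feature flag for the current run."""
    config = load_config() if config is None else config
    environ = os.environ if environ is None else environ
    # Documented rule: 0 disables the feature regardless of the persisted value.
    if environ.get(env_name) == "0":
        return False
    # Assumption: any other value defers to config.json; a missing key
    # falls back to the documented default of "enabled".
    return bool(config.get(FEATURE_KEYS[env_name], True))
```

Passing config and environ explicitly, as in feature_enabled("TOKEN_OPTIMIZER_READ_CACHE", config={}, environ={}), keeps the function easy to exercise without touching the real file or process environment.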

The read cache operates in one of four modes, selected with TOKEN_OPTIMIZER_READ_CACHE_MODE (for example TOKEN_OPTIMIZER_READ_CACHE_MODE=shadow). The default is soft_block.

| Mode | Behavior |
| --- | --- |
| soft_block | Default. On an unchanged re-read, substitute a compact result (delta or structure map) instead of the full file. Always allows the read to proceed. |
| warn | Allow the full re-read, but emit a warning that the file was already read. No substitution. |
| shadow | Measure what substitution would have saved without changing the result the model sees. Used for validation. |
| block | Refuse the redundant re-read outright. Most aggressive; available on platforms that support read interception (for example OpenClaw). |
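
To make the difference between the four modes concrete, here is a minimal sketch of how an unchanged re-read might be dispatched. Everything in it is illustrative: the function names are hypothetical, falling back to soft_block on an unrecognized value is an assumption, and savings are counted in characters only to keep the example self-contained.

```python
import os

READ_CACHE_MODES = ("soft_block", "warn", "shadow", "block")
DEFAULT_MODE = "soft_block"


def read_cache_mode(environ=None):
    """Read TOKEN_OPTIMIZER_READ_CACHE_MODE, defaulting to soft_block."""
    environ = os.environ if environ is None else environ
    raw = environ.get("TOKEN_OPTIMIZER_READ_CACHE_MODE", "").strip().lower()
    # Assumption: unknown or empty values fall back to the default mode.
    return raw if raw in READ_CACHE_MODES else DEFAULT_MODE


def handle_reread(mode, full_content, compact_content):
    """Return (content shown to the model, note) for an unchanged re-read."""
    if mode == "soft_block":
        # Substitute the compact result; the read itself still succeeds.
        return compact_content, "substituted compact result"
    if mode == "warn":
        return full_content, "warning: file was already read"
    if mode == "shadow":
        # Measure only; the model still sees the full content.
        saved = len(full_content) - len(compact_content)
        return full_content, f"shadow: would have saved {saved} characters"
    if mode == "block":
        return None, "blocked redundant re-read"
    raise ValueError(f"unknown read-cache mode: {mode!r}")
```

Note that only block withholds content entirely, and only soft_block changes what the model sees while still letting the read through.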

| Variable | Default | Effect |
| --- | --- | --- |
| TOKEN_OPTIMIZER_CONTEXT_SIZE | auto-detected | Override the assumed context window size for all commands. Equivalent to the --context-size flag. |
| TOKEN_OPTIMIZER_QUALITY_WINDOW | 20 | Rolling window size for ratio-based quality signals |
| TOKEN_OPTIMIZER_TOOL_CALL_WARN | auto | Tool-call warning threshold; scales with the context window |
| TOKEN_OPTIMIZER_TOOL_CALL_CRITICAL | auto | Tool-call critical threshold; scales with the context window |
| TOKEN_OPTIMIZER_RELEVANCE_THRESHOLD | 0.3 | Minimum relevance score for a checkpoint to be restored |

| Variable | Default | Effect |
| --- | --- | --- |
| TOKEN_OPTIMIZER_CHECKPOINT_RETENTION_DAYS | 7 | Days to keep checkpoints before cleanup |
| TOKEN_OPTIMIZER_CHECKPOINT_RETENTION_MAX | 50 | Maximum checkpoints to scan for restore |
| TOKEN_OPTIMIZER_CHECKPOINT_TELEMETRY | off | Set to 1 to enable the checkpoint-stats telemetry summary |
| TOKEN_OPTIMIZER_PROGRESSIVE_CHECKPOINTS | platform-dependent | Controls progressive checkpoint capture at fill bands (where the platform supports it) |
| TOKEN_OPTIMIZER_SMART_COMPACTION | true | Controls compaction context injection; on by default |
| TOKEN_OPTIMIZER_CONTINUITY | true | Controls session continuity; on by default |
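
The retention and relevance defaults (7 days, 50 checkpoints, a 0.3 relevance threshold) may be easier to picture as a filter pipeline. The sketch below is an assumption about how these three limits could interact, not a description of the engine: the function name, the checkpoint dictionary keys, and the newest-first scan order are all hypothetical.

```python
from datetime import datetime, timedelta, timezone


def restorable_checkpoints(checkpoints, now, retention_days=7,
                           scan_max=50, relevance_threshold=0.3):
    """Return checkpoints that pass retention, scan-limit, and relevance filters.

    Each checkpoint is a dict with hypothetical keys "id", "created"
    (timezone-aware datetime), and "relevance" (float).
    """
    cutoff = now - timedelta(days=retention_days)
    # Retention: anything older than the cutoff is treated as cleaned up.
    fresh = [c for c in checkpoints if c["created"] >= cutoff]
    # Assumption: the scan considers the newest checkpoints first.
    newest_first = sorted(fresh, key=lambda c: c["created"], reverse=True)
    scanned = newest_first[:scan_max]
    # Restore only checkpoints at or above the relevance threshold.
    return [c for c in scanned if c["relevance"] >= relevance_threshold]


now = datetime(2025, 1, 10, tzinfo=timezone.utc)
sample = [
    {"id": "a", "created": now - timedelta(days=1), "relevance": 0.5},
    {"id": "b", "created": now - timedelta(days=8), "relevance": 0.9},  # too old
    {"id": "c", "created": now - timedelta(days=2), "relevance": 0.1},  # low relevance
]
restored_ids = [c["id"] for c in restorable_checkpoints(sample, now)]
```

With the sample data, only checkpoint "a" is both recent enough and relevant enough to be returned.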

Quality, nudges, and activity (cross-platform aliases)

These shorter names are used by the TypeScript plugins (OpenCode, OpenClaw) and accepted as cross-platform aliases. They mirror the longer TOKEN_OPTIMIZER_* flags above.

| Variable | Default | Effect |
| --- | --- | --- |
| TOKEN_OPTIMIZER_NUDGES | true | Enable quality nudges |
| TOKEN_OPTIMIZER_LOOP_DETECTION | true | Enable retry-loop detection |
| TOKEN_OPTIMIZER_ACTIVITY | true | Enable activity-mode tracking |
| TOKEN_OPTIMIZER_TRENDS | true | Enable trends collection |

| Variable | Default | Effect |
| --- | --- | --- |
| TOKEN_OPTIMIZER_HOST | 127.0.0.1 | Bind host for the dashboard server |
| TOKEN_OPTIMIZER_DASHBOARD_HOST | 127.0.0.1 | Dashboard-specific bind host; set to 0.0.0.0 for LAN access before setup-daemon |
| TOKEN_OPTIMIZER_DASHBOARD_TIMEOUT | (engine default) | Timeout for the dashboard server request handling |

Dashboard ports are fixed per runtime: 24842 (Claude Code), 24843 (Codex), 24844 (Hermes), 24845 (Copilot). They are not configured by environment variable; the engine assigns them so that multiple runtimes can serve dashboards simultaneously without conflict. The --port flag on the dashboard command overrides the port for a one-off serve.
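
The port assignment can be pictured as a simple lookup with a one-off override. This is a sketch under assumptions: the dictionary keys reuse the TOKEN_OPTIMIZER_RUNTIME adapter names, and dashboard_port is a hypothetical helper, not an engine function.

```python
# Fixed dashboard ports, keyed by runtime adapter name (key names assumed).
DASHBOARD_PORTS = {
    "claude": 24842,
    "codex": 24843,
    "hermes": 24844,
    "copilot": 24845,
}


def dashboard_port(runtime, port_flag=None):
    """Return the port to serve on: the --port value if given, else the fixed port."""
    if port_flag is not None:
        # A one-off override applies to this serve only.
        return port_flag
    return DASHBOARD_PORTS[runtime]
```

Because each runtime maps to a distinct port, two runtimes can serve dashboards at the same time without either one needing configuration.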

| Variable | Default | Effect |
| --- | --- | --- |
| TOKEN_OPTIMIZER_TRENDS_RETENTION_DAYS | unlimited | Days to keep rows in the history database |
| TOKEN_OPTIMIZER_ARCHIVE_RETENTION_HOURS | 24 | Hours to keep archived tool results |
| TOKEN_OPTIMIZER_QUALITY_CACHE_RETENTION_DAYS | 7 | Days to keep quality-cache snapshots |

Per-session file caches are auto-deleted after 48 hours and are not configurable. See Your data and privacy for the full storage map.

Keep-Warm is opt-in and managed by the keepwarm-* commands; there are no plain feature-flag environment variables for it. Consent and billing mode are persisted in config.json and read by keepwarm-consent-status.

| Variable | Default | Effect |
| --- | --- | --- |
| TOKEN_OPTIMIZER_COPILOT_USD_PER_CREDIT | 0.01 | USD value of one Copilot AI credit, for cost display |
| TOKEN_OPTIMIZER_COPILOT_PREMIUM_RATE | 0.04 | USD per premium request |
| TOKEN_OPTIMIZER_COPILOT_CAPS_JSON | (unset) | JSON override of the Copilot capability map when upstream fixes outpace the shipped matrix, for example '{"pretooluse_ctx": true}' |
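
A short sketch may help show how these three variables could be consumed. It rests on assumptions: the override is merged key by key over the shipped map (rather than replacing it), malformed JSON is ignored, and the function names and the "other_cap" key in the usage note are hypothetical.

```python
import json


def copilot_caps(shipped, environ):
    """Apply TOKEN_OPTIMIZER_COPILOT_CAPS_JSON on top of the shipped capability map."""
    raw = environ.get("TOKEN_OPTIMIZER_COPILOT_CAPS_JSON")
    if not raw:
        return dict(shipped)
    try:
        override = json.loads(raw)
    except ValueError:
        # Assumption: a malformed override is ignored rather than fatal.
        return dict(shipped)
    if not isinstance(override, dict):
        return dict(shipped)
    # Assumption: override keys win; keys not mentioned keep shipped values.
    return {**shipped, **override}


def credits_to_usd(credits, environ):
    """Convert Copilot AI credits to USD for display (default 0.01 per credit)."""
    rate = float(environ.get("TOKEN_OPTIMIZER_COPILOT_USD_PER_CREDIT", "0.01"))
    return credits * rate


def premium_requests_to_usd(requests, environ):
    """Convert premium requests to USD for display (default 0.04 per request)."""
    rate = float(environ.get("TOKEN_OPTIMIZER_COPILOT_PREMIUM_RATE", "0.04"))
    return requests * rate
```

For example, with a shipped map of {"pretooluse_ctx": False, "other_cap": True} and the override '{"pretooluse_ctx": true}', the merged map flips pretooluse_ctx and leaves other_cap untouched.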

The pricing tier drives every cost and savings calculation. Set it with measure.py pricing-tier <tier>; the value persists in config.json. The default is anthropic.

| Tier | Pricing basis |
| --- | --- |
| anthropic | Anthropic API direct rates (default) |
| Vertex AI Global | Google Vertex AI global rates |
| Vertex AI Regional | Vertex AI regional rates (a regional premium applies) |
| AWS Bedrock | AWS Bedrock rates |
| subscription | Flat-rate subscription; Keep-Warm dollar savings are not applicable and Keep-Warm stays off |

| Variable | Default | Effect |
| --- | --- | --- |
| TOKEN_OPTIMIZER_RUNTIME | claude | Select the platform adapter: claude, codex, copilot, hermes |
| TOKEN_OPTIMIZER_SKIP_VERIFY | off | Set to 1 to skip checksum verification during script install (not recommended) |
| TOKEN_OPTIMIZER_STAR_ASK | enabled | Set to 0 to disable the one-time GitHub star offer entirely |
| HERMES_HOME | ~/.hermes | Override the Hermes home directory for the Hermes adapter |

The keys below are the persistent equivalents of the runtime flags. Change them through the commands rather than by hand; the table documents what each key holds.

| Key | Holds | Set by |
| --- | --- | --- |
| v5_bash_compress | Bash compression on/off | v5 enable\|disable bash_compress |
| read_cache_enabled | Read cache on/off | v5 and read-cache commands |
| v5_delta_mode | Delta mode on/off | v5 enable\|disable delta_mode |
| v5_structure_map_beta | Structure-map event logging on/off | v5 enable\|disable structure_map |
| v5_quality_nudges | Quality nudges on/off | v5 enable\|disable quality_nudges |
| v5_loop_detection | Loop detection on/off | v5 enable\|disable loop_detection |
| quality_bar_disabled | Sticky status-line opt-out | setup-quality-bar --uninstall |
| consent status | Data-notice acknowledgment | consent --grant\|--reset |
| daemon consent | Bookmarkable-URL consent | daemon-consent --set |
| pricing tier | Active pricing tier | pricing-tier <tier> |
| keep-warm consent and billing mode | Keep-Warm enablement state | keepwarm-enable, keepwarm-disable |

Settings resolve in this order, highest precedence first:

  1. Command-line flags (for example --context-size) win for the single command.
  2. Environment variables win over config.json for the run.
  3. config.json holds the persistent default.
  4. Engine defaults apply when nothing else is set.
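
The four-level order above amounts to "first source that supplies a value wins". A minimal sketch, assuming each source reports an unset setting as None (the function name and the source labels are hypothetical):

```python
def resolve_setting(flag_value, env_value, config_value, engine_default):
    """Return (value, source) from the highest-precedence source that is set."""
    candidates = (
        ("flag", flag_value),            # 1. command-line flag, this command only
        ("env", env_value),              # 2. environment variable, this run
        ("config.json", config_value),   # 3. persistent default
    )
    for source, value in candidates:
        if value is not None:
            return value, source
    # 4. Nothing else is set, so the engine default applies.
    return engine_default, "engine default"
```

For instance, resolve_setting(None, "200000", None, "auto-detected") returns ("200000", "env"), mirroring a run where only TOKEN_OPTIMIZER_CONTEXT_SIZE is set; the value 200000 is purely illustrative.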

A value marked (verify) anywhere in these docs means the source did not state that detail definitively. None appear in the tables on this page; every default above is confirmed against the engine or the platform adapter source.