
State-preserving context compression

Full history is expensive. Generic summaries are unsafe. Aionis compresses long execution history into governed state that keeps the next agent on the accepted route.

Aionis does not optimize for the shortest possible summary. It optimizes for compact context that preserves current state, blocked branches, reusable procedure, rehydrate pointers, and an audit trail.

The problem with full history

The easiest way to preserve execution state is to pass the entire history forward. That works until the history becomes the product bottleneck. Long-running coding agents accumulate plans, failed attempts, verifier output, file diffs, stale notes, and handoff messages. Full history keeps all of it, including the pieces the next agent should not treat as current instruction.

The stronger baseline is not an agent with no memory. It is full history, because a capable model can often read the whole transcript and infer the current route. The problem is cost and reliability under noise. In the runner-fixed GLM external-agent run, full history matched Aionis on route safety but used 2,060,529 prompt tokens across the scored rows. Aionis used 604,816 prompt tokens while preserving the same 0% wrong write, 0% wrong attention, 100% accepted direction, and 100% action completion on those rows.

That is the product claim that matters: Aionis is not trying to beat full history by knowing more. It is trying to retain the execution state full history carries, while removing the context mass that makes long-horizon agents expensive and brittle.

Why ordinary summaries are not enough

A normal summary compresses language. Aionis compresses state. That difference is not cosmetic. If an agent tried route A, failed verification, moved to route B, and accepted route B, the next prompt needs more than a short story. It needs the current route, the retired route, the reason route A should not be used as direction, the target files or symbols that define route B, and a pointer to raw evidence if the compact packet is not enough.

Generic summaries tend to blur exactly those boundaries. They preserve the narrative while weakening the action contract. Raw retrieval has the opposite failure mode: it can bring back the most semantically related memory even when that memory is stale, failed, or contested. In both cases, the agent receives text, but not a governed execution state.

State-preserving compression treats failed branches, stale premises, and rehydrate pointers as first-class objects. It does not delete risky evidence. It changes how that evidence is allowed to influence the next action.

The Aionis compression contract

Aionis compiles memory into four action surfaces. use_now carries accepted state. inspect_before_use keeps uncertain evidence visible but gated. do_not_use prevents unsafe memory from becoming instruction. rehydrate points to raw detail that can be restored on demand.
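
As a rough illustration of that contract, the sketch below sorts remembered items onto the four surfaces. It is not the Aionis implementation: the `MemoryItem` and `Packet` types, the `status` labels, the `compile_packet` function, and the `raw://` pointers are hypothetical names chosen for this example, and it assumes each item already carries a lifecycle label.

```python
from dataclasses import dataclass, field


@dataclass
class MemoryItem:
    text: str
    status: str        # hypothetical label: "accepted", "uncertain", "failed", or "stale"
    raw_ref: str = ""  # hypothetical pointer to the raw evidence behind this item


@dataclass
class Packet:
    use_now: list = field(default_factory=list)
    inspect_before_use: list = field(default_factory=list)
    do_not_use: list = field(default_factory=list)
    rehydrate: list = field(default_factory=list)


def compile_packet(items):
    """Place each remembered item on exactly one action surface."""
    packet = Packet()
    for item in items:
        if item.status == "accepted":
            packet.use_now.append(item.text)
        elif item.status == "uncertain":
            packet.inspect_before_use.append(item.text)
        else:
            # Failed or stale items stay visible but may not act as instruction.
            packet.do_not_use.append(item.text)
        if item.raw_ref:
            # Keep a pointer so raw detail can be restored on demand.
            packet.rehydrate.append(item.raw_ref)
    return packet


history = [
    MemoryItem("Route A: patch parser in place", "failed", "raw://episode-1/verifier-log"),
    MemoryItem("Route B: replace parser module", "accepted", "raw://episode-2/diff"),
    MemoryItem("Note: cache may need invalidation", "uncertain"),
]
packet = compile_packet(history)
```

In this toy run the failed route A lands in do_not_use instead of being deleted, so it remains auditable without being read as direction.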

That contract gives compression a different shape. The runtime is not just deciding what to omit. It is deciding which remembered items are allowed to act, which are only evidence, and which need raw expansion before the agent edits code or continues a workflow.

This is why Aionis can sit behind SDK, API, CLI, or MCP without becoming the agent runner. The agent still acts. Aionis decides what execution memory is safe and sufficient to enter the next prompt.

Route evidence and executable evidence are different

One of the useful findings from the external-agent work was that route correctness and action executability are separate budgets. A compact prompt can be good enough to tell the agent where to go, but still too thin to tell it exactly how to patch the target.

That distinction changed the design. Route evidence answers: which branch survived, which target is active, and which path should not be extended. Executable evidence answers: what symbol, region, patch anchor, acceptance check, or file creation intent is needed to act safely. If executable evidence is missing, the correct behavior is not to guess. The correct behavior is to rehydrate the minimal raw payload.
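
A minimal sketch of that rule, under stated assumptions: the field names below (`accepted_branch`, `active_target`, `symbol`, `patch_anchor`, `acceptance_check`) are hypothetical stand-ins for route and executable evidence, and the real packet schema may differ. The point is only that missing evidence produces a rehydrate request rather than a guess.

```python
# Hypothetical field names; the actual packet schema is not given in this article.
ROUTE_FIELDS = ("accepted_branch", "active_target")
EXECUTABLE_FIELDS = ("symbol", "patch_anchor", "acceptance_check")


def missing_fields(evidence, required):
    """Return the required fields that are absent or empty in `evidence`."""
    return [name for name in required if not evidence.get(name)]


def decide(evidence):
    """Return ("act", []) or ("rehydrate", fields_to_restore)."""
    missing = missing_fields(evidence, ROUTE_FIELDS + EXECUTABLE_FIELDS)
    if missing:
        # Do not guess: ask for the minimal raw payload covering the gaps.
        return "rehydrate", missing
    return "act", []


# Enough to know where to go, but not how to patch.
thin_packet = {"accepted_branch": "route-b", "active_target": "parser.py"}

# Same route evidence plus the detail needed to act.
full_packet = dict(
    thin_packet,
    symbol="parse_header",
    patch_anchor="def parse_header(",
    acceptance_check="pytest tests/test_parser.py",
)

thin_decision = decide(thin_packet)
full_decision = decide(full_packet)
```

The thin packet is route-correct yet still triggers rehydration, which is the separation between the two budgets described above.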

This is the point of state-preserving context compression. The goal is not to blindly shrink context. The goal is to keep route safety while adding just enough action evidence for the model in front of you. A conservative model may need stronger patch evidence. A stronger model may need less. The runtime contract should expose that boundary instead of hiding it inside a summary.

Compression suite results

The deterministic 100-scenario state-preserving compression suite compared full history, naive summary, raw retrieval, and Aionis. Aionis produced a mean context size of 610.95 characters versus 2,735.55 for full history, a 77.2% compression rate. At the same time it preserved 100% current-state recall, 100% negative-memory recall, and 100% procedure retention, with 0% stale leak and 0% forbidden leak.
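
For readers who want to reproduce the arithmetic, the sketch below shows two common ways to compute a compression rate from character counts. The function names and the toy scenario sizes are assumptions for illustration. Pooling the two mean sizes reported above gives roughly 77.7%, slightly different from the reported 77.2%; one possible explanation, which the source does not confirm, is that the reported figure averages per-scenario rates.

```python
def compression_rate(full_chars, compact_chars):
    """Fraction of characters removed relative to full history."""
    return 1.0 - compact_chars / full_chars


def mean_of_rates(pairs):
    """Average the compression rate computed separately for each scenario."""
    return sum(compression_rate(full, compact) for full, compact in pairs) / len(pairs)


def rate_of_means(pairs):
    """Compute one rate from the pooled (summed) sizes across scenarios."""
    full_total = sum(full for full, _ in pairs)
    compact_total = sum(compact for _, compact in pairs)
    return compression_rate(full_total, compact_total)


# Hypothetical (full, compact) character counts for two scenarios.
toy = [(1000, 200), (4000, 1200)]
per_scenario = mean_of_rates(toy)  # (0.80 + 0.70) / 2 = 0.75
pooled = rate_of_means(toy)        # 1 - 1400 / 5000 = 0.72

# Pooled rate from the mean context sizes reported in this section.
reported_pooled = compression_rate(2735.55, 610.95)  # about 0.777
```

The two aggregates differ whenever scenarios vary in length, so a published rate is easier to verify when it states which one it uses.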

The LLM-scored 24-scenario subset showed the same direction under downstream action scoring. Aionis reached 95.8% downstream action accuracy with 76.9% compression, 0% stale leak, 0% forbidden leak, and 100% audit coverage. Full history scored 75.0% downstream accuracy in that subset, with 12.5% stale leak and 100% forbidden leak because the whole history was visible without admission routing.

The 200-scenario unlabelled lifecycle holdout removed explicit lifecycle labels and scrubbed cue wording. Aionis Runtime still recovered 96.5% current-state recall, 96.8% use-now recall, 98.0% negative recall, 100% stale recall, 97.0% procedure retention, and 100% rehydrate recall. It was not perfect: forbidden direct-use leakage remained at 2.0%. That failure bucket matters because the point of this evaluation is not to pretend compression is solved. The point is to measure where governed compression holds and where it still needs work.

External agent results

The stronger product test is not a synthetic compression score. It is whether a real agent can continue work across a hard episode cut using less context. In the 40-record five-arm external-agent run, Aionis was compared against no memory, full history, BM25 retrieval, and Mem0 under the same trap manifest.

In the DeepSeek route-contract run, Aionis reached 0% wrong write, 0% wrong attention, and 100% accepted direction with 219,917 prompt tokens. Full history also reached 0% wrong write, 0% wrong attention, and 100% accepted direction, but used 985,523 prompt tokens. BM25 used 344,280 prompt tokens, and Mem0 used 624,873.

The GLM runner-fixed run is a cleaner statement of the current claim. On the scored rows, full history, BM25, Mem0, and Aionis all reached 0% wrong write, 0% wrong attention, 100% accepted direction, and 100% action completion. Aionis used 604,816 prompt tokens, compared with 2,060,529 for full history, 724,652 for BM25, and 1,317,009 for Mem0. In that run, Aionis did not win by being uniquely correct. It won by matching the strongest route-safety behavior with much less prompt context.

At the buried-history level, the stress point was even clearer. Aionis used 144,897 prompt tokens across the scored buried rows while matching 0% wrong write, 0% wrong attention, 100% accepted direction, and 100% action completion. Full history used 1,183,803 prompt tokens for the same safety and completion result.
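
The token savings implied by these figures can be checked directly. The snippet below uses only the prompt-token counts reported in this section; the dictionary layout and the `token_reduction` helper are illustrative.

```python
# Prompt-token counts as reported in this section.
RUNS = {
    "DeepSeek route-contract": {"full_history": 985_523, "aionis": 219_917},
    "GLM runner-fixed": {"full_history": 2_060_529, "aionis": 604_816},
    "Buried-history rows": {"full_history": 1_183_803, "aionis": 144_897},
}


def token_reduction(full, compact):
    """Fraction of prompt tokens saved relative to full history."""
    return 1.0 - compact / full


reductions = {
    name: round(100 * token_reduction(run["full_history"], run["aionis"]), 1)
    for name, run in RUNS.items()
}

for name, pct in reductions.items():
    print(f"{name}: {pct}% fewer prompt tokens than full history")
```

This works out to roughly 77.7% fewer prompt tokens in the DeepSeek run, 70.6% in the GLM run, and 87.8% on the buried-history rows.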

What this proves and what it does not

The evidence supports a precise claim: Aionis can keep long-running agents on the accepted execution route with far smaller governed context than full-history transfer, while preserving auditability and rehydrate hooks.

The evidence does not support the stronger claim that Aionis uniquely prevents every wrong-branch write. In the latest memory-bearing runs, full history and retrieval baselines also avoided wrong writes. That is not a problem for the product. It simply means the better story is not fear-based. The better story is context-cost robustness: full-history-level route safety without full-history-level context.

This distinction matters for users. If your agent only runs short single-session tasks, full history may be enough. If your agent crosses sessions, hands work between planner, worker, verifier, and reviewer, or carries weeks of execution residue, the problem changes. You need the state that matters, the blocked routes that should not act, and the raw evidence pointer when compact context is not enough.

Why auditability belongs inside compression

Aionis compression is not just smaller text. Every admitted or suppressed item can be connected to a decision trace, receipt, and feedback attribution. That is what lets the runtime answer a question ordinary summaries cannot answer: why did this memory influence the agent, and what happened after it did?

That audit trail is also how the system improves. Once the runtime records which compact state was exposed, which memory was used, and which outcome followed, compression stops being a one-shot prompt trick. It becomes a measurable admission policy.
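
One way to picture a measurable admission policy is a log that ties each admitted or suppressed memory to a run, joined later with what happened in that run. The sketch below is hypothetical throughout: `AdmissionRecord`, its fields, and `acceptance_by_memory` are illustrative names, not the Aionis trace or receipt format.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class AdmissionRecord:
    run_id: str
    memory_id: str
    surface: str  # "use_now", "inspect_before_use", "do_not_use", or "rehydrate"
    reason: str   # why the item was admitted or suppressed


def acceptance_by_memory(records, outcomes):
    """Share of accepted runs for each memory that was admitted as use_now.

    `outcomes` maps run_id -> True if that run's action was accepted.
    """
    counts = defaultdict(lambda: [0, 0])  # memory_id -> [accepted, total]
    for rec in records:
        if rec.surface != "use_now" or rec.run_id not in outcomes:
            continue
        counts[rec.memory_id][0] += int(outcomes[rec.run_id])
        counts[rec.memory_id][1] += 1
    return {memory_id: ok / total for memory_id, (ok, total) in counts.items()}


log = [
    AdmissionRecord("run-1", "route-b", "use_now", "verifier accepted route B"),
    AdmissionRecord("run-1", "route-a", "do_not_use", "failed verification"),
    AdmissionRecord("run-2", "route-b", "use_now", "verifier accepted route B"),
]
outcomes = {"run-1": True, "run-2": False}
rates = acceptance_by_memory(log, outcomes)
```

With a log like this, the question of why a memory influenced the agent is answered by the `reason` field, and the question of what happened afterwards by the joined outcome.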

The long-term path is a correctness-aware budget controller: keep the prompt small by default, preserve route evidence always, add executable evidence only when needed, and rehydrate raw payload only when compact context is insufficient. That is the version of compression Aionis is built around.
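
The escalation order in that paragraph can be sketched as a small assembly function. This is an assumption-laden illustration, not the planned controller: `assemble_context`, its parameters, and the character budget are hypothetical, and a real controller would decide `need_executable` and `compact_sufficient` from model behavior rather than take them as flags.

```python
def assemble_context(route, executable, load_raw, *,
                     need_executable=False, compact_sufficient=True,
                     budget_chars=2000):
    """Build the next prompt's context in escalating tiers."""
    # Tier 1: route evidence is always kept, even if it exceeds the budget.
    parts = list(route)
    used = sum(len(line) for line in parts)

    # Tier 2: executable evidence only when needed, and only while it fits.
    if need_executable:
        for line in executable:
            if used + len(line) > budget_chars:
                break
            parts.append(line)
            used += len(line)

    # Tier 3: the raw payload is loaded lazily, only when compact context fails.
    if not compact_sufficient:
        parts.append(load_raw())
    return parts


raw_loads = []


def load_raw():
    raw_loads.append(1)  # count how often the expensive payload is fetched
    return "RAW: full verifier log"


route = ["route: B accepted", "blocked: A"]
executable = ["symbol: parse_header", "anchor: def parse_header("]

default_ctx = assemble_context(route, executable, load_raw)
patch_ctx = assemble_context(route, executable, load_raw, need_executable=True)
```

Passing the raw payload as a callable keeps the default path cheap: nothing is fetched unless the compact context has already been judged insufficient.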

The product takeaway

State-preserving context compression is the most concrete Aionis product claim today. It is easier to verify than a broad statement about agent intelligence, and it maps directly to a real operating cost: agents get slower, more expensive, and less reliable when every session inherits the whole past.

Aionis gives the next agent a smaller object with stronger semantics: current route, blocked branch, procedure memory, action evidence, rehydrate pointer, and audit trail. That is why it is execution memory, not chat history.

The practical promise is simple: keep the useful state, drop the noise, preserve the evidence, and make the next run cheaper to reason about.