A real 6-turn session costs 92.3% less. Cost is reported in absolute $/query-resolved: ratios can be
gamed; dollars can’t. From the LEAP Lab @ Tsinghua University.
Quickstart
Install, connect, and watch the savings in three commands.
The Protocol
Three-color bands and monotonic append — why the cache cannot break.
Support Matrix
Harnesses, frontier models, and self-hosted inference frameworks.
Why TELOS
TELOS solves exactly two things.
Push token efficiency to the limit
A real 6-turn session costs 92.3% less; a controlled 48-call run costs 36.6% less (net −$2.16).
Every cent is accounted for in absolute $/query-resolved.
Return context sovereignty to you
TelosIR is an engine-agnostic, serializable, portable context representation — your persona,
your tools, your 20-turn thread, all on the same stone tablet. Hand it to Claude today,
move it to DeepSeek tomorrow, run it on local vLLM tonight.
Key capabilities
Three-color bands
Every block declares its cache lifetime at birth: PIN / FOLD / DROP.
Monotonic append
The prompt is an append-only stream — submitted bytes are never mutated.
Multi-engine adapters
Anthropic, OpenAI, DeepSeek, vLLM and SGLang from one IR.
RTK output filtering
An orthogonal layer that shrinks repetitive `tool_result` tails.
Savings dashboard
Offline HTML, savings pinned to absolute dollars.
Zero-intrusion proxy
Point `ANTHROPIC_BASE_URL` at the local gateway, with no code changes.
2 a.m. Where did all the money go?
The counter climbs to 2,847,103 tokens, and the line above reads `cache_read: 0`. All night, every
turn fed the same 4,000-token system prompt from scratch, billed at full price. Take the exact
same 6-turn conversation, drop it into TELOS, and flip two switches:
| Mode | raw input tokens | cache_read | Cost for 6 turns |
|---|---|---|---|
| passthrough (today’s default) | 24,151 | 0 | $0.3623 |
| with TELOS | 0 | 18,701 | $0.0281 (−92.3%) |
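
The dollar column can be checked by hand. The sketch below recomputes both rows from the token counts in the table. The two per-million-token rates are assumptions, not figures stated on this page; they were picked because they reproduce the table's dollar amounts. Output tokens and any cache-write charges are ignored, since the table does not break them out.

```python
# Assumed rates in USD per million tokens. These are NOT stated in the source;
# they are chosen because they reproduce the dollar figures in the table.
ASSUMED_INPUT_USD_PER_MTOK = 15.00
ASSUMED_CACHE_READ_USD_PER_MTOK = 1.50


def session_cost(raw_input_tokens: int, cache_read_tokens: int) -> float:
    """Input-side cost in dollars for one session under the assumed rates."""
    micro_dollars = (
        raw_input_tokens * ASSUMED_INPUT_USD_PER_MTOK
        + cache_read_tokens * ASSUMED_CACHE_READ_USD_PER_MTOK
    )
    return micro_dollars / 1_000_000


# Token counts copied from the table rows.
passthrough = session_cost(raw_input_tokens=24_151, cache_read_tokens=0)
with_telos = session_cost(raw_input_tokens=0, cache_read_tokens=18_701)

# Relative saving, derived from the two absolute dollar amounts.
saving_pct = 100 * (passthrough - with_telos) / passthrough

print(f"passthrough: ${passthrough:.4f}")
print(f"with TELOS:  ${with_telos:.4f}")
print(f"saving:      {saving_pct:.1f}%")
```

Under these assumed rates the script lands on the same rounded figures as the table. Computing the percentage from the two dollar amounts, rather than quoting it on its own, keeps the ratio anchored to absolute cost.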
Three steps to save 90%
Connect
How it fits together
Learn more
Architecture
The three-layer harness → bridge → engine design.
SWE-bench A/B
Pre-registered study: half the input bill, no resolved-rate regression.
CLI Reference
Every `telos` subcommand and flag.