Skip to main content
There are two ways to put TELOS in front of your model API. They run the same TELOS pipeline, the same state accumulation, and the same cache_control injection — the only differences are the process boundary, error handling, and streaming.

Path B · HTTP reverse proxy

Recommended. Zero code changes — set ANTHROPIC_BASE_URL to the local gateway. This is what telos init configures.

Path A · SDK transport

In-process. Swap the LLM client for the TELOS transport; the duck interface is identical.

How they compare

Path B · ProxyPath A · SDK transport
IntegrationANTHROPIC_BASE_URL env var, zero code changeswap the client object
Process modelout-of-process (separate gateway)in-process
StreamingSSE-aware reverse proxydirect SDK streaming
Session statekeyed by session_id in an LRU registryone BridgeSessionState per transport instance
Best forany harness that respects ANTHROPIC_BASE_URL / OPENAI_BASE_URLapps you control the source of
Failure modenon-strict by default: degrades to passthroughraises in-process
Both paths share the same pure function — proxy/pipeline.py:process_anthropic_request(raw, ...) splits out parse → bridge → emit, eliminating any wire drift between the two.

A request, end to end (Path B, mode=both)

Pick one

  • You run a CLI agent (Claude Code, Codex, OpenClaw, Hermes). Use Path B — run telos init and you’re done. See Harness integration.
  • You’re building your own agent in Python. Either works; Path A keeps everything in-process. See SDK transport.