Context compression layer for AI agents. Compresses tool outputs, logs, RAG chunks, files, and conversation history before they reach the LLM, with reversible retrieval through MCP tools.
Claude Desktop config.json'a ekle
{
"mcpServers": {
"chopratejas-headroom": {
"command": "python",
"args": [
"-m",
"headroom"
]
}
}
} Kaynak kodu al ve yerel olarak çalıştır
git clone https://github.com/chopratejas/headroom.git ~/.mcp/headroom
cd ~/.mcp/headroom ██╗ ██╗███████╗ █████╗ ██████╗ ██████╗ ██████╗ ██████╗ ███╗ ███╗
██║ ██║██╔════╝██╔══██╗██╔══██╗██╔══██╗██╔═══██╗██╔═══██╗████╗ ████║
███████║█████╗ ███████║██║ ██║██████╔╝██║ ██║██║ ██║██╔████╔██║
██╔══██║██╔══╝ ██╔══██║██║ ██║██╔══██╗██║ ██║██║ ██║██║╚██╔╝██║
██║ ██║███████╗██║ ██║██████╔╝██║ ██║╚██████╔╝╚██████╔╝██║ ╚═╝ ██║
╚═╝ ╚═╝╚══════╝╚═╝ ╚═╝╚═════╝ ╚═╝ ╚═╝ ╚═════╝ ╚═════╝ ╚═╝ ╚═╝
The context compression layer for AI agents
60–95% fewer tokens · library · proxy · MCP · 6 algorithms · local-first · reversible
Docs · Install · Proof · Agents · Discord · llms.txt · Enterprise
AI agents / LLMs: read /llms.txt here, or fetch the live index / full docs blob.
Headroom compresses everything your AI agent reads — tool outputs, logs, RAG chunks, files, and conversation history — before it reaches the LLM. Same answers, fraction of the tokens.
Live: 10,144 → 1,260 tokens — same FATAL found.
compress(messages) in Python or TypeScript, inline in any appheadroom proxy --port 8787, zero code changes, any languageheadroom wrap claude|codex|cursor|aider|copilot in one commandheadroom_compress, headroom_retrieve, headroom_stats for any MCP clientheadroom learn — mines failed sessions, writes corrections to CLAUDE.md / AGENTS.md Your agent / app
(Claude Code, Cursor, Codex, LangChain, Agno, Strands, your own code…)
│ prompts · tool outputs · logs · RAG results · files
▼
┌────────────────────────────────────────────────────┐
│ Headroom (runs locally — your data stays here) │
│ ──────────────────────────────────────────────── │
│ CacheAligner → ContentRouter → CCR │
│ ├─ SmartCrusher (JSON) │
│ ├─ CodeCompressor (AST) │
│ └─ Kompress-base (text, HF) │
│ │
│ Cross-agent memory · headroom learn · MCP │
└────────────────────────────────────────────────────┘
│ compressed prompt + retrieval tool
▼
LLM provider (Anthropic · OpenAI · Bedrock · …)
headroom_retrieve if it needs them→ Architecture · CCR reversible compression · Kompress-base model card
# 1 — Install
pip install "headroom-ai[all]" # Python
npm install headroom-ai # Node / TypeScript
# 2 — Pick your mode
headroom wrap claude # wrap a coding agent
headroom proxy --port 8787 # drop-in proxy, zero code changes
# or: from headroom import compress # inline library
# 3 — See the savings
headroom perf
Granular extras: [proxy], [mcp], [ml], [code], [memory], [relevance], [image], [agno], [langchain], [evals]. Requires Python 3.10+.
Savings on real agent workloads:
| Workload | Before | After | Savings |
|---|---|---|---|
| Code search (100 results) | 17,765 | 1,408 | 92% |
| SRE incident debugging | 65,694 | 5,118 | 92% |
| GitHub issue triage | 54,174 | 14,761 | 73% |
| Codebase exploration | 78,502 | 41,254 | 47% |
Accuracy preserved on standard benchmarks:
| Benchmark | Category | N | Baseline | Headroom | Delta |
|---|---|---|---|---|---|
| GSM8K | Math | 100 | 0.870 | 0.870 | ±0.000 |
| TruthfulQA | Factual | 100 | 0.530 | 0.560 | +0.030 |
| SQuAD v2 | QA | 100 | — | 97% | 19% compression |
| BFCL | Tools | 100 | — | 97% | 32% compression |
Reproduce: python -m headroom.evals suite --tier 1 · Full benchmarks & methodology
| Agent | headroom wrap | Notes |
|---|---|---|
| Claude Code | ● | --memory · --code-graph |
| Codex | ● | shares memory with Claude |
| Cursor | ● | prints config — paste once |
| Aider | ● | starts proxy + launches |
| Copilot CLI | ● | starts proxy + launches |
| OpenClaw | ● | installs as ContextEngine plugin |
Any OpenAI-compatible client works via headroom proxy. MCP-native: headroom mcp install.
Headroom can route GitHub Copilot CLI subscription traffic through the local proxy:
headroom wrap copilot --subscription -- --model gpt-4o
This lets Headroom intercept OpenAI-compatible Copilot CLI requests and apply the same proxy compression pipeline before forwarding to GitHub Copilot’s hosted API. The wrapper resolves the account-specific Copilot API endpoint and prints it as COPILOT_PROVIDER_API_URL=... during launch.
Platform support note: macOS auth reuse via Copilot CLI Keychain storage has been smoke-tested. Windows Credential Manager, Linux Secret Service / secret-tool, and Docker/CI token-injection paths are implemented or planned as auth-discovery paths, but still need real OS validation before they should be considered fully vetted. For Docker and CI, prefer passing an explicit GITHUB_COPILOT_TOKEN or GITHUB_COPILOT_GITHUB_TOKEN rather than relying on host keychain access.
Great fit if you…
Skip it if you…
| Your setup | Hook in with |
|---|---|
| Any Python app | compress(messages, model=…) |
| Any TypeScript app | await compress(messages, { model }) |
| Anthropic / OpenAI SDK | withHeadroom(new Anthropic()) · withHeadroom(new OpenAI()) |
| Vercel AI SDK | wrapLanguageModel({ model, middleware: headroomMiddleware() }) |
| LiteLLM | litellm.callbacks = [HeadroomCallback()] |
| LangChain | HeadroomChatModel(your_llm) |
| Agno | HeadroomAgnoModel(your_model) |
| Strands | Strands guide |
| ASGI apps | app.add_middleware(CompressionMiddleware) |
| Multi-agent | SharedContext().put / .get |
| MCP clients | headroom mcp install |
headroom learn — plugin-based failure mining for Claude, Codex, Gemini.Headroom exposes one stable request lifecycle across compress(), the SDK, and the proxy:
Setup → Pre-Start → Post-Start → Input Received → Input Cached → Input Routed → Input Compressed → Input Remembered → Pre-Send → Post-Send → Response Received
on_pipeline_event(...).Provider and tool-specific behavior lives under headroom/providers/ so core orchestration stays focused on lifecycle, sequencing, and policy.
headroom/providers/claude, copilot, codex, openclawheadroom/providers/claude, gemini, plus shared backend/runtime dispatch in headroom/providers/registry.pywrap.py, client.py, cli/proxy.py, and proxy/server.py delegate provider-specific env shaping, API target normalization, backend selection, and transport dispatch.pip install "headroom-ai[all]" # Python, everything
npm install headroom-ai # TypeScript / Node
docker pull ghcr.io/chopratejas/headroom:latest
Granular extras: [proxy], [mcp], [ml] (Kompress-base), [code], [memory], [relevance], [image], [agno], [langchain], [evals]. Requires Python 3.10+.
Using pipx? Choose a supported interpreter explicitly:
pipx install --python python3.13 "headroom-ai[all]"
→ Installation guide — Docker tags, persistent service, PowerShell, devcontainers.
headroom learn — mines failed sessions, writes corrections to CLAUDE.md / AGENTS.md / GEMINI.md.
| Start here | Go deeper |
|---|---|
| Quickstart | Architecture |
| Proxy | How compression works |
| MCP tools | CCR — reversible compression |
| Memory | Cache optimization |
| Failure learning | Benchmarks |
| Configuration | Limitations |
Headroom runs locally, covers every content type, works with every major framework, and is reversible.
| Scope | Deploy | Local | Reversible | |
|---|---|---|---|---|
| Headroom | All context — tools, RAG, logs, files, history | Proxy · library · middleware · MCP | Yes | Yes |
| RTK | CLI command outputs | CLI wrapper | Yes | No |
| lean-ctx | CLI commands, MCP tools, editor rules | CLI wrapper · MCP | Yes | No |
| Compresr, Token Co. | Text sent to their API | Hosted API call | No | No |
| OpenAI Compaction | Conversation history | Provider-native | No | No |
Attribution. Headroom ships with the excellent RTK binary for shell-output rewriting —
git show --short, scopedls, summarized installers. Huge thanks to the RTK team; their tool is a first-class part of our stack, and Headroom compresses everything downstream of it. Headroom can also use lean-ctx as the selected CLI context tool; setHEADROOM_CONTEXT_TOOL=lean-ctxbefore runningheadroom wrap ....
git clone https://github.com/chopratejas/headroom.git && cd headroom
pip install -e ".[dev]" && pytest
Devcontainers in .devcontainer/ (default + memory-stack with Qdrant & Neo4j). See CONTRIBUTING.md.
Apache 2.0 — see LICENSE.
Up-to-date code documentation for LLMs and AI code editors.
Memory manager for AI apps and Agents using various graph and vector stores and allowing ingestion from 30+ data sources.
Universal AI bridge for Obsidian vaults using MCP. Provides safe read/write access to notes with 11 comprehensive methods for vault operations including search, batch operations, tag management, and frontmatter handling. Works with Claude, ChatGPT, and any MCP-compatible AI assistant.
Production-ready RAG platform combining Graph RAG, vector search, and full-text search. Best choice for building your own Knowledge Graph and for Context Engineering
A Model Context Protocol server for Mem0 that helps manage coding preferences and patterns, providing tools for storing, retrieving and semantically handling code implementations, best practices and technical documentation in IDEs like Cursor and Windsurf
Ingest anything from Slack, Discord, websites, Google Drive, Linear or GitHub into a Graphlit project - and then search and retrieve relevant knowledge within an MCP client like Cursor, Windsurf or Cline.