Compress tool outputs, logs, files, and RAG chunks before they reach the LLM. 60-95% fewer tokens, same answers. Library, proxy, MCP server.
Updated May 22, 2026 - Python
Find the ghost tokens. Fix them. Survive compaction. Avoid context quality decay.
Up to 71.5x fewer tokens per session on Claude Code with Obsidian + Graphify. Persistent memory, codebase knowledge graphs, and chat import pipeline. 🇧🇷 PT-BR included.
Working memory for Claude Code - persistent context and multi-instance coordination
Open-source context engine that catches AI hallucinations and cuts your token bill 70–95%. The only AI helper that shows its work. Claude · Cursor · Codex · GPT & Custom Providers
An MCP server that executes Python code in isolated rootless containers with optional MCP server proxying. Implementation of Anthropic's and Cloudflare's ideas for reducing MCP tool definitions context bloat.
Your agents are guessing at APIs. Give them the actual Agent-Native spec. 1500+ APIs as ready-to-use skills. Compile any API spec into a lean, agent-native format. 10× smaller. OpenAPI, GraphQL, AsyncAPI, Protobuf, Postman.
Pith is the hook that makes Claude Code sessions last 3x longer.
Claude Code usage governor: compact professional output, context slimming, tool-output filtering, telemetry, and drift guardrails.
A small Recursive Language Model: let any LLM run code on its context instead of stuffing it into the prompt.
State-aware knowledge compression, ingestion, and hybrid retrieval engine. Zero dependencies. Sub-100ms queries.
OCTAVE protocol - structured AI communication with 3-20x token reduction. MCP server with lenient-to-canonical pipeline and schema validation.
Claude Code plugin that offloads large outputs to filesystem and retrieves when required.
🚀 Lightweight Python library for building production LLM applications with smart context management and automatic token optimization. Save 10-20% on API costs while fitting RAG docs, chat history, and prompts into your token budget.
Multi-model AI workflow engine with built-in quality arbitration. Plan → Approve → Execute → Audit pipeline. Dual-brain design: Brain #1 (planner) + Brain #2 (arbiter, different model) stops hallucination leaks cold. Zero-config token saving.
🦞 LobsterPress (龙虾饼) - Cognitive Memory System for AI Agents. A permanent memory engine for LLMs, grounded in cognitive science.
Give your Claude Code Agent Teams a memory. Auto-injects role-specific context into every new teammate, so your team never starts blind again.
The meta-architecture behind a high-leverage Claude Code setup. Installed by the very tool it optimizes.
Token-compression skill. An adaptation of caveman: short common words, trust context, say just enough, be laconic.
📊 Per-tool-call context window analyzer for Claude Code