token-optimization

Here are 182 public repositories matching this topic...

chopratejas / headroom

Compress tool outputs, logs, files, and RAG chunks before they reach the LLM. 60-95% fewer tokens, same answers. Library, proxy, MCP server.

Updated May 22, 2026
Python

alexgreensh / token-optimizer

Find the ghost tokens. Fix them. Survive compaction. Avoid context quality decay.

token-usage context-window claude-code token-optimization context-engineering claude-plugin claude-code-skill token-optimizer agentskills ghost-tokens

Updated May 21, 2026
Python

lucasrosati / claude-code-memory-setup

Up to 71.5x fewer tokens per session on Claude Code with Obsidian + Graphify. Persistent memory, codebase knowledge graphs, and chat import pipeline. 🇧🇷 PT-BR included.

knowledge-graph obsidian zettelkasten developer-productivity second-brain ai-tools graphify claude-code token-optimization coding-agent

Updated May 5, 2026
Python

GMaN1911 / claude-cognitive

Working memory for Claude Code - persistent context and multi-instance coordination

productivity developer-tools claude-ai context-management claude-code token-optimization

Updated Jan 17, 2026
Python

juyterman1000 / entroly

Open-source context engine that catches AI hallucinations and cuts your token bill by 70–95%. The only AI helper that shows its work. Claude · Cursor · Codex, GPT & Custom Providers

rust productivity open-source ai mcp cursor ai-agents claude rag llm chatgpt anthropic hallucination-detection context-compression mcp-server claude-code token-optimization llm-grounding ai-hallucination

Updated May 21, 2026
Python

elusznik / mcp-server-code-execution-mode

An MCP server that executes Python code in isolated rootless containers with optional MCP server proxying. Implementation of Anthropic's and Cloudflare's ideas for reducing MCP tool definitions context bloat.

python docker mcp orchestration agents code-execution claude podman anthropic agentic-ai model-context-protocol claude-code token-optimization

Updated Dec 5, 2025
Python

Lap-Platform / LAP

Your agents are guessing at APIs. Give them the actual agent-native spec. 1,500+ APIs as ready-to-use skills. Compile any API spec into a lean, agent-native format that is 10× smaller. Supports OpenAPI, GraphQL, AsyncAPI, Protobuf, and Postman.

Updated Mar 26, 2026
Python

abhisekjha / pith

Pith is the hook that makes Claude Code sessions last 3x longer.

anthropic llm-tools claude-code token-optimization claude-code-plugin

Updated May 6, 2026
Python

0xhimanshu / governor

Claude Code usage governor: compact professional output, context slimming, tool-output filtering, telemetry, and drift guardrails.

cli developer-tools ai-tools llm prompt-engineering claude-ai context-window claude-code token-optimization claude-code-plugin claude-skills

Updated May 19, 2026
Python

avilum / minrlm

A small Recursive Language Model: let any LLM run code on its context instead of stuffing it into the prompt.

agent inference ai-agents inference-engine cost-optimization rlms rlm inference-api llm llm-inference token-economics latency-optimization token-optimization recursive-language-model minrlm

Updated May 21, 2026
Python

castnettech / mnemosyne

State-aware knowledge compression, ingestion, and hybrid retrieval engine. Zero dependencies. Sub-100ms queries.

python open-source developer-tools tfidf bm25 zero-dependencies code-retrieval llm context-compression token-optimization

Updated May 16, 2026
Python

elevanaltd / octave-mcp

OCTAVE protocol - structured AI communication with 3-20x token reduction. MCP server with lenient-to-canonical pipeline and schema validation.

python ai mcp protocol llm model-context-protocol token-optimization

Updated May 20, 2026
Python

sheeki03 / Few-Word

Claude Code plugin that offloads large outputs to the filesystem and retrieves them when required.

ai-agents claude-code token-optimization context-engineering claude-code-hooks claude-code-plugin claude-code-plugins claude-code-skills claude-code-skill

Updated Jan 23, 2026
Python
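
The offloading idea in the Few-Word description above can be illustrated with a minimal sketch. This is not the plugin's actual code: the function names `offload_if_large` and `retrieve`, the size threshold, the preview length, and the file naming scheme are all hypothetical, chosen only to show the general pattern of replacing a large output with a short pointer and reading back a slice on demand.

```python
import hashlib
from pathlib import Path

# Hypothetical limits for illustration; not taken from the Few-Word plugin.
MAX_INLINE_CHARS = 2000
PREVIEW_CHARS = 200


def offload_if_large(output: str, store_dir: Path) -> str:
    """Return small outputs unchanged; write large ones to disk and return a short pointer."""
    if len(output) <= MAX_INLINE_CHARS:
        return output
    # Name the file after a content hash so identical outputs share one file.
    digest = hashlib.sha256(output.encode("utf-8")).hexdigest()[:12]
    path = store_dir / f"output-{digest}.txt"
    path.write_text(output, encoding="utf-8")
    # Only the pointer line and a short preview stay in the context window.
    preview = output[:PREVIEW_CHARS]
    return f"[offloaded {len(output)} chars to {path}]\n{preview}"


def retrieve(path: Path, start: int = 0, length: int = 1000) -> str:
    """Read back only the requested slice of an offloaded output."""
    text = path.read_text(encoding="utf-8")
    return text[start:start + length]
```

In this sketch the model sees only the pointer and preview, and a later call to `retrieve` pulls in just the part of the file it needs, which is where the token savings would come from.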

JacobHuang91 / prompt-refiner

🚀 Lightweight Python library for building production LLM applications with smart context management and automatic token optimization. Save 10-20% on API costs while fitting RAG docs, chat history, and prompts into your token budget.

python machine-learning openai ai-agents cost-optimization rag llm prompt-engineering langchain anthropic function-calling prompt-optimization token-optimization

Updated Apr 12, 2026
Python

wdnmd1265 / ai-flow-architect

Multi-model AI workflow engine with built-in quality arbitration. Plan → Approve → Execute → Audit pipeline. Dual-brain design: Brain #1 (planner) + Brain #2 (arbiter, different model) stops hallucination leaks cold. Zero-config token saving.