SuperAgent v2.2
Cost-aware AI routing brain. 8 platforms. ~95% token savings on real workloads. Public release this week — MIT licensed.
There's a quiet line item on every AI engineer's monthly statement: tokens you didn't have to spend. The model re-read a file you'd already shown it. The agent ran a search you'd already run. A subagent loaded a 4,000-line file to answer a one-line question. None of it shows up in your IDE. It shows up at the end of the month, on a bill, with no good way to investigate.
What SuperAgent is
SuperAgent is a routing brain that sits between me and the AI tool I'm using. It reads my intent, scores it against every skill it knows about, and picks a chain.
The chain is what matters. Most "agent frameworks" pick a model and let that model do everything. SuperAgent picks a skill sequence and lets each skill run with its own minimum-viable model.
Intent → skill chain → minimum-viable model per step. That's the whole product. Everything else is plumbing.
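That loop can be sketched in a few lines. Everything here (the skill names, the naive keyword scorer standing in for real intent matching, the model labels) is an illustrative assumption, not SuperAgent's actual catalog or API:

```python
# Sketch of intent -> skill chain -> minimum-viable model per step.
# Catalog entries and the scorer are made up for illustration.
from dataclasses import dataclass

@dataclass
class Skill:
    name: str
    keywords: set[str]
    preferred_model: str  # cheapest model known to handle this skill well

CATALOG = [
    Skill("debugging", {"fix", "bug", "crash"}, "claude-sonnet"),
    Skill("tdd", {"test", "tdd", "spec"}, "claude-sonnet"),
    Skill("implementation", {"implement", "write", "fix"}, "claude-haiku"),
]

def score(skill: Skill, intent: str) -> int:
    # Toy scorer: keyword overlap. The real router does more than this.
    return len(set(intent.lower().split()) & skill.keywords)

def build_chain(intent: str, top_n: int = 2) -> list[tuple[str, str]]:
    ranked = sorted(CATALOG, key=lambda s: score(s, intent), reverse=True)
    chain = [s for s in ranked if score(s, intent) > 0][:top_n]
    return [(s.name, s.preferred_model) for s in chain]

chain = build_chain("fix the hydration bug")
# → [("debugging", "claude-sonnet"), ("implementation", "claude-haiku")]
```

The point of the sketch: the model is chosen per step, after the chain is built, never once for the whole task.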
What v2.2 added
Two big things shipped in this release:
1. Multi-domain expansion
Earlier versions of SuperAgent only knew about coding skills — TDD, debugging, code review, etc. v2.2 added content skills (writing, editing, research), product skills (CEO review, eng review, design review), and ops skills (deploy, env vars, status checks).
The skill catalog is now ~50 skills. Each skill knows its own preferred model. A "TDD" run might route to Sonnet for the test design and Haiku for the implementation. A "design review" run might route to Opus for the critique and Haiku for the rewrites.
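Concretely, a chain might carry its per-step model like this. The chain contents and the prices are made-up illustrations, not real catalog entries or real per-call pricing:

```python
# Illustrative chains: each step names the cheapest model believed to
# handle it. Step names, models, and prices are assumptions.
TDD_CHAIN = [
    ("design-tests", "claude-sonnet"),  # harder reasoning: pick the right cases
    ("implement", "claude-haiku"),      # mechanical: make the tests pass
]
DESIGN_REVIEW_CHAIN = [
    ("critique", "claude-opus"),        # judgment-heavy step gets the big model
    ("rewrite", "claude-haiku"),        # applying the critique is cheap
]

def chain_cost(chain, price_per_call):
    # Rough per-run cost: sum of each step's model price.
    return sum(price_per_call[model] for _, model in chain)

# Hypothetical per-call prices, for the arithmetic only.
prices = {"claude-opus": 0.15, "claude-sonnet": 0.03, "claude-haiku": 0.003}
tdd_cost = chain_cost(TDD_CHAIN, prices)  # roughly $0.033 under these toy prices
```

Under these toy numbers the mixed-model TDD chain costs about half of running both steps on Sonnet, which is the whole argument for per-step routing.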
2. Cost-aware brain
The router doesn't just match keywords anymore. It reads:
- the user's monthly token budget (configurable)
- the running cost of the current session
- the model's historical price-to-quality on that exact skill class
If the budget is tight, the router downshifts: Sonnet jobs become Haiku jobs, and multi-step chains collapse into a single call with a tighter prompt. Below a configurable floor, it switches to a free LLM (Groq Llama, local Ollama).
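A minimal sketch of that downshift policy, assuming made-up tier names and thresholds (the real router's floor and cutoffs are configurable and its policy may differ):

```python
# Budget-aware downshifting sketch. Tier order, the 50% "tight" cutoff,
# and the 90% free floor are all illustrative assumptions.
TIERS = ["claude-opus", "claude-sonnet", "claude-haiku", "free-llama"]

def pick_model(preferred: str, spent: float, budget: float,
               free_floor: float = 0.9) -> str:
    remaining = 1.0 - spent / budget
    if remaining <= 1.0 - free_floor:   # past the floor: free models only
        return "free-llama"
    if remaining < 0.5:                 # budget tight: drop one tier
        i = TIERS.index(preferred)
        return TIERS[min(i + 1, len(TIERS) - 1)]
    return preferred

# Plenty of budget left: keep the skill's preferred model.
pick_model("claude-sonnet", spent=10.0, budget=100.0)   # "claude-sonnet"
# Over half spent: Sonnet jobs become Haiku.
pick_model("claude-sonnet", spent=60.0, budget=100.0)   # "claude-haiku"
# Past the floor: switch to a free LLM.
pick_model("claude-sonnet", spent=95.0, budget=100.0)   # "free-llama"
```

The useful property is that the skill still states its preferred model; the budget only overrides it downward, never upward.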
The ~95% number
This is the part everyone asks about, so here's where it comes from.
The reduction has three sources:
- Graphify. Instead of re-reading a codebase every session, SuperAgent builds a compressed knowledge graph once and queries it. Codebase exploration that used to read 71 files now reads ~3.
- Mempalace. Cross-session memory. Past observations are recallable as IDs, not full re-reads.
- Cheapest viable model. As above — Haiku for simple, Sonnet for complex, free for cheap.
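The Graphify lever alone gets close to the headline number. A back-of-envelope, using the 71-to-3 file ratio from above; the average file size is an assumed figure, not a measurement:

```python
# Back-of-envelope: token reduction from querying a graph instead of
# re-reading files every session. avg_file_tokens is assumed.
avg_file_tokens = 1_500

before = 71 * avg_file_tokens   # old exploration: read 71 files
after = 3 * avg_file_tokens     # graph-backed exploration: ~3 reads

reduction = 1 - after / before
print(f"{reduction:.1%}")       # the file-read cut alone is ~95.8%
```

Mempalace recall and model downshifting stack on top of that cut, which is why the blended number can hold up on workloads that read fewer files.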
Works with: Claude Code, Cursor, Copilot, Codex, Gemini CLI…
The original v1 was Claude Code only. v2.2 ships with adapters for eight AI platforms. The skill files are platform-agnostic markdown — each tool gets a thin shim that registers them as native commands.
```shell
# Claude Code
git clone https://github.com/animeshbasak/SuperAgent
bash SuperAgent/install-universal.sh

# Then in your terminal:
/superagent "fix the hydration bug"
```
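The shim idea can be sketched like this. The directory layout, and the notion that registration is just placing the markdown where a tool looks for commands, are assumptions for illustration; each real adapter differs per platform:

```python
# Hypothetical thin shim: map platform-agnostic skill markdown onto a
# tool's native command folder. Paths and layout are assumptions.
from pathlib import Path

def register_skills(skills_dir: str, commands_dir: str) -> list[str]:
    """Copy each skill's markdown into the tool's command folder."""
    registered = []
    out = Path(commands_dir)
    out.mkdir(parents=True, exist_ok=True)
    for skill in sorted(Path(skills_dir).glob("*.md")):
        # The skill body is untouched; only its location changes per
        # platform, which is what lets the shim stay thin.
        (out / skill.name).write_text(skill.read_text())
        registered.append(skill.stem)
    return registered
```

Because the skill files themselves never change, adding a ninth platform is a new shim, not a new skill catalog.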
What I learned shipping it publicly
Two things I didn't expect:
The hardest part of going public wasn't the code. It was naming things consistently. "Skill" vs. "agent" vs. "subagent" vs. "command" vs. "tool" — every platform uses different vocabulary. Half the install script is a vocabulary translation table.
The other thing: cost-aware routing is more of a UX problem than a model-selection problem. The router can pick the perfect model, but if the user doesn't trust the routing decision, they'll force-override it back to Opus. The dashboard (every routing decision logged, every cost attributed, every override remembered) does more for adoption than the brain itself.
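For illustration, here is the kind of record such a dashboard needs: the decision, the attributed cost, and whether the user overrode it. Field names are hypothetical, not SuperAgent's actual log schema:

```python
# Hypothetical routing-log record: enough to audit every decision and
# measure how often users override the router.
from dataclasses import dataclass
from typing import Optional

@dataclass
class RoutingDecision:
    intent: str
    chosen_model: str
    reason: str                 # why the router picked it ("budget tight", ...)
    est_cost_usd: float
    overridden_to: Optional[str] = None  # remembered, so trust can be measured

log: list[RoutingDecision] = []
log.append(RoutingDecision("fix hydration bug", "claude-haiku",
                           "budget below 50%", 0.004))
log[-1].overridden_to = "claude-opus"   # user didn't trust the downshift

override_rate = sum(d.overridden_to is not None for d in log) / len(log)
```

A rising override rate on a given skill class is the signal that the routing there is technically cheap but not yet trusted.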
Use it
Public, MIT, no telemetry, no signup: github.com/animeshbasak/SuperAgent.
If you ship something with it, ping me — I'm looking for case studies for v2.3.
Why open source? Because the AI tooling space is moving so fast that closed-source agent infrastructure decays in 6 weeks. The only durable advantage is being the place developers come to when they need the routing decision documented. That's a community problem, not a moat problem.