Cut Your AI Agent Costs by 97% with Multi-Model Routing
One user dropped from $90/month to $6 by routing tasks to the right model. Here's the exact setup.
What You'll Build
A model routing strategy that uses cheap models for simple tasks and expensive models only when needed. Your AI agent stays just as capable but costs 90-97% less.
Why It Works
Most people set their AI agent to use Claude Opus (the most powerful, most expensive model) for everything. Every heartbeat check, every simple question, every status ping costs premium tokens. That's like hiring a brain surgeon to take your temperature.
Results: Monthly bill dropped from $90 to $6. Same agent. Same capabilities. 97% cost reduction.
Prerequisites
- OpenClaw installed and running
- Access to your OpenClaw config
- That's it. This is a config change, not a build.
The Problem
Here's what's burning your money:
| Task | Frequency | Tokens Per Call | Model Used | Monthly Cost |
|---|---|---|---|---|
| Heartbeat check | Every 30 min | 5,000-15,000 | Opus | $30-45 |
| Simple replies | 20/day | 2,000-5,000 | Opus | $15-25 |
| Complex tasks | 2-3/day | 10,000-30,000 | Opus | $10-15 |
| Background jobs | 5-10/day | 5,000-15,000 | Opus | $10-20 |
| Total | $65-105 |
The heartbeat alone (a simple "anything new?" check) eats 30-50% of your budget.
The Solution: Tiered Model Routing
โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ
โ Task Incoming โ
โ โ
โ Simple? โโโโโโโถ Haiku ($0.25/M tokens) โ
โ (heartbeats, - Heartbeat checks โ
โ status, - Yes/no questions โ
โ basic) - Simple lookups โ
โ โ
โ Medium? โโโโโโโถ Sonnet ($3/M tokens) โ
โ (writing, - Email drafts โ
โ analysis, - Content creation โ
โ planning) - Research summaries โ
โ โ
โ Complex? โโโโโโถ Opus ($15/M tokens) โ
โ (strategy, - Complex analysis โ
โ coding, - Multi-step reasoning โ
โ critical) - Code generation โ
โ โ
โ Local? โโโโโโโโถ Ollama (FREE) โ
โ (private, - Personal data processing โ
โ offline, - Simple classification โ
โ simple) - Drafts nobody sees โ
โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ
Step-by-Step Setup
Step 1: Set Your Default Model to Sonnet
In your OpenClaw config, change the default model from Opus to Sonnet:
{
"model": "anthropic/claude-sonnet-4-20250514"
}
This single change cuts your baseline costs by ~80%. Sonnet handles 90% of daily tasks perfectly well.
Step 2: Configure Heartbeats to Use Haiku
Heartbeats are the biggest cost driver. They run every 30 minutes and usually just check if anything needs attention. Haiku is perfect for this.
In your OpenClaw config, set the heartbeat model:
{
"heartbeat": {
"model": "anthropic/claude-haiku"
}
}
Cost impact: Heartbeat cost drops from $30-45/month to under $1.
Step 3: Use Opus Only for Complex Tasks
Reserve Opus for tasks that genuinely need it:
- Overnight coding jobs: Building features, fixing complex bugs
- Strategic analysis: Business decisions, competitive research
- Long-form writing: Blog posts, reports, proposals
You can override the model per-session or per-task:
/model opus
Or set specific cron jobs to use Opus:
{
"payload": {
"kind": "agentTurn",
"message": "Complex overnight task...",
"model": "anthropic/claude-opus-4-6"
}
}
Step 4: Add Local Ollama (Optional, Saves More)
For tasks that don't need cloud AI at all, run a local model:
# Install Ollama
curl -fsSL https://ollama.ai/install.sh | sh
# Pull a small, fast model
ollama pull llama3.2:3b
Use cases for local models:
- Classifying emails (spam/important/can-wait)
- Extracting structured data from text
- Simple yes/no decisions
- Drafts that will be rewritten anyway
Cost: $0. Runs on your hardware.
Step 5: Monitor and Adjust
After a week, check your usage:
/status
Look at:
- Which tasks are using which models
- Where Sonnet struggled (upgrade those to Opus)
- Where Opus was overkill (downgrade those to Sonnet or Haiku)
The Math
| Task | Before (All Opus) | After (Routed) |
|---|---|---|
| Heartbeats (48/day) | $35/month | $0.50/month |
| Simple tasks (20/day) | $20/month | $2/month |
| Medium tasks (10/day) | $20/month | $3/month |
| Complex tasks (2/day) | $15/month | $3/month |
| Total | $90/month | $8.50/month |
That's a $81.50/month savings. $978/year. For changing a config file.
Customization Ideas
- Cost-optimized: Use Haiku as default, Sonnet for writing, Opus only for strategy sessions
- Quality-optimized: Use Sonnet as default, Opus for anything customer-facing
- Privacy-optimized: Use Ollama for all personal data, cloud models for business tasks only
- Hybrid: Run a local model for classification/routing, then send to the right cloud model
Gotchas & Tips
- Don't start with Haiku for everything. Start with Sonnet as default and Haiku for heartbeats only. Expand after you see what Haiku handles well.
- Haiku occasionally misses nuance. If your heartbeat checks need to understand complex context, Sonnet might be worth the extra cost.
- Model names change. Check the current model names in your provider's docs.
- Cache helps too. OpenClaw caches prompt context. Longer sessions with fewer restarts = lower costs.
- The 80/20 rule applies. 80% of your tasks need 20% of the compute. Route accordingly.