Usage & Costs
Monitor and manage your AI usage and costs.
Usage Dashboard
Access your usage data at Settings > Usage.
Metrics Tracked
| Metric | Description |
|---|---|
| Input Tokens | Tokens sent to models |
| Output Tokens | Tokens received from models |
| Total Cost | Estimated cost in USD |
| Requests | Number of API calls |
| Latency | Average response time |
Cost Calculation
Formula
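Cost = (Input Tokens × Input Price) + (Output Tokens × Output Price)
Prices are per token. Providers typically quote them per 1M tokens, so divide the quoted price by 1,000,000.
Example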
Using GPT-4o:
- Input: 1,000 tokens × $2.50/1M = $0.0025
- Output: 500 tokens × $10.00/1M = $0.005
- Total: $0.0075
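The calculation above can be sketched as a small helper. This is a minimal illustration, assuming prices are quoted in USD per 1M tokens; the function name `estimateCost` and its parameter names are hypothetical and not part of any product API.

```javascript
// Hypothetical helper: estimates the cost of one request in USD.
// Assumes prices are quoted per 1M tokens, as in the example above.
function estimateCost({ inputTokens, outputTokens, inputPricePerMillion, outputPricePerMillion }) {
  const inputCost = (inputTokens * inputPricePerMillion) / 1_000_000;
  const outputCost = (outputTokens * outputPricePerMillion) / 1_000_000;
  return inputCost + outputCost;
}

// The GPT-4o example from the text: 1,000 input tokens and 500 output tokens.
const cost = estimateCost({
  inputTokens: 1000,
  outputTokens: 500,
  inputPricePerMillion: 2.5,
  outputPricePerMillion: 10.0,
});
console.log(cost.toFixed(4)); // prints 0.0075
```

Rounding is applied only when displaying the value, so summing many small per-request costs does not accumulate rounding error.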
Usage Breakdown
By Agent
See which agents consume the most resources:
Agent Usage (Last 30 Days)
├── Research Assistant: 2.5M tokens ($8.25)
├── Code Helper: 1.8M tokens ($6.40)
├── Writer: 1.2M tokens ($4.80)
└── Coordinator: 0.5M tokens ($1.75)
By Swarm
Track costs per project:
Swarm Usage (Last 30 Days)
├── Product Dev: 3.2M tokens ($11.50)
├── Customer Support: 1.5M tokens ($3.25)
└── Research Project: 1.3M tokens ($7.45)
By Model
Compare model costs:
Model Usage (Last 30 Days)
├── gpt-4o: 2.1M tokens ($15.75)
├── claude-sonnet: 1.8M tokens ($8.10)
├── gpt-4o-mini: 2.0M tokens ($0.90)
└── gemini-flash: 0.6M tokens ($0.05)
Cost Optimization
1. Right-Size Your Models
Use cheaper models for simple tasks:
| Task | Model | Savings |
|---|---|---|
| Simple Q&A | GPT-4o Mini | 94% vs GPT-4o |
| Quick summaries | Claude Haiku | 92% vs Claude Sonnet |
| High volume | Gemini Flash | 97% vs Gemini Pro |
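A savings figure like those in the table can be derived from two per-token prices. The sketch below assumes illustrative input prices of $0.15 per 1M tokens for GPT-4o Mini and $2.50 per 1M tokens for GPT-4o. The first figure does not appear on this page and provider prices change, so check current pricing before relying on it. `savingsPercent` is a hypothetical helper name.

```javascript
// Hypothetical helper: percentage saved by using a cheaper model,
// rounded to the nearest whole percent. Both prices must use the same unit.
function savingsPercent(cheaperPrice, basePrice) {
  if (basePrice <= 0) {
    throw new RangeError("basePrice must be positive");
  }
  return Math.round((1 - cheaperPrice / basePrice) * 100);
}

// Assumed prices in USD per 1M input tokens (illustrative, not from this page).
console.log(savingsPercent(0.15, 2.5)); // prints 94
```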
2. Optimize Prompts
Shorter prompts use fewer input tokens, which lowers cost:
Before (150 tokens):
"I would like you to please help me by summarizing
the following text. Please make sure to include all
the key points and main ideas..."
After (30 tokens):
"Summarize the key points from this text:"
3. Cache Responses
For repeated queries, implement caching:
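// In-memory cache keyed by the exact prompt text.
const cache = new Map();

async function getResponse(prompt) {
  // Exact-match hit: return the stored response without calling the model.
  if (cache.has(prompt)) {
    return cache.get(prompt);
  }
  // Cache miss: call the model (callModel stands in for your own API wrapper)
  // and store the result for later requests.
  const response = await callModel(prompt);
  cache.set(prompt, response);
  return response;
}
4. Set Token Limits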
Prevent unexpectedly long responses:
Agent Settings:
  max_tokens: 2048 # Cap response length
Budget Alerts
Set up alerts to monitor spending:
- Go to Settings > Usage
- Click Set Budget Alert
- Configure thresholds:
Alerts:
  - threshold: $50
    notify: email
  - threshold: $100
    notify: email + dashboard
  - threshold: $200
    action: pause_agents
Usage Reports
Export Data
Download usage data for analysis:
- Go to Settings > Usage
- Select date range
- Click Export CSV
Report Contents
date,agent_id,swarm_id,model,input_tokens,output_tokens,cost,latency_ms
2024-01-15,agent_123,swarm_456,gpt-4o,1500,800,0.0125,1234
2024-01-15,agent_789,swarm_456,claude-3,2000,1200,0.0186,987
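Once exported, the CSV can be grouped by any column to build per-model or per-swarm cost totals. A minimal sketch, assuming the column layout above and plain comma-separated values with no quoted fields; `parseUsageCsv` and `totalCostBy` are hypothetical helper names, not part of the product.

```javascript
// Parses the exported CSV text into row objects keyed by header name.
// Assumes no quoted fields and no commas inside values.
function parseUsageCsv(text) {
  const [headerLine, ...lines] = text.trim().split(/\r?\n/);
  const headers = headerLine.split(",");
  return lines
    .filter((line) => line.trim() !== "")
    .map((line) => {
      const values = line.split(",");
      return Object.fromEntries(headers.map((h, i) => [h, values[i]]));
    });
}

// Sums the cost column, grouped by the given column (for example "model").
function totalCostBy(rows, column) {
  const totals = new Map();
  for (const row of rows) {
    const key = row[column];
    totals.set(key, (totals.get(key) ?? 0) + Number(row.cost));
  }
  return totals;
}

// The sample rows shown above.
const csv = `date,agent_id,swarm_id,model,input_tokens,output_tokens,cost,latency_ms
2024-01-15,agent_123,swarm_456,gpt-4o,1500,800,0.0125,1234
2024-01-15,agent_789,swarm_456,claude-3,2000,1200,0.0186,987`;

const rows = parseUsageCsv(csv);
const byModel = totalCostBy(rows, "model");
const bySwarm = totalCostBy(rows, "swarm_id");
console.log(byModel.get("gpt-4o")); // prints 0.0125
console.log(bySwarm.get("swarm_456").toFixed(4)); // prints 0.0311
```

For exports that may contain quoted fields, a dedicated CSV parser would be a safer choice than splitting on commas.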