# Model Configuration
Configure AI models for optimal performance.
## Setting Up Models
### Per-Agent Configuration
Each agent can use a different model:
```yaml
Agent: Research Assistant
  Framework: openai
  Model: gpt-4o
  Settings:
    temperature: 0.3
    max_tokens: 4096

Agent: Creative Writer
  Framework: anthropic
  Model: claude-sonnet-4-20250514
  Settings:
    temperature: 0.9
    max_tokens: 8192
```

### Default Model
Set a default model in Settings > Integrations:
1. Go to Settings
2. Select Integrations
3. Choose your primary provider
4. Select the default model
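Together, per-agent settings and a default model can be modeled as a simple lookup. A minimal sketch; these identifiers are illustrative, not part of any real API:

```python
# Illustrative per-agent model registry with a default fallback
# (names and structure are hypothetical).
AGENT_MODELS = {
    "Research Assistant": {
        "framework": "openai", "model": "gpt-4o",
        "temperature": 0.3, "max_tokens": 4096,
    },
    "Creative Writer": {
        "framework": "anthropic", "model": "claude-sonnet-4-20250514",
        "temperature": 0.9, "max_tokens": 8192,
    },
}

# Default used when an agent has no explicit configuration.
DEFAULT_MODEL = {
    "framework": "openai", "model": "gpt-4o",
    "temperature": 0.7, "max_tokens": 4096,
}

def settings_for(agent: str) -> dict:
    """Return the agent's model settings, or the default if unconfigured."""
    return AGENT_MODELS.get(agent, DEFAULT_MODEL)
```

For example, `settings_for("Creative Writer")` returns the Anthropic settings, while any unknown agent falls back to the default.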
## Configuration Options
### Temperature
Controls randomness in responses:
```text
Low (0.0-0.3):
├── More focused, deterministic
├── Good for: factual queries, code
└── Example: math problems, data extraction

Medium (0.4-0.7):
├── Balanced creativity
├── Good for: general chat, explanations
└── Example: Q&A, summaries

High (0.8-1.0):
├── More creative, varied
├── Good for: brainstorming, writing
└── Example: story writing, ideation
```

### Max Tokens
Limit response length:
| Tokens | Approximate Words | Use Case |
|---|---|---|
| 256 | ~190 | Very short responses |
| 512 | ~380 | Short answers |
| 1024 | ~770 | Standard responses |
| 2048 | ~1,500 | Detailed responses |
| 4096 | ~3,000 | Long-form content |
| 8192+ | ~6,000+ | Very long documents |
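The word counts above follow the common rule of thumb of roughly 0.75 English words per token. A small helper to estimate (the exact ratio varies by tokenizer and text):

```python
def approx_words(tokens: int) -> int:
    """Rough English word estimate: ~0.75 words per token.

    Heuristic only; the true ratio depends on the tokenizer and the text.
    """
    return int(tokens * 0.75)
```

For instance, `approx_words(1024)` yields 768, in line with the ~770 in the table.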
### Top P (Nucleus Sampling)
Alternative to temperature:
```text
top_p: 1.0 → Consider all tokens
top_p: 0.9 → Consider tokens within the top 90% of probability mass
top_p: 0.5 → Consider tokens within the top 50% of probability mass
```

Note: Use either temperature or top_p, not both.
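That guideline can be enforced when building a request. A minimal sketch (the function name is illustrative):

```python
def sampling_params(temperature=None, top_p=None) -> dict:
    """Build sampling settings, enforcing the temperature-OR-top_p guideline."""
    if temperature is not None and top_p is not None:
        raise ValueError("Set either temperature or top_p, not both.")
    params = {}
    if temperature is not None:
        params["temperature"] = temperature
    if top_p is not None:
        params["top_p"] = top_p
    return params
```

Passing both raises immediately, which surfaces the misconfiguration before any request is sent.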
## Provider-Specific Settings
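The providers below accept similar settings but differ in parameter naming: OpenAI and Anthropic use snake_case, while Google's Gemini API uses camelCase. A hedged sketch of translating one common settings dict into provider-specific keys (the helper and map names are hypothetical):

```python
# Hypothetical helper: rename common settings keys for the target provider.
GOOGLE_KEY_MAP = {
    "max_tokens": "maxOutputTokens",
    "top_p": "topP",
    "top_k": "topK",
}

def to_provider_params(settings: dict, provider: str) -> dict:
    """Map snake_case settings onto the provider's expected key names."""
    if provider == "google":
        return {GOOGLE_KEY_MAP.get(k, k): v for k, v in settings.items()}
    # openai and anthropic already use snake_case keys
    return dict(settings)
```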
### OpenAI
```json
{
  "model": "gpt-4o",
  "temperature": 0.7,
  "max_tokens": 4096,
  "top_p": 1,
  "frequency_penalty": 0,
  "presence_penalty": 0
}
```

### Anthropic
```json
{
  "model": "claude-sonnet-4-20250514",
  "temperature": 0.7,
  "max_tokens": 4096,
  "top_p": 0.9,
  "top_k": 40
}
```

### Google AI
```json
{
  "model": "gemini-1.5-pro",
  "temperature": 0.7,
  "maxOutputTokens": 4096,
  "topP": 0.9,
  "topK": 40
}
```

## Optimization Tips
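One way to keep the presets below switchable in code is a simple dict keyed by goal. A sketch with illustrative identifiers, mirroring the recommendations in this section:

```python
# Illustrative optimization presets keyed by goal.
PRESETS = {
    "speed": {"model": "gpt-4o-mini", "max_tokens": 1024, "temperature": 0.3},
    "quality": {"model": "gpt-4o", "max_tokens": 4096, "temperature": 0.5},
    "cost": {"model": "gpt-3.5-turbo", "max_tokens": 512},
    "creativity": {"model": "claude-sonnet-4-20250514",
                   "max_tokens": 8192, "temperature": 0.9},
}

def preset(goal: str) -> dict:
    """Return a copy of the preset for a goal; raises KeyError if unknown."""
    return dict(PRESETS[goal])
```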
### For Speed
```text
Model: gpt-4o-mini or claude-3-haiku
max_tokens: 1024 (limit response length)
temperature: 0.3 (more focused)
```

### For Quality
```text
Model: gpt-4o or claude-sonnet-4-20250514
max_tokens: 4096+ (allow detailed responses)
temperature: 0.5 (balanced)
```

### For Cost
```text
Model: gpt-3.5-turbo or gemini-1.5-flash
max_tokens: 512 (shorter responses)
```

### For Creativity
```text
Model: claude-sonnet-4-20250514
temperature: 0.9
max_tokens: 8192
```

## Fallback Configuration
Set up fallback models for reliability:
```yaml
Primary:
  provider: openai
  model: gpt-4o
Fallback 1:
  provider: anthropic
  model: claude-sonnet-4-20250514
Fallback 2:
  provider: google
  model: gemini-1.5-pro
```

If the primary model fails, the system automatically tries the fallbacks in order.