# Model Configuration
Configure AI models for optimal performance.
## Setting Up Models
### Per-Agent Configuration
Each agent can use a different model:
```yaml
Agent: Research Assistant
  Framework: openai
  Model: gpt-4o
  Settings:
    temperature: 0.3
    max_tokens: 4096

Agent: Creative Writer
  Framework: anthropic
  Model: claude-sonnet-4-20250514
  Settings:
    temperature: 0.9
    max_tokens: 8192
```

### Default Model
Set a default model in Settings > Integrations:
1. Go to Settings
2. Select Integrations
3. Choose your primary provider
4. Select the default model
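Together, per-agent settings and a default model can be modeled as a simple lookup. A minimal sketch; these identifiers are illustrative, not part of any real API:

```python
# Illustrative per-agent model registry with a default fallback
# (names and structure are hypothetical).
AGENT_MODELS = {
    "Research Assistant": {
        "framework": "openai", "model": "gpt-4o",
        "temperature": 0.3, "max_tokens": 4096,
    },
    "Creative Writer": {
        "framework": "anthropic", "model": "claude-sonnet-4-20250514",
        "temperature": 0.9, "max_tokens": 8192,
    },
}

# Default used when an agent has no explicit configuration.
DEFAULT_MODEL = {
    "framework": "openai", "model": "gpt-4o",
    "temperature": 0.7, "max_tokens": 4096,
}

def settings_for(agent: str) -> dict:
    """Return the agent's model settings, or the default if unconfigured."""
    return AGENT_MODELS.get(agent, DEFAULT_MODEL)
```

For example, `settings_for("Creative Writer")` returns the Anthropic settings, while any unknown agent falls back to the default.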
## Configuration Options
### Temperature
Controls randomness in responses:
```text
Low (0.0-0.3):
├── More focused, deterministic
├── Good for: factual queries, code
└── Example: math problems, data extraction

Medium (0.4-0.7):
├── Balanced creativity
├── Good for: general chat, explanations
└── Example: Q&A, summaries

High (0.8-1.0):
├── More creative, varied
├── Good for: brainstorming, writing
└── Example: story writing, ideation
```

### Max Tokens
Limit response length:
| Tokens | Approximate Words | Use Case |
|---|---|---|
| 256 | ~190 | Very short responses |
| 512 | ~380 | Short answers |
| 1024 | ~770 | Standard responses |
| 2048 | ~1,500 | Detailed responses |
| 4096 | ~3,000 | Long-form content |
| 8192+ | ~6,000+ | Very long documents |
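The word counts above follow the common rule of thumb of roughly 0.75 English words per token. A small helper to estimate (the exact ratio varies by tokenizer and text):

```python
def approx_words(tokens: int) -> int:
    """Rough English word estimate: ~0.75 words per token.

    Heuristic only; the true ratio depends on the tokenizer and the text.
    """
    return int(tokens * 0.75)
```

For instance, `approx_words(1024)` yields 768, in line with the ~770 in the table.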
### Top P (Nucleus Sampling)
Alternative to temperature:
```text
top_p: 1.0 → Consider all tokens
top_p: 0.9 → Consider tokens within the top 90% of probability mass
top_p: 0.5 → Consider tokens within the top 50% of probability mass
```

Note: Use either temperature or top_p, not both.
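That guideline can be enforced when building a request. A minimal sketch (the function name is illustrative):

```python
def sampling_params(temperature=None, top_p=None) -> dict:
    """Build sampling settings, enforcing the temperature-OR-top_p guideline."""
    if temperature is not None and top_p is not None:
        raise ValueError("Set either temperature or top_p, not both.")
    params = {}
    if temperature is not None:
        params["temperature"] = temperature
    if top_p is not None:
        params["top_p"] = top_p
    return params
```

Passing both raises immediately, which surfaces the misconfiguration before any request is sent.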
## Provider-Specific Settings
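The providers below accept similar settings but differ in parameter naming: OpenAI and Anthropic use snake_case, while Google's Gemini API uses camelCase. A hedged sketch of translating one common settings dict into provider-specific keys (the helper and map names are hypothetical):

```python
# Hypothetical helper: rename common settings keys for the target provider.
GOOGLE_KEY_MAP = {
    "max_tokens": "maxOutputTokens",
    "top_p": "topP",
    "top_k": "topK",
}

def to_provider_params(settings: dict, provider: str) -> dict:
    """Map snake_case settings onto the provider's expected key names."""
    if provider == "google":
        return {GOOGLE_KEY_MAP.get(k, k): v for k, v in settings.items()}
    # openai and anthropic already use snake_case keys
    return dict(settings)
```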
### OpenAI
```json
{
  "model": "gpt-4o",
  "temperature": 0.7,
  "max_tokens": 4096,
  "top_p": 1,
  "frequency_penalty": 0,
  "presence_penalty": 0
}
```

### Anthropic
```json
{
  "model": "claude-sonnet-4-20250514",
  "temperature": 0.7,
  "max_tokens": 4096,
  "top_p": 0.9,
  "top_k": 40
}
```

### Google AI
```json
{
  "model": "gemini-1.5-pro",
  "temperature": 0.7,
  "maxOutputTokens": 4096,
  "topP": 0.9,
  "topK": 40
}
```

## Optimization Tips
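One way to keep the presets below switchable in code is a simple dict keyed by goal. A sketch with illustrative identifiers, mirroring the recommendations in this section:

```python
# Illustrative optimization presets keyed by goal.
PRESETS = {
    "speed": {"model": "gpt-4o-mini", "max_tokens": 1024, "temperature": 0.3},
    "quality": {"model": "gpt-4o", "max_tokens": 4096, "temperature": 0.5},
    "cost": {"model": "gpt-3.5-turbo", "max_tokens": 512},
    "creativity": {"model": "claude-sonnet-4-20250514",
                   "max_tokens": 8192, "temperature": 0.9},
}

def preset(goal: str) -> dict:
    """Return a copy of the preset for a goal; raises KeyError if unknown."""
    return dict(PRESETS[goal])
```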
### For Speed
```text
Model: gpt-4o-mini or claude-3-haiku
max_tokens: 1024 (limit response length)
temperature: 0.3 (more focused)
```

### For Quality
```text
Model: gpt-4o or claude-sonnet-4-20250514
max_tokens: 4096+ (allow detailed responses)
temperature: 0.5 (balanced)
```

### For Cost
```text
Model: gpt-3.5-turbo or gemini-1.5-flash
max_tokens: 512 (shorter responses)
```

### For Creativity
```text
Model: claude-sonnet-4-20250514
temperature: 0.9
max_tokens: 8192
```

## Fallback Configuration
Set up fallback models for reliability:
```yaml
Primary:
  provider: openai
  model: gpt-4o
Fallback 1:
  provider: anthropic
  model: claude-sonnet-4-20250514
Fallback 2:
  provider: google
  model: gemini-1.5-pro
```

If the primary model fails, the system automatically tries the fallbacks in order.