Supported Models
HIVE Protocol integrates with the leading AI model providers, giving you access to the most capable language models available. This guide covers all supported models, their capabilities, pricing, and recommendations for different use cases.
Provider Overview
┌─────────────────────────────────────────────────────────────────┐
│                     Supported AI Providers                      │
├─────────────────────────────────────────────────────────────────┤
│                                                                 │
│   ┌─────────────┐   ┌─────────────┐   ┌─────────────┐           │
│   │   OpenAI    │   │  Anthropic  │   │  Google AI  │           │
│   │   ───────   │   │  ─────────  │   │  ─────────  │           │
│   │ GPT-4o      │   │ Claude 4    │   │ Gemini 1.5  │           │
│   │ GPT-4o Mini │   │ Claude 3.5  │   │ Pro/Flash   │           │
│   │ GPT-4 Turbo │   │ Claude 3    │   │             │           │
│   │ GPT-3.5     │   │ Opus/Haiku  │   │             │           │
│   └─────────────┘   └─────────────┘   └─────────────┘           │
│                                                                 │
│   ┌─────────────┐   ┌─────────────┐                             │
│   │   Ollama    │   │   Custom    │                             │
│   │   ───────   │   │   ───────   │                             │
│   │ Llama 3     │   │ Any OpenAI  │                             │
│   │ Mistral     │   │ compatible  │                             │
│   │ Qwen        │   │ endpoint    │                             │
│   └─────────────┘   └─────────────┘                             │
│                                                                 │
└─────────────────────────────────────────────────────────────────┘

OpenAI Models
OpenAI provides the GPT family of models, known for their versatility and strong performance across diverse tasks.
GPT-4o (Recommended)
The flagship OpenAI model with multimodal capabilities including vision and audio.
| Property | Value |
|---|---|
| Model ID | gpt-4o |
| Context Window | 128,000 tokens |
| Max Output | 16,384 tokens |
| Input Cost | $2.50 / 1M tokens |
| Output Cost | $10.00 / 1M tokens |
| Knowledge Cutoff | October 2023 |
| Multimodal | Yes (text, vision, audio) |
Strengths:
- Excellent reasoning and analysis
- Strong code generation
- Vision capabilities for image analysis
- Fast response times for its capability level
Best For:
- Complex multi-step reasoning
- Code generation and review
- Document analysis with images
- Production applications requiring high quality
Example Use Cases:
Agent: Technical Architect
Model: gpt-4o
Tasks:
- System design discussions
- Code review with context
- Technical documentation
- Architecture diagram analysis

GPT-4o Mini
A smaller, faster, and more affordable version of GPT-4o.
| Property | Value |
|---|---|
| Model ID | gpt-4o-mini |
| Context Window | 128,000 tokens |
| Max Output | 16,384 tokens |
| Input Cost | $0.15 / 1M tokens |
| Output Cost | $0.60 / 1M tokens |
| Knowledge Cutoff | October 2023 |
| Multimodal | Yes (text, vision) |
Strengths:
- ~94% cost reduction vs GPT-4o
- Very fast responses
- Still maintains strong reasoning
- Large context window
Best For:
- High-volume tasks
- Quick responses where top-tier quality isn't critical
- Cost-sensitive applications
- Preprocessing and filtering
Example Use Cases:
Agent: Quick Responder
Model: gpt-4o-mini
Tasks:
- Initial customer queries
- Data extraction
- Simple transformations
- Classification tasks

GPT-4 Turbo
High-performance model with vision capabilities.
| Property | Value |
|---|---|
| Model ID | gpt-4-turbo |
| Context Window | 128,000 tokens |
| Max Output | 4,096 tokens |
| Input Cost | $10.00 / 1M tokens |
| Output Cost | $30.00 / 1M tokens |
| Knowledge Cutoff | December 2023 |
| Multimodal | Yes (text, vision) |
Note: GPT-4o is generally recommended over GPT-4 Turbo for most use cases due to better price-performance ratio.
GPT-3.5 Turbo
Fast and economical for simpler tasks.
| Property | Value |
|---|---|
| Model ID | gpt-3.5-turbo |
| Context Window | 16,385 tokens |
| Max Output | 4,096 tokens |
| Input Cost | $0.50 / 1M tokens |
| Output Cost | $1.50 / 1M tokens |
| Knowledge Cutoff | September 2021 |
| Multimodal | No |
Best For:
- Simple queries and chat
- Legacy applications
- Very high volume, cost-sensitive tasks
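All of the OpenAI model IDs above are used with the same chat-completions request shape, which is also what the "Custom" provider box in the overview relies on: any OpenAI-compatible endpoint accepts the same payload. A minimal sketch (the helper function is illustrative, not part of HIVE):

```python
def build_chat_request(model: str, prompt: str, max_tokens: int = 512) -> dict:
    """Build a minimal OpenAI-style /chat/completions payload.

    The same dict works against api.openai.com or any OpenAI-compatible
    endpoint (e.g. a local Ollama server); only the model ID changes.
    """
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    }

# Swapping models is a one-field change:
fast = build_chat_request("gpt-4o-mini", "Classify this support ticket: ...")
best = build_chat_request("gpt-4o", "Review this system architecture: ...")
```

Because the payload is identical across tiers, an agent can be re-pointed at a cheaper or stronger model without any other code changes.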
Anthropic Models
Anthropic's Claude models are known for their strong reasoning, safety features, and excellent writing quality.
Claude Sonnet 4 (claude-sonnet-4-20250514)
The latest and most capable Sonnet model with exceptional reasoning.
| Property | Value |
|---|---|
| Model ID | claude-sonnet-4-20250514 |
| Context Window | 200,000 tokens |
| Max Output | 64,000 tokens |
| Input Cost | $3.00 / 1M tokens |
| Output Cost | $15.00 / 1M tokens |
| Knowledge Cutoff | March 2025 |
| Multimodal | Yes (text, vision) |
Strengths:
- Exceptional reasoning and analysis
- Excellent writing quality
- Strong at following complex instructions
- Best-in-class for nuanced tasks
Best For:
- Complex analysis and research
- High-quality content creation
- Tasks requiring careful reasoning
- Multi-step problem solving
Example Use Cases:
Agent: Research Analyst
Model: claude-sonnet-4-20250514
Tasks:
- Market research synthesis
- Competitive analysis
- Strategic recommendations
- Report writing

Claude 3.5 Sonnet
Balanced performance with strong coding capabilities.
| Property | Value |
|---|---|
| Model ID | claude-3-5-sonnet-20241022 |
| Context Window | 200,000 tokens |
| Max Output | 8,192 tokens |
| Input Cost | $3.00 / 1M tokens |
| Output Cost | $15.00 / 1M tokens |
| Knowledge Cutoff | April 2024 |
| Multimodal | Yes (text, vision) |
Strengths:
- Excellent at coding tasks
- Strong general-purpose capabilities
- Good balance of speed and quality
- Reliable instruction following
Best For:
- Code generation and review
- General-purpose tasks
- Technical documentation
- Data analysis
Claude 3 Opus
Highest capability model in the Claude 3 family.
| Property | Value |
|---|---|
| Model ID | claude-3-opus-20240229 |
| Context Window | 200,000 tokens |
| Max Output | 4,096 tokens |
| Input Cost | $15.00 / 1M tokens |
| Output Cost | $75.00 / 1M tokens |
| Knowledge Cutoff | August 2023 |
| Multimodal | Yes (text, vision) |
Strengths:
- Highest quality outputs
- Exceptional at complex reasoning
- Best for creative tasks
- Strong at nuanced understanding
Best For:
- Tasks requiring absolute best quality
- Complex creative projects
- High-stakes content
- Research and analysis
Note: Consider Claude Sonnet 4 as a more cost-effective alternative with comparable quality.
Claude 3 Haiku
Fast and efficient for simple tasks.
| Property | Value |
|---|---|
| Model ID | claude-3-haiku-20240307 |
| Context Window | 200,000 tokens |
| Max Output | 4,096 tokens |
| Input Cost | $0.25 / 1M tokens |
| Output Cost | $1.25 / 1M tokens |
| Knowledge Cutoff | August 2023 |
| Multimodal | Yes (text, vision) |
Strengths:
- Very fast responses
- Low cost
- Large context window
- Still capable for many tasks
Best For:
- Quick responses
- High-volume processing
- Classification tasks
- Simple queries
Example Use Cases:
Agent: Triage Bot
Model: claude-3-haiku-20240307
Tasks:
- Initial message classification
- Quick fact lookup
- Simple data extraction
- Routing decisions

Google AI Models
Google's Gemini models offer massive context windows, making them ideal for processing long documents.
Gemini 1.5 Pro
Advanced model with the largest context window available.
| Property | Value |
|---|---|
| Model ID | gemini-1.5-pro |
| Context Window | 1,000,000 tokens |
| Max Output | 8,192 tokens |
| Input Cost | $1.25 / 1M tokens |
| Output Cost | $5.00 / 1M tokens |
| Multimodal | Yes (text, vision, audio, video) |
Strengths:
- Massive 1M token context
- Can process entire codebases
- Video and audio analysis
- Strong long-form reasoning
Best For:
- Processing entire books or codebases
- Long document analysis
- Video/audio content analysis
- Tasks requiring extensive context
Example Use Cases:
Agent: Document Analyzer
Model: gemini-1.5-pro
Tasks:
- Analyze entire repository
- Process 500-page documents
- Video transcript analysis
- Cross-document synthesis

Gemini 1.5 Flash
Fast and efficient Gemini model.
| Property | Value |
|---|---|
| Model ID | gemini-1.5-flash |
| Context Window | 1,000,000 tokens |
| Max Output | 8,192 tokens |
| Input Cost | $0.075 / 1M tokens |
| Output Cost | $0.30 / 1M tokens |
| Multimodal | Yes (text, vision, audio, video) |
Strengths:
- Extremely cost-effective
- Large context window
- Fast responses
- Multimodal capabilities
Best For:
- Cost-sensitive high-volume tasks
- Long document processing on a budget
- Quick multimodal analysis
- Background processing tasks
Comprehensive Cost Comparison
Cost Per 1M Tokens (USD)
| Model | Input | Output | Blended* |
|---|---|---|---|
| Gemini 1.5 Flash | $0.075 | $0.30 | $0.19 |
| GPT-4o Mini | $0.15 | $0.60 | $0.38 |
| Claude 3 Haiku | $0.25 | $1.25 | $0.75 |
| GPT-3.5 Turbo | $0.50 | $1.50 | $1.00 |
| Gemini 1.5 Pro | $1.25 | $5.00 | $3.13 |
| GPT-4o | $2.50 | $10.00 | $6.25 |
| Claude Sonnet 4 | $3.00 | $15.00 | $9.00 |
| Claude 3.5 Sonnet | $3.00 | $15.00 | $9.00 |
| GPT-4 Turbo | $10.00 | $30.00 | $20.00 |
| Claude 3 Opus | $15.00 | $75.00 | $45.00 |
*Blended assumes 50/50 input/output ratio
Monthly Cost Estimates
Based on typical usage patterns:
| Usage Level | Gemini Flash | GPT-4o Mini | Claude Haiku | GPT-4o | Claude Sonnet |
|---|---|---|---|---|---|
| Light (1M tokens) | $0.19 | $0.38 | $0.75 | $6.25 | $9.00 |
| Medium (10M tokens) | $1.88 | $3.75 | $7.50 | $62.50 | $90.00 |
| Heavy (100M tokens) | $18.75 | $37.50 | $75.00 | $625 | $900 |
| Enterprise (1B tokens) | $188 | $375 | $750 | $6,250 | $9,000 |
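The blended and monthly figures above follow directly from the per-token rates. A small helper makes the arithmetic explicit (rates are hard-coded from the tables; the 50/50 input/output split is the same assumption as the footnote):

```python
def blended_rate(input_per_m: float, output_per_m: float,
                 input_share: float = 0.5) -> float:
    """Blended $ per 1M tokens, given an input/output token split."""
    return input_per_m * input_share + output_per_m * (1 - input_share)

def monthly_cost(total_tokens_m: float, input_per_m: float,
                 output_per_m: float) -> float:
    """Estimated monthly spend for a volume given in millions of tokens."""
    return total_tokens_m * blended_rate(input_per_m, output_per_m)

# GPT-4o at the table's 50/50 split:
blended_rate(2.50, 10.00)        # 6.25
# Heavy usage (100M tokens/month) on Gemini 1.5 Flash:
monthly_cost(100, 0.075, 0.30)   # 18.75
```

If your workload is input-heavy (e.g. long documents in, short answers out), pass a higher `input_share`; the blended cost drops well below the 50/50 figures.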
Model Comparison Matrix
By Capability
| Capability | Best Models | Notes |
|---|---|---|
| Complex Reasoning | Claude Sonnet 4, GPT-4o | Similar performance |
| Code Generation | Claude 3.5 Sonnet, GPT-4o | Claude slightly better for complex code |
| Creative Writing | Claude Sonnet 4, GPT-4o | Claude often preferred |
| Long Context | Gemini 1.5 Pro | 5–8× the context of the Claude/GPT models |
| Speed | Gemini Flash, Claude Haiku | Sub-second responses |
| Cost Efficiency | Gemini Flash, GPT-4o Mini | Best value per token |
| Vision | GPT-4o, Claude Sonnet 4 | Both excellent |
| Instruction Following | Claude Sonnet 4, GPT-4o | Claude slightly more precise |
By Use Case
| Use Case | Primary Choice | Budget Alternative |
|---|---|---|
| Customer Support | GPT-4o Mini | Claude Haiku |
| Code Review | Claude 3.5 Sonnet | GPT-4o |
| Content Creation | Claude Sonnet 4 | GPT-4o |
| Data Analysis | GPT-4o | Gemini 1.5 Pro |
| Document Processing | Gemini 1.5 Pro | Gemini 1.5 Flash |
| Research | Claude Sonnet 4 | Claude 3.5 Sonnet |
| Quick Q&A | Claude Haiku | GPT-4o Mini |
| Image Analysis | GPT-4o | Claude Sonnet 4 |
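The use-case table above can be encoded directly as a lookup, which is how you might wire a default model per agent type. A sketch (the dictionary simply transcribes the table; the function name is ours):

```python
# (primary choice, budget alternative), straight from the table above.
MODEL_BY_USE_CASE = {
    "customer support":    ("GPT-4o Mini", "Claude Haiku"),
    "code review":         ("Claude 3.5 Sonnet", "GPT-4o"),
    "content creation":    ("Claude Sonnet 4", "GPT-4o"),
    "data analysis":       ("GPT-4o", "Gemini 1.5 Pro"),
    "document processing": ("Gemini 1.5 Pro", "Gemini 1.5 Flash"),
    "research":            ("Claude Sonnet 4", "Claude 3.5 Sonnet"),
    "quick q&a":           ("Claude Haiku", "GPT-4o Mini"),
    "image analysis":      ("GPT-4o", "Claude Sonnet 4"),
}

def pick_model(use_case: str, budget: bool = False) -> str:
    """Return the table's primary choice, or the budget alternative."""
    primary, cheaper = MODEL_BY_USE_CASE[use_case.lower()]
    return cheaper if budget else primary
```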
Model Selection Guide
Decision Flowchart
START
  │
  ▼
┌───────────────────────┐
│ Processing long       │
│ documents (>100K      │──Yes──▶ Gemini 1.5 Pro
│ tokens)?              │
└───────────────────────┘
  │ No
  ▼
┌───────────────────────┐
│ Cost is primary       │
│ concern?              │──Yes──▶ Gemini Flash / GPT-4o Mini
└───────────────────────┘
  │ No
  ▼
┌───────────────────────┐
│ Need fastest          │
│ possible response?    │──Yes──▶ Claude Haiku / Gemini Flash
└───────────────────────┘
  │ No
  ▼
┌───────────────────────┐
│ Complex reasoning     │
│ or writing?           │──Yes──▶ Claude Sonnet 4 / GPT-4o
└───────────────────────┘
  │ No
  ▼
┌───────────────────────┐
│ Code generation       │
│ or review?            │──Yes──▶ Claude 3.5 Sonnet / GPT-4o
└───────────────────────┘
  │ No
  ▼
GPT-4o Mini
(Good general default)

Recommended Model by Agent Role
| Agent Role | Recommended Model | Reasoning |
|---|---|---|
| Research Analyst | Claude Sonnet 4 | Best for synthesis and analysis |
| Code Developer | Claude 3.5 Sonnet | Excellent code generation |
| Content Writer | Claude Sonnet 4 | Superior writing quality |
| Data Processor | GPT-4o Mini | Cost-effective for volume |
| Customer Support | GPT-4o Mini | Fast, accurate responses |
| Creative Director | Claude Sonnet 4 | Best creative output |
| Technical Lead | GPT-4o | Strong reasoning + vision |
| Document Analyst | Gemini 1.5 Pro | Handles long documents |
| Quick Responder | Claude Haiku | Fastest response times |
| Quality Reviewer | Claude Sonnet 4 | Catches nuanced issues |
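The decision flowchart above reduces to a short routing function: walk the questions top to bottom and return on the first match. A sketch (thresholds and model IDs come from the chart; only the primary choice from each branch is returned, the comments note the alternatives):

```python
def choose_model(context_tokens: int = 0,
                 cost_sensitive: bool = False,
                 latency_critical: bool = False,
                 complex_reasoning: bool = False,
                 coding: bool = False) -> str:
    """Walk the model-selection flowchart; first matching branch wins."""
    if context_tokens > 100_000:
        return "gemini-1.5-pro"              # long documents
    if cost_sensitive:
        return "gpt-4o-mini"                 # or gemini-1.5-flash
    if latency_critical:
        return "claude-3-haiku-20240307"     # or gemini-1.5-flash
    if complex_reasoning:
        return "claude-sonnet-4-20250514"    # or gpt-4o
    if coding:
        return "claude-3-5-sonnet-20241022"  # or gpt-4o
    return "gpt-4o-mini"                     # good general default
```

Note the ordering matters: a cost-sensitive coding task still routes to GPT-4o Mini, which matches the chart's top-down priority.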
Performance Benchmarks
Response Time (Average)
| Model | Time to First Token | Full Response (500 tokens) |
|---|---|---|
| Claude Haiku | ~150ms | ~1.2s |
| Gemini Flash | ~200ms | ~1.5s |
| GPT-4o Mini | ~250ms | ~2.0s |
| GPT-4o | ~300ms | ~2.5s |
| Claude Sonnet 4 | ~350ms | ~3.0s |
| Gemini 1.5 Pro | ~400ms | ~3.5s |
| Claude 3 Opus | ~500ms | ~5.0s |
Tokens Per Second (Output)
| Model | Tokens/Second |
|---|---|
| Claude Haiku | ~400 |
| Gemini Flash | ~350 |
| GPT-4o Mini | ~250 |
| GPT-4o | ~200 |
| Claude Sonnet 4 | ~170 |
| Gemini 1.5 Pro | ~150 |
| Claude 3 Opus | ~100 |
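Time-to-first-token and throughput combine into a rough end-to-end estimate: latency ≈ TTFT + output_tokens / tokens_per_second. A sketch using the approximate figures above (real numbers vary with load and prompt size, so expect the estimate and the measured table to differ slightly):

```python
def estimated_latency(ttft_s: float, tokens_per_s: float,
                      output_tokens: int) -> float:
    """Rough end-to-end response time in seconds."""
    return ttft_s + output_tokens / tokens_per_s

# Claude Haiku, 500-token response, from the table's rough figures:
# 0.15 s TTFT + 500 / 400 tok/s streaming = ~1.4 s
estimated_latency(0.15, 400, 500)
```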
Best Practices
1. Use the Right Model for the Job
Don't use expensive models for simple tasks:
Swarm: Content Pipeline
Agents:
  - name: Classifier
    model: claude-3-haiku-20240307    # Simple classification
  - name: Writer
    model: claude-sonnet-4-20250514   # Quality matters here
  - name: Formatter
    model: gpt-4o-mini                # Simple formatting

2. Consider Context Requirements
Match context window to your needs:
Task: Analyze 10-page document
└─ Any model works fine

Task: Analyze 500-page document
└─ Use Gemini 1.5 Pro (only option)

Task: Analyze codebase with 50+ files
├─ Use Gemini 1.5 Pro for full context
└─ Or use Claude/GPT with chunking

3. Optimize for Cost
Use tiered approach based on task complexity:
Tier 1 (Simple): Claude Haiku / Gemini Flash
- Classification
- Routing
- Simple extraction
Tier 2 (Standard): GPT-4o Mini
- General queries
- Data transformation
- Standard responses
Tier 3 (Complex): GPT-4o / Claude Sonnet 4
- Complex reasoning
- High-quality output
- Multi-step tasks

Related Documentation
- [API Keys](/docs/models/api-keys): Configure API keys for each provider
- [Local Models](/docs/models/local-models): Set up Ollama and local models
- [Model Configuration](/docs/models/model-configuration): Fine-tune model parameters
- [Usage & Costs](/docs/models/usage-tracking): Monitor your usage and spending