Supported Models

HIVE Protocol integrates with the leading AI model providers, giving you access to the most capable language models available. This guide covers all supported models, their capabilities, pricing, and recommendations for different use cases.

Provider Overview

┌────────────────────────────────────────────────────────────┐
│                   Supported AI Providers                   │
├────────────────────────────────────────────────────────────┤
│                                                            │
│  ┌─────────────┐  ┌─────────────┐  ┌─────────────┐         │
│  │   OpenAI    │  │  Anthropic  │  │  Google AI  │         │
│  │   ───────   │  │  ─────────  │  │  ─────────  │         │
│  │  GPT-4o     │  │  Claude 4   │  │  Gemini 1.5 │         │
│  │  GPT-4o Mini│  │  Claude 3.5 │  │  Pro/Flash  │         │
│  │  GPT-4 Turbo│  │  Claude 3   │  │             │         │
│  │  GPT-3.5    │  │  Opus/Haiku │  │             │         │
│  └─────────────┘  └─────────────┘  └─────────────┘         │
│                                                            │
│  ┌─────────────┐  ┌─────────────┐                          │
│  │   Ollama    │  │   Custom    │                          │
│  │   ───────   │  │   ───────   │                          │
│  │  Llama 3    │  │  Any OpenAI │                          │
│  │  Mistral    │  │  compatible │                          │
│  │  Qwen       │  │  endpoint   │                          │
│  └─────────────┘  └─────────────┘                          │
│                                                            │
└────────────────────────────────────────────────────────────┘

OpenAI Models

OpenAI provides the GPT family of models, known for their versatility and strong performance across diverse tasks.

GPT-4o

The flagship OpenAI model, with multimodal capabilities spanning text, vision, and audio.

Property          Value
Model ID          gpt-4o
Context Window    128,000 tokens
Max Output        16,384 tokens
Input Cost        $2.50 / 1M tokens
Output Cost       $10.00 / 1M tokens
Knowledge Cutoff  October 2023
Multimodal        Yes (text, vision, audio)

Strengths:

  • Excellent reasoning and analysis
  • Strong code generation
  • Vision capabilities for image analysis
  • Fast response times for its capability level

Best For:

  • Complex multi-step reasoning
  • Code generation and review
  • Document analysis with images
  • Production applications requiring high quality

Example Use Cases:

Agent: Technical Architect
Model: gpt-4o
Tasks:
  - System design discussions
  - Code review with context
  - Technical documentation
  - Architecture diagrams analysis

GPT-4o Mini

A smaller, faster, and more affordable version of GPT-4o.

Property          Value
Model ID          gpt-4o-mini
Context Window    128,000 tokens
Max Output        16,384 tokens
Input Cost        $0.15 / 1M tokens
Output Cost       $0.60 / 1M tokens
Knowledge Cutoff  October 2023
Multimodal        Yes (text, vision)

Strengths:

  • ~94% cost reduction vs GPT-4o (blended)
  • Very fast responses
  • Still maintains strong reasoning
  • Large context window

Best For:

  • High-volume tasks
  • Quick responses where top-tier quality isn't critical
  • Cost-sensitive applications
  • Preprocessing and filtering

Example Use Cases:

Agent: Quick Responder
Model: gpt-4o-mini
Tasks:
  - Initial customer queries
  - Data extraction
  - Simple transformations
  - Classification tasks

GPT-4 Turbo

High-performance model with vision capabilities.

Property          Value
Model ID          gpt-4-turbo
Context Window    128,000 tokens
Max Output        4,096 tokens
Input Cost        $10.00 / 1M tokens
Output Cost       $30.00 / 1M tokens
Knowledge Cutoff  December 2023
Multimodal        Yes (text, vision)

Note: GPT-4o is generally recommended over GPT-4 Turbo for most use cases due to better price-performance ratio.

GPT-3.5 Turbo

Fast and economical for simpler tasks.

Property          Value
Model ID          gpt-3.5-turbo
Context Window    16,385 tokens
Max Output        4,096 tokens
Input Cost        $0.50 / 1M tokens
Output Cost       $1.50 / 1M tokens
Knowledge Cutoff  September 2021
Multimodal        No

Best For:

  • Simple queries and chat
  • Legacy applications
  • Very high volume, cost-sensitive tasks

Anthropic Models

Anthropic's Claude models are known for their strong reasoning, safety features, and excellent writing quality.

Claude Sonnet 4 (claude-sonnet-4-20250514)

The latest and most capable Sonnet model with exceptional reasoning.

Property          Value
Model ID          claude-sonnet-4-20250514
Context Window    200,000 tokens
Max Output        8,192 tokens
Input Cost        $3.00 / 1M tokens
Output Cost       $15.00 / 1M tokens
Knowledge Cutoff  April 2024
Multimodal        Yes (text, vision)

Strengths:

  • Exceptional reasoning and analysis
  • Excellent writing quality
  • Strong at following complex instructions
  • Best-in-class for nuanced tasks

Best For:

  • Complex analysis and research
  • High-quality content creation
  • Tasks requiring careful reasoning
  • Multi-step problem solving

Example Use Cases:

Agent: Research Analyst
Model: claude-sonnet-4-20250514
Tasks:
  - Market research synthesis
  - Competitive analysis
  - Strategic recommendations
  - Report writing

Claude 3.5 Sonnet

Balanced performance with strong coding capabilities.

Property          Value
Model ID          claude-3-5-sonnet-20241022
Context Window    200,000 tokens
Max Output        8,192 tokens
Input Cost        $3.00 / 1M tokens
Output Cost       $15.00 / 1M tokens
Knowledge Cutoff  April 2024
Multimodal        Yes (text, vision)

Strengths:

  • Excellent at coding tasks
  • Strong general-purpose capabilities
  • Good balance of speed and quality
  • Reliable instruction following

Best For:

  • Code generation and review
  • General-purpose tasks
  • Technical documentation
  • Data analysis

Claude 3 Opus

Highest capability model in the Claude 3 family.

Property          Value
Model ID          claude-3-opus-20240229
Context Window    200,000 tokens
Max Output        4,096 tokens
Input Cost        $15.00 / 1M tokens
Output Cost       $75.00 / 1M tokens
Knowledge Cutoff  August 2023
Multimodal        Yes (text, vision)

Strengths:

  • Highest quality outputs
  • Exceptional at complex reasoning
  • Best for creative tasks
  • Strong at nuanced understanding

Best For:

  • Tasks requiring absolute best quality
  • Complex creative projects
  • High-stakes content
  • Research and analysis

Note: Consider Claude Sonnet 4 as a more cost-effective alternative with comparable quality.

Claude 3 Haiku

Fast and efficient for simple tasks.

Property          Value
Model ID          claude-3-haiku-20240307
Context Window    200,000 tokens
Max Output        4,096 tokens
Input Cost        $0.25 / 1M tokens
Output Cost       $1.25 / 1M tokens
Knowledge Cutoff  August 2023
Multimodal        Yes (text, vision)

Strengths:

  • Very fast responses
  • Low cost
  • Large context window
  • Still capable for many tasks

Best For:

  • Quick responses
  • High-volume processing
  • Classification tasks
  • Simple queries

Example Use Cases:

Agent: Triage Bot
Model: claude-3-haiku-20240307
Tasks:
  - Initial message classification
  - Quick fact lookup
  - Simple data extraction
  - Routing decisions

Google AI Models

Google's Gemini models offer massive context windows, making them ideal for processing long documents.

Gemini 1.5 Pro

Advanced model with the largest context window of any supported model.

Property        Value
Model ID        gemini-1.5-pro
Context Window  1,000,000 tokens
Max Output      8,192 tokens
Input Cost      $1.25 / 1M tokens
Output Cost     $5.00 / 1M tokens
Multimodal      Yes (text, vision, audio, video)

Strengths:

  • Massive 1M token context
  • Can process entire codebases
  • Video and audio analysis
  • Strong long-form reasoning

Best For:

  • Processing entire books or codebases
  • Long document analysis
  • Video/audio content analysis
  • Tasks requiring extensive context

Example Use Cases:

Agent: Document Analyzer
Model: gemini-1.5-pro
Tasks:
  - Analyze entire repository
  - Process 500-page documents
  - Video transcript analysis
  - Cross-document synthesis

Gemini 1.5 Flash

Fast and efficient Gemini model.

Property        Value
Model ID        gemini-1.5-flash
Context Window  1,000,000 tokens
Max Output      8,192 tokens
Input Cost      $0.075 / 1M tokens
Output Cost     $0.30 / 1M tokens
Multimodal      Yes (text, vision, audio, video)

Strengths:

  • Extremely cost-effective
  • Large context window
  • Fast responses
  • Multimodal capabilities

Best For:

  • Cost-sensitive high-volume tasks
  • Long document processing on a budget
  • Quick multimodal analysis
  • Background processing tasks

Comprehensive Cost Comparison

Cost Per 1M Tokens (USD)

Model              Input   Output  Blended*
Gemini 1.5 Flash   $0.075  $0.30   $0.19
GPT-4o Mini        $0.15   $0.60   $0.38
Claude 3 Haiku     $0.25   $1.25   $0.75
GPT-3.5 Turbo      $0.50   $1.50   $1.00
Gemini 1.5 Pro     $1.25   $5.00   $3.13
GPT-4o             $2.50   $10.00  $6.25
Claude Sonnet 4    $3.00   $15.00  $9.00
Claude 3.5 Sonnet  $3.00   $15.00  $9.00
GPT-4 Turbo        $10.00  $30.00  $20.00
Claude 3 Opus      $15.00  $75.00  $45.00

*Blended assumes 50/50 input/output ratio
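The blended figure is simply a weighted average of the input and output prices. A minimal sketch (the `blended_cost` helper is illustrative, not part of any HIVE API):

```python
def blended_cost(input_per_m: float, output_per_m: float,
                 output_share: float = 0.5) -> float:
    """Blended price per 1M tokens for a given share of output tokens."""
    return input_per_m * (1 - output_share) + output_per_m * output_share

# 50/50 split, as in the table above
print(blended_cost(2.50, 10.00))   # GPT-4o -> 6.25
print(blended_cost(3.00, 15.00))   # Claude Sonnet 4 -> 9.0
```

If your workload is read-heavy (e.g. 80% input tokens), lower `output_share` accordingly; the ranking of models can shift for such workloads.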

Monthly Cost Estimates

Based on typical usage patterns:

Usage Level             Gemini Flash  GPT-4o Mini  Claude Haiku  GPT-4o  Claude Sonnet
Light (1M tokens)       $0.19         $0.38        $0.75         $6.25   $9.00
Medium (10M tokens)     $1.88         $3.75        $7.50         $62.50  $90.00
Heavy (100M tokens)     $18.75        $37.50       $75.00        $625    $900
Enterprise (1B tokens)  $188          $375         $750          $6,250  $9,000
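These estimates are monthly token volume divided by one million, times the blended price. A quick sketch, assuming the same 50/50 input/output split as the table (`monthly_cost` is an illustrative helper, not a HIVE function):

```python
def monthly_cost(tokens: int, input_per_m: float, output_per_m: float) -> float:
    """Estimated monthly spend in USD at a 50/50 input/output split."""
    blended = (input_per_m + output_per_m) / 2
    return tokens / 1_000_000 * blended

# "Medium" tier (10M tokens/month) on GPT-4o Mini
print(f"${monthly_cost(10_000_000, 0.15, 0.60):.2f}")   # $3.75
```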

Model Comparison Matrix

By Capability

Capability             Best Models                 Notes
Complex Reasoning      Claude Sonnet 4, GPT-4o     Similar performance
Code Generation        Claude 3.5 Sonnet, GPT-4o   Claude slightly better for complex code
Creative Writing       Claude Sonnet 4, GPT-4o     Claude often preferred
Long Context           Gemini 1.5 Pro              5-8x more context than alternatives
Speed                  Gemini Flash, Claude Haiku  Sub-second responses
Cost Efficiency        Gemini Flash, GPT-4o Mini   Best value per token
Vision                 GPT-4o, Claude Sonnet 4     Both excellent
Instruction Following  Claude Sonnet 4, GPT-4o     Claude slightly more precise

By Use Case

Use Case             Primary Choice     Budget Alternative
Customer Support     GPT-4o Mini        Claude Haiku
Code Review          Claude 3.5 Sonnet  GPT-4o
Content Creation     Claude Sonnet 4    GPT-4o
Data Analysis        GPT-4o             Gemini 1.5 Pro
Document Processing  Gemini 1.5 Pro     Gemini 1.5 Flash
Research             Claude Sonnet 4    Claude 3.5 Sonnet
Quick Q&A            Claude Haiku       GPT-4o Mini
Image Analysis       GPT-4o             Claude Sonnet 4

Model Selection Guide

Decision Flowchart

                        START
                          │
                          ▼
              ┌───────────────────────┐
              │ Processing long       │
              │ documents (>100K      │──Yes──▶ Gemini 1.5 Pro
              │ tokens)?              │
              └───────────────────────┘
                          │No
                          ▼
              ┌───────────────────────┐
              │ Cost is primary       │
              │ concern?              │──Yes──▶ Gemini Flash / GPT-4o Mini
              └───────────────────────┘
                          │No
                          ▼
              ┌───────────────────────┐
              │ Need fastest          │
              │ possible response?    │──Yes──▶ Claude Haiku / Gemini Flash
              └───────────────────────┘
                          │No
                          ▼
              ┌───────────────────────┐
              │ Complex reasoning     │
              │ or writing?           │──Yes──▶ Claude Sonnet 4 / GPT-4o
              └───────────────────────┘
                          │No
                          ▼
              ┌───────────────────────┐
              │ Code generation       │
              │ or review?            │──Yes──▶ Claude 3.5 Sonnet / GPT-4o
              └───────────────────────┘
                          │No
                          ▼
                   GPT-4o Mini
              (Good general default)
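The flowchart above can be written as a small helper that checks each question in order. A sketch only: `pick_model` is an illustrative name, not part of HIVE, and the model IDs are the ones documented in this guide:

```python
def pick_model(
    long_documents: bool = False,     # >100K tokens of context needed?
    cost_sensitive: bool = False,     # cost is the primary concern?
    latency_critical: bool = False,   # fastest possible response?
    complex_reasoning: bool = False,  # complex reasoning or writing?
    coding: bool = False,             # code generation or review?
) -> str:
    """Walk the decision flowchart top to bottom; first 'Yes' wins."""
    if long_documents:
        return "gemini-1.5-pro"
    if cost_sensitive:
        return "gemini-1.5-flash"            # or gpt-4o-mini
    if latency_critical:
        return "claude-3-haiku-20240307"     # or gemini-1.5-flash
    if complex_reasoning:
        return "claude-sonnet-4-20250514"    # or gpt-4o
    if coding:
        return "claude-3-5-sonnet-20241022"  # or gpt-4o
    return "gpt-4o-mini"                     # good general default

print(pick_model(long_documents=True))   # gemini-1.5-pro
print(pick_model())                      # gpt-4o-mini
```

Note the ordering matters: a cheap, long-document task still routes to Gemini 1.5 Pro because the context requirement is checked first.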

By Agent Role

Agent Role         Recommended Model  Reasoning
Research Analyst   Claude Sonnet 4    Best for synthesis and analysis
Code Developer     Claude 3.5 Sonnet  Excellent code generation
Content Writer     Claude Sonnet 4    Superior writing quality
Data Processor     GPT-4o Mini        Cost-effective for volume
Customer Support   GPT-4o Mini        Fast, accurate responses
Creative Director  Claude Sonnet 4    Best creative output
Technical Lead     GPT-4o             Strong reasoning + vision
Document Analyst   Gemini 1.5 Pro     Handles long documents
Quick Responder    Claude Haiku       Fastest response times
Quality Reviewer   Claude Sonnet 4    Catches nuanced issues

Performance Benchmarks

Response Time (Average)

Model            Time to First Token  Full Response (500 tokens)
Claude Haiku     ~150ms               ~1.2s
Gemini Flash     ~200ms               ~1.5s
GPT-4o Mini      ~250ms               ~2.0s
GPT-4o           ~300ms               ~2.5s
Claude Sonnet 4  ~350ms               ~3.0s
Gemini 1.5 Pro   ~400ms               ~3.5s
Claude 3 Opus    ~500ms               ~5.0s

Tokens Per Second (Output)

Model            Tokens/Second
Claude Haiku     ~400
Gemini Flash     ~350
GPT-4o Mini      ~250
GPT-4o           ~200
Claude Sonnet 4  ~170
Gemini 1.5 Pro   ~150
Claude 3 Opus    ~100
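A rough end-to-end latency estimate combines the two tables: time to first token plus generation time at the streaming rate. These are ballpark numbers measured separately, so the estimate won't match the first table exactly (`est_response_seconds` is an illustrative helper):

```python
def est_response_seconds(ttft_ms: float, tokens: int,
                         tokens_per_sec: float) -> float:
    """Back-of-envelope latency: time to first token + generation time."""
    return ttft_ms / 1000 + tokens / tokens_per_sec

# Claude Haiku, 500-token response: 0.15s + 500/400s
print(round(est_response_seconds(150, 500, 400), 2))  # 1.4 (table says ~1.2s)
```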

Best Practices

1. Use the Right Model for the Job

Don't use expensive models for simple tasks:

Swarm: Content Pipeline
Agents:
  - name: Classifier
    model: claude-3-haiku-20240307   # Simple classification

  - name: Writer
    model: claude-sonnet-4-20250514   # Quality matters here

  - name: Formatter
    model: gpt-4o-mini                # Simple formatting

2. Consider Context Requirements

Match context window to your needs:

Task: Analyze 10-page document
  └─ Any model works fine

Task: Analyze 500-page document
  └─ Use Gemini 1.5 Pro (the only supported model with enough context)

Task: Analyze codebase with 50+ files
  └─ Use Gemini 1.5 Pro for full context
  └─ Or use Claude/GPT with chunking
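To decide between these options programmatically, you can estimate token counts (roughly 4 characters per token for English text, a common approximation) and fall back to chunking when the document exceeds the window. A sketch with hypothetical helper names:

```python
def fits_context(text: str, context_window: int,
                 chars_per_token: float = 4.0) -> bool:
    """Rough check using the ~4 chars/token heuristic for English text."""
    return len(text) / chars_per_token <= context_window

def chunk(text: str, max_tokens: int,
          chars_per_token: float = 4.0) -> list[str]:
    """Naive fixed-size chunking for texts that exceed the window."""
    size = int(max_tokens * chars_per_token)
    return [text[i:i + size] for i in range(0, len(text), size)]

doc = "x" * 1_000_000                 # ~250K tokens
print(fits_context(doc, 200_000))     # False: too big for Claude's window
print(len(chunk(doc, 100_000)))       # 3 chunks of <=400K chars each
```

A real pipeline would chunk on semantic boundaries (sections, files) and use the provider's tokenizer for exact counts; this only illustrates the sizing logic.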

3. Optimize for Cost

Use tiered approach based on task complexity:

Tier 1 (Simple): Claude Haiku / Gemini Flash
  - Classification
  - Routing
  - Simple extraction

Tier 2 (Standard): GPT-4o Mini
  - General queries
  - Data transformation
  - Standard responses

Tier 3 (Complex): GPT-4o / Claude Sonnet 4
  - Complex reasoning
  - High-quality output
  - Multi-step tasks
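The tiered approach boils down to a lookup table from task complexity to model ID. A minimal sketch (the tier labels and `route` helper are illustrative, not part of HIVE):

```python
# Primary model per tier; comments note the alternatives from the list above.
TIER_MODELS = {
    "simple":   "claude-3-haiku-20240307",   # or gemini-1.5-flash
    "standard": "gpt-4o-mini",
    "complex":  "claude-sonnet-4-20250514",  # or gpt-4o
}

def route(task_tier: str) -> str:
    """Fall back to the standard tier for unknown labels."""
    return TIER_MODELS.get(task_tier, TIER_MODELS["standard"])

print(route("simple"))    # claude-3-haiku-20240307
print(route("unknown"))   # gpt-4o-mini
```

In practice the tier label itself can come from a cheap Tier 1 classifier, so only tasks that genuinely need it reach the expensive models.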

Next Steps

  • [API Keys](/docs/models/api-keys): Configure API keys for each provider
  • [Local Models](/docs/models/local-models): Set up Ollama and local models
  • [Model Configuration](/docs/models/model-configuration): Fine-tune model parameters
  • [Usage & Costs](/docs/models/usage-tracking): Monitor your usage and spending
