Managed models
Models available in Digio today
Assign a default model per agent or override per task. Usage is metered in Digio Tokens from your plan balance—the same wallet whether the agent calls Sonnet, GPT-4o, or Gemini Flash.
Anthropic Claude
-
Claude Opus 4.7
Flagship reasoning, long context, architecture and strategy work.
-
Claude Opus 4.6
Previous-generation Opus for stable, high-quality analysis.
-
Claude Sonnet 4.6
Daily driver—coding, writing, and multi-step agent loops.
-
Claude Sonnet 4.5 / 4
Fast Sonnet tiers with prompt caching on supported workloads.
-
Claude Haiku 4.5
Low-latency drafts, classification, and high-volume subtasks.
OpenAI
-
GPT-5.5 / GPT-5.4 / GPT-5.2
Latest GPT-5 family for general and agentic workloads.
-
GPT-4.1 & GPT-4o
Reliable multimodal chat and tool use for production agents.
-
GPT-4o mini
Cost-efficient routing for summaries and lightweight steps.
-
o3 / o3-pro / o3-mini / o4-mini
Reasoning-focused models for math, planning, and verification.
-
GPT-5.3 Codex & Codex mini
Code generation, refactors, and repo-aware agent skills.
Google Gemini
-
Gemini 2.5 Pro
Long-context research and structured extraction.
-
Gemini 2.5 Flash
High-throughput agent steps with competitive token rates.
-
Gemini 2.0 Flash
Ultra-fast passes for parsing, tagging, and batch jobs.
Open & specialist APIs
-
DeepSeek Chat & Reasoner
Strong value for chat and chain-of-thought style tasks.
-
Mistral Large
European-hosted option for multilingual agent teams.
-
Llama 3.3 70B
Open-weights class model via API—pairs well with private GPU.
-
Grok 3
Real-time oriented model for news and social monitoring agents.
-
Sonar Pro
Search-grounded answers for research agents.
-
Command R+
RAG-friendly enterprise chat and retrieval workflows.
Model list and token economics evolve with provider releases. Your workspace shows live options when you assign a model to an agent; Digio Tokens debit from the same balance as in
pricing.
-
1
Reserve GPU
Choose VRAM, region, and uptime (burst vs always-on). Storage for weights ships with the instance or mounts your bucket.
-
2
Deploy the stack
Start a serving image or SSH in, install CUDA drivers, and load checkpoints. Health checks confirm the model is ready.
-
3
Register endpoint
Add base URL, API key, and model id in workspace settings. Digio validates latency and token format before going live.
-
4
Assign to agents
Pick your private model as the default for selected agents; managed Claude/GPT models remain available side by side.
GPU rental is billed separately from Digio plan subscriptions. Contact us for capacity planning, SLAs, and migration from an existing inference cluster.