Models (41)
Featured Models
dos-auto
Smart Router - automatically selects the best model for each request using a 15-dimension classifier.
DOS AI (Qwen 3.5 35B-A3B)
Self-hosted MoE model with 35B total / 3B active params. Our default model - best balance of quality, speed, and cost.
Llama 4 Maverick
Meta's most capable open model — 400B MoE with 17B active parameters and multimodal support.
Llama 4 Scout
Efficient Llama 4 variant - 109B MoE with 17B active parameters. Industry-leading 10M context window.
QwQ 32B
Reasoning model rivaling DeepSeek R1 — excels at math, logic, and complex problem-solving.
DeepSeek V3
State-of-the-art MoE model with exceptional reasoning capabilities.
All Models
dos-auto
Smart Router - automatically selects the best model for each request using a 15-dimension classifier.
DOS AI (Qwen 3.5 35B-A3B)
Self-hosted MoE model with 35B total / 3B active params. Our default model - best balance of quality, speed, and cost.
Llama 4 Maverick
Meta's most capable open model — 400B MoE with 17B active parameters and multimodal support.
Llama 4 Scout
Efficient Llama 4 variant - 109B MoE with 17B active parameters. Industry-leading 10M context window.
Gemma 3 27B
Google's latest open model - strong multilingual, multimodal, and reasoning capabilities with a 128K context window.
Gemma 3 12B
Efficient mid-size model with vision capabilities and strong benchmark scores.
QwQ 32B
Reasoning model rivaling DeepSeek R1 — excels at math, logic, and complex problem-solving.
Mistral Small 3.1
Latest efficient Mistral with vision support and 128K context. Great speed/quality tradeoff.
DeepSeek V3
State-of-the-art MoE model with exceptional reasoning capabilities.
DeepSeek R1
Reasoning-focused model trained with reinforcement learning for complex tasks.
DeepSeek R1 Distill 70B
Llama-based distillation of R1 reasoning — 90% of R1 quality at a fraction of the cost.
DeepSeek R1 Distill 32B
Qwen-based R1 distillation — strong reasoning in a compact, efficient package.
Llama 3.3 70B
High-performance multilingual LLM optimized for dialogue and instruction following.
Llama 3.1 405B
The largest Llama 3 model for complex reasoning and generation tasks.
Llama 3.1 70B
Balanced performance and efficiency for production workloads.
Llama 3.1 8B
Fast and cost-effective model for simpler tasks and high-volume applications.
Mistral Large 2
Flagship model with strong multilingual and coding capabilities.
Mixtral 8x22B
Sparse mixture-of-experts model balancing capability and efficiency.
Qwen 2.5 72B
Strong multilingual model with excellent Chinese and English performance.
Qwen 2.5 32B
Mid-size model with great balance of speed and capability.
Gemma 2 27B
Efficient model from Google with strong performance on diverse tasks.
Gemma 2 9B
Lightweight model ideal for on-device and edge deployments.
Qwen 2.5 Coder 32B
Top-performing open code model — rivals GPT-4 on coding benchmarks.
Qwen 2.5 Coder 7B
Fast and efficient code model for completions and generation.
Code Llama 70B
Specialized for code generation, completion, and understanding.
Code Llama 34B
Fast code generation with support for many programming languages.
DeepSeek Coder 33B
Top-performing code model trained on 2T tokens, most of them source code.
Qwen 2.5-VL 72B
Latest vision-language model with video understanding, OCR, and agentic capabilities.
Llama 4 Scout Vision
Multimodal Llama 4 variant with native image understanding and 10M context.
Llama 3.2 90B Vision
Multimodal model for image understanding and visual reasoning.
Llama 3.2 11B Vision
Efficient vision-language model for image analysis tasks.
Gemma 3 27B Vision
Google's multimodal model with image, video, and document understanding.
FLUX.1 Pro
State-of-the-art image generation with exceptional quality and prompt adherence.
FLUX.1 Dev
High-quality image generation for development and testing.
FLUX.1 Schnell
Fast image generation optimized for speed.
Stable Diffusion XL
Versatile image generation with fine-tuning support.
UAE Large V1
Universal embedding model with strong performance across benchmarks.
BGE Large EN
High-quality English embeddings for RAG and semantic search.
BGE Base EN
Fast and efficient embeddings for production use.
E5 Mistral 7B
Large embedding model with 4096 dimensions for high-fidelity retrieval.
Whisper Large V3
Industry-leading speech-to-text with multilingual support.
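Several MoE entries above quote both total and active parameter counts. As a rough illustration of what those figures imply, the sketch below computes the share of parameters active per token for the three models whose listings state both numbers. The names `MOE_MODELS`, `active_fraction`, and `active_share_percent` are hypothetical helpers introduced here, not part of any DOS AI API, and the active share is only a loose proxy for per-token compute, not a measure of quality or speed.

```python
# Total and active parameter counts in billions, copied from the listings above.
# The dictionary and function names are illustrative assumptions, not a real API.
MOE_MODELS = {
    "DOS AI (Qwen 3.5 35B-A3B)": (35, 3),
    "Llama 4 Maverick": (400, 17),
    "Llama 4 Scout": (109, 17),
}


def active_fraction(total_b: float, active_b: float) -> float:
    """Return the fraction of parameters that are active per token."""
    if total_b <= 0 or not 0 < active_b <= total_b:
        raise ValueError("expected 0 < active_b <= total_b")
    return active_b / total_b


def active_share_percent(models: dict) -> dict:
    """Map each model name to its active-parameter share, in percent."""
    return {
        name: round(100 * active_fraction(total_b, active_b), 2)
        for name, (total_b, active_b) in models.items()
    }


if __name__ == "__main__":
    # Print models from the smallest active share to the largest.
    shares = active_share_percent(MOE_MODELS)
    for name, pct in sorted(shares.items(), key=lambda item: item[1]):
        print(f"{name}: {pct}% of parameters active per token")
```

Under these figures, Llama 4 Maverick activates the smallest share of its weights per token and Llama 4 Scout the largest, even though both list 17B active parameters.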