Inference API for Open-Source Models
Access the latest Llama, Qwen, DeepSeek, and Mistral models through a simple, OpenAI-compatible API. No infrastructure to manage.
Simple Integration
Works with any OpenAI SDK. Just change the base URL and API key.

    import OpenAI from 'openai';

    // Point the standard OpenAI client at the DOS endpoint.
    const client = new OpenAI({
      apiKey: 'dos_sk_your_api_key',
      baseURL: 'https://api.dos.ai/v1',
    });

    const response = await client.chat.completions.create({
      model: 'Qwen/Qwen3-VL-30B-A3B-Instruct',
      messages: [
        { role: 'user', content: 'Explain quantum computing' }
      ],
      max_tokens: 1000,
    });

    console.log(response.choices[0].message.content);

Built for Production
Enterprise-grade infrastructure with all the features you need.
OpenAI-Compatible API
Drop-in replacement for the OpenAI API. Keep your existing code and change only the base URL and API key.
Low Latency
Optimized inference infrastructure with global edge caching for sub-100ms response times.
Auto-Scaling
Automatically scales to handle traffic spikes. Pay only for what you use.
Usage Analytics
Real-time dashboards to monitor token usage, costs, and API performance.
Streaming Responses
Server-sent events for real-time streaming. Build responsive chat interfaces.
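As a rough illustration of consuming the stream, the sketch below parses server-sent-event text into content fragments. It assumes the endpoint emits the same `data: {...}` lines and `data: [DONE]` sentinel as the OpenAI chat completions API; this page states compatibility but does not show the wire format. `parseSseText` is a hypothetical helper, not part of any SDK, and it assumes each event arrives as a complete line.

```typescript
// Shape of one streamed chunk in the OpenAI chat completions format (assumed here).
interface ChatChunk {
  choices: { delta: { content?: string } }[];
}

// Hypothetical helper: extract the text fragments carried by a block of SSE text.
function parseSseText(sseText: string): string[] {
  const pieces: string[] = [];
  for (const line of sseText.split('\n')) {
    const trimmed = line.trim();
    // Only "data:" lines carry payloads; blank lines separate events.
    if (!trimmed.startsWith('data:')) continue;
    const payload = trimmed.slice('data:'.length).trim();
    // The sentinel marks the end of the stream.
    if (payload === '[DONE]') break;
    const chunk = JSON.parse(payload) as ChatChunk;
    const text = chunk.choices[0]?.delta?.content;
    if (text) pieces.push(text);
  }
  return pieces;
}

// Sample stream text, made up for illustration.
const sample = [
  'data: {"choices":[{"delta":{"content":"Hel"}}]}',
  '',
  'data: {"choices":[{"delta":{"content":"lo"}}]}',
  '',
  'data: [DONE]',
  '',
].join('\n');

console.log(parseSseText(sample).join('')); // prints "Hello"
```

In a chat interface, each fragment would be appended to the visible message as it arrives instead of being joined at the end. A real network stream can split an event across reads, so production code would buffer partial lines before parsing.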
Function Calling
Built-in tool use support for building AI agents and automated workflows.
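A minimal sketch of the client side of tool use, assuming the API accepts the OpenAI `tools` request format and returns tool calls in the OpenAI shape (a function name plus a JSON string of arguments). The `get_weather` tool, its stub handler, and `runToolCall` are hypothetical names introduced for illustration.

```typescript
// Tool definition in the OpenAI "tools" request format (assumed to be accepted here).
const tools = [
  {
    type: 'function' as const,
    function: {
      name: 'get_weather',
      description: 'Look up the current weather for a city',
      parameters: {
        type: 'object',
        properties: { city: { type: 'string' } },
        required: ['city'],
      },
    },
  },
];

// Shape of a tool call as it appears on an assistant message.
interface ToolCall {
  id: string;
  function: { name: string; arguments: string };
}

// Local handlers keyed by tool name. This one is a stub, not a real weather lookup.
const handlers: Record<string, (args: Record<string, unknown>) => string> = {
  get_weather: (args) => `It is sunny in ${String(args.city)}`,
};

// Run one tool call and build the "tool" message to send back to the model.
function runToolCall(call: ToolCall) {
  const handler = handlers[call.function.name];
  if (!handler) throw new Error(`No handler for tool: ${call.function.name}`);
  // Arguments arrive as a JSON string and must be parsed before use.
  const args = JSON.parse(call.function.arguments) as Record<string, unknown>;
  return { role: 'tool' as const, tool_call_id: call.id, content: handler(args) };
}

const toolMessage = runToolCall({
  id: 'call_1',
  function: { name: 'get_weather', arguments: '{"city":"Hanoi"}' },
});
console.log(toolMessage.content); // prints "It is sunny in Hanoi"
```

In an agent loop, `tools` would be passed with the request, each returned tool call would go through `runToolCall`, and the resulting messages would be appended to the conversation before asking the model to continue.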
Available Models
Access the latest open-source models from leading AI labs.
| Model | Provider | Type | Context | Pricing |
|---|---|---|---|---|
| Qwen3-VL-30B-A3B-Instruct | Alibaba | Vision-Language | 128K | $0.15 / 1M tokens |
| Llama 3.3 70B Instruct | Meta | Text | 128K | $0.20 / 1M tokens |
| Llama 3.1 8B Instruct | Meta | Text | 128K | $0.05 / 1M tokens |
| DeepSeek V3 | DeepSeek | Text | 128K | $0.25 / 1M tokens |
| Qwen 2.5 72B Instruct | Alibaba | Text | 128K | $0.18 / 1M tokens |
| Mixtral 8x7B Instruct | Mistral AI | Text | 32K | $0.10 / 1M tokens |
More models coming soon. View all models
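For the vision-language model in the table, a request might combine text and an image in one message. The sketch below builds such a message using the OpenAI content-parts format; whether this API accepts image input in exactly this shape is an assumption, and `buildVisionMessage` and the example URL are placeholders.

```typescript
// Content parts as defined by the OpenAI chat format (assumed to be supported here).
type ContentPart =
  | { type: 'text'; text: string }
  | { type: 'image_url'; image_url: { url: string } };

interface VisionMessage {
  role: 'user';
  content: ContentPart[];
}

// Hypothetical helper: pair a question with an image reference in one user message.
function buildVisionMessage(question: string, imageUrl: string): VisionMessage {
  return {
    role: 'user',
    content: [
      { type: 'text', text: question },
      { type: 'image_url', image_url: { url: imageUrl } },
    ],
  };
}

const visionMessage = buildVisionMessage(
  'What is in this picture?',
  'https://example.com/cat.png', // placeholder URL
);
console.log(JSON.stringify(visionMessage, null, 2));
```

The resulting object would be passed as an entry in `messages`, with `model` set to `'Qwen/Qwen3-VL-30B-A3B-Instruct'`.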
What Can You Build?
From chatbots to AI agents, power any application with our API.
Chatbots & Assistants
Build conversational AI with context awareness and multi-turn dialogue capabilities.
Content Generation
Generate articles, marketing copy, product descriptions, and creative content at scale.
Code Assistance
Power code completion, review, and generation features in your developer tools.
Data Extraction
Extract structured data from documents, emails, and unstructured text sources.
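As one way to approach data extraction, the sketch below asks the model for JSON and validates the reply before using it. The `Contact` shape, the prompt wording, and both helper functions are hypothetical; the validation step matters because a model reply is plain text and is not guaranteed to match the requested schema.

```typescript
// Hypothetical target structure for this example.
interface Contact {
  name: string;
  email: string;
}

// Build the messages that ask the model to extract a contact as JSON.
function buildExtractionMessages(text: string) {
  return [
    {
      role: 'system' as const,
      content: 'Extract the sender as JSON with keys "name" and "email". Reply with JSON only.',
    },
    { role: 'user' as const, content: text },
  ];
}

// Parse and validate the model's reply, tolerating extra text around the JSON object.
function parseContact(reply: string): Contact {
  const start = reply.indexOf('{');
  const end = reply.lastIndexOf('}');
  if (start === -1 || end <= start) throw new Error('No JSON object found in reply');
  const data = JSON.parse(reply.slice(start, end + 1)) as Partial<Contact>;
  if (typeof data.name !== 'string' || typeof data.email !== 'string') {
    throw new Error('Reply is missing "name" or "email"');
  }
  return { name: data.name, email: data.email };
}

// A made-up reply standing in for response.choices[0].message.content.
const reply = 'Here you go: {"name":"Ada","email":"ada@example.com"}';
console.log(parseContact(reply).email); // prints "ada@example.com"
```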
Pay-As-You-Go Pricing
No monthly fees. Pay only for the tokens you use.
Priced per 1M tokens (rate varies by model)
- No minimum commitment
- Pay only for tokens used
- Usage dashboard included
- Rate limits scale with usage
- $5 free credits to start
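To make the pricing concrete, here is a small sketch that estimates cost from the per-model rates listed in the Available Models table. It assumes input and output tokens are billed at the same rate, which the table's single price per model suggests but does not state. The function names are illustrative.

```typescript
// USD per 1M tokens, copied from the Available Models table.
const PRICE_PER_MILLION_USD: Record<string, number> = {
  'Qwen3-VL-30B-A3B-Instruct': 0.15,
  'Llama 3.3 70B Instruct': 0.2,
  'Llama 3.1 8B Instruct': 0.05,
  'DeepSeek V3': 0.25,
  'Qwen 2.5 72B Instruct': 0.18,
  'Mixtral 8x7B Instruct': 0.1,
};

function priceFor(model: string): number {
  const price = PRICE_PER_MILLION_USD[model];
  if (price === undefined) throw new Error(`Unknown model: ${model}`);
  return price;
}

// Cost of a given number of tokens (assumes one flat rate for input and output).
function estimateCostUsd(model: string, totalTokens: number): number {
  return (totalTokens / 1_000_000) * priceFor(model);
}

// Roughly how many tokens a budget covers on a given model.
function tokensForBudget(model: string, budgetUsd: number): number {
  return Math.round((budgetUsd * 1_000_000) / priceFor(model));
}

console.log(estimateCostUsd('Llama 3.3 70B Instruct', 2_500_000).toFixed(2)); // prints "0.50"
console.log(tokensForBudget('Llama 3.1 8B Instruct', 5)); // prints 100000000
```

Under that assumption, the $5 of free credits would cover about 100M tokens on Llama 3.1 8B Instruct, or about 20M tokens on DeepSeek V3.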
Start Building Today
Get your API key and make your first request in under a minute.