Inference API for Open-Source Models

Access the latest Llama, Qwen, DeepSeek, and Mistral models through a simple, OpenAI-compatible API. No infrastructure to manage.

Simple Integration

Works with any OpenAI SDK. Just change the base URL and API key.

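import OpenAI from 'openai';

// Point the standard OpenAI client at the DOS endpoint.
const client = new OpenAI({
  apiKey: 'dos_sk_your_api_key', // placeholder: replace with your own key
  baseURL: 'https://api.dos.ai/v1',
});

// Send a single-turn chat request (top-level await requires an ES module).
const response = await client.chat.completions.create({
  model: 'Qwen/Qwen3-VL-30B-A3B-Instruct',
  messages: [
    { role: 'user', content: 'Explain quantum computing' }
  ],
  max_tokens: 1000, // upper bound on generated tokens
});

// Print the text of the first completion choice.
console.log(response.choices[0].message.content);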

Built for Production

Enterprise-grade infrastructure with all the features you need.

OpenAI-Compatible API

Drop-in replacement for the OpenAI API. Keep your existing code and change only the base URL and API key.

Low Latency

Optimized inference infrastructure with global edge caching for sub-100ms response times.

Auto-Scaling

Automatically scales to handle traffic spikes. Pay only for what you use.

Usage Analytics

Real-time dashboards to monitor token usage, costs, and API performance.

Streaming Responses

Server-sent events for real-time streaming. Build responsive chat interfaces.
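To make the streaming format concrete, here is a minimal sketch that turns raw server-sent event text into content deltas. It assumes the stream follows the OpenAI chat-completions chunk format (`data: {json}` lines ending with `data: [DONE]`), which this page implies but does not spell out. `collectDeltas` and the sample payload are hypothetical names and data used for illustration only.

```typescript
// Part of a streamed chunk that this sketch reads (assumed OpenAI-style shape).
interface StreamChunk {
  choices: { delta: { content?: string } }[];
}

// Hypothetical helper: extract text deltas from raw SSE text.
function collectDeltas(sseText: string): string[] {
  const deltas: string[] = [];
  for (const line of sseText.split('\n')) {
    const trimmed = line.trim();
    if (!trimmed.startsWith('data:')) continue; // skip blank lines and comments
    const payload = trimmed.slice('data:'.length).trim();
    if (payload === '[DONE]') break; // end-of-stream sentinel
    const chunk = JSON.parse(payload) as StreamChunk;
    const content = chunk.choices[0]?.delta.content;
    if (content) deltas.push(content);
  }
  return deltas;
}

// Illustrative sample of what a short stream might look like on the wire.
const sampleStream = [
  'data: {"choices":[{"delta":{"content":"Hel"}}]}',
  '',
  'data: {"choices":[{"delta":{"content":"lo"}}]}',
  '',
  'data: [DONE]',
].join('\n');

console.log(collectDeltas(sampleStream).join('')); // prints "Hello"
```

In practice the OpenAI SDK does this parsing for you when you pass `stream: true`, so a helper like this is mainly useful for debugging or for clients that cannot use the SDK.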

Function Calling

Built-in tool use support for building AI agents and automated workflows.
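A tool-use round trip could look roughly like the sketch below. It assumes the API accepts OpenAI-style `tools` definitions and returns OpenAI-style `tool_calls`, which is an assumption based on the compatibility claim above. The `get_weather` tool, its stubbed handler, and `runToolCall` are hypothetical.

```typescript
// Tool definition in the OpenAI-style `tools` format (assumed to be accepted as-is).
const weatherTool = {
  type: 'function' as const,
  function: {
    name: 'get_weather', // hypothetical tool
    description: 'Get the current weather for a city',
    parameters: {
      type: 'object',
      properties: { city: { type: 'string' } },
      required: ['city'],
    },
  },
};

// One entry of `message.tool_calls` (assumed OpenAI-style shape).
interface ToolCall {
  id: string;
  function: { name: string; arguments: string }; // arguments is a JSON string
}

// Local implementations, keyed by tool name.
const toolHandlers: Record<string, (args: Record<string, unknown>) => string> = {
  get_weather: (args) => `Sunny in ${String(args['city'])}`, // stubbed result
};

// Run one tool call and build the `tool` message to send back to the model.
function runToolCall(call: ToolCall) {
  const handler = toolHandlers[call.function.name];
  if (!handler) throw new Error(`Unknown tool: ${call.function.name}`);
  const args = JSON.parse(call.function.arguments) as Record<string, unknown>;
  return { role: 'tool' as const, tool_call_id: call.id, content: handler(args) };
}

// Simulated tool call, as the model might return it.
const toolMessage = runToolCall({
  id: 'call_1',
  function: { name: 'get_weather', arguments: '{"city":"Hanoi"}' },
});
console.log(weatherTool.function.name, '->', toolMessage.content);
```

The returned `tool` message would be appended to the conversation and sent back in a follow-up request so the model can produce its final answer.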

Available Models

Access the latest open-source models from leading AI labs.

Model                     | Provider   | Type            | Context | Pricing
Qwen3-VL-30B-A3B-Instruct | Alibaba    | Vision-Language | 128K    | $0.15 / 1M tokens
Llama 3.3 70B Instruct    | Meta       | Text            | 128K    | $0.20 / 1M tokens
Llama 3.1 8B Instruct     | Meta       | Text            | 128K    | $0.05 / 1M tokens
DeepSeek V3               | DeepSeek   | Text            | 128K    | $0.25 / 1M tokens
Qwen 2.5 72B Instruct     | Alibaba    | Text            | 128K    | $0.18 / 1M tokens
Mixtral 8x7B Instruct     | Mistral AI | Text            | 32K     | $0.10 / 1M tokens

More models coming soon.

What Can You Build?

From chatbots to AI agents, power any application with our API.

Chatbots & Assistants

Build conversational AI with context awareness and multi-turn dialogue capabilities.
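Multi-turn dialogue with a chat-completions API generally means resending the conversation history on every request, so long chats need trimming to stay within the model's context window. The sketch below shows one simple approach: keep the system prompt and only the most recent messages. `trimHistory` and the message-count limit are illustrative assumptions; a real application would more likely budget by token count.

```typescript
type ChatRole = 'system' | 'user' | 'assistant';

interface ChatMessage {
  role: ChatRole;
  content: string;
}

// Keep every system message plus the most recent `maxRecent` other messages.
function trimHistory(messages: ChatMessage[], maxRecent: number): ChatMessage[] {
  const system = messages.filter((m) => m.role === 'system');
  const rest = messages.filter((m) => m.role !== 'system');
  const start = Math.max(rest.length - maxRecent, 0);
  return [...system, ...rest.slice(start)];
}

const chatHistory: ChatMessage[] = [
  { role: 'system', content: 'You are a concise assistant.' },
  { role: 'user', content: 'What is a qubit?' },
  { role: 'assistant', content: 'A qubit is the basic unit of quantum information.' },
  { role: 'user', content: 'How is it different from a bit?' },
];

// The oldest user message is dropped; the system prompt is preserved.
const trimmedHistory = trimHistory(chatHistory, 2);
console.log(trimmedHistory.map((m) => m.role)); // prints [ 'system', 'assistant', 'user' ]
```

The trimmed array is what you would pass as `messages` on the next request.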

Content Generation

Generate articles, marketing copy, product descriptions, and creative content at scale.

Code Assistance

Power code completion, review, and generation features in your developer tools.

Data Extraction

Extract structured data from documents, emails, and unstructured text sources.
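For extraction tasks, a common pattern is to ask the model for JSON and validate the reply before trusting it. The sketch below is one way to do that under stated assumptions: the `Invoice` shape, the `parseInvoice` helper, and the sample reply are all hypothetical, and the model is assumed to return a single JSON object, possibly surrounded by extra text.

```typescript
interface Invoice {
  vendor: string;
  total: number;
}

// Hypothetical helper: pull a JSON object out of a model reply and validate its fields.
function parseInvoice(reply: string): Invoice | null {
  const start = reply.indexOf('{');
  const end = reply.lastIndexOf('}');
  if (start === -1 || end <= start) return null; // no JSON object found
  try {
    const data: unknown = JSON.parse(reply.slice(start, end + 1));
    if (typeof data !== 'object' || data === null) return null;
    const record = data as Record<string, unknown>;
    const vendor = record['vendor'];
    const total = record['total'];
    if (typeof vendor !== 'string' || typeof total !== 'number') return null;
    return { vendor, total };
  } catch {
    return null; // reply was not valid JSON
  }
}

// Illustrative model reply with some leading text before the JSON.
const modelReply = 'Here is the data:\n{"vendor": "Acme Corp", "total": 1249.5}';
console.log(parseInvoice(modelReply));
```

Returning `null` on any mismatch lets the caller retry the request or flag the document for manual review instead of storing malformed data.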

Pay-As-You-Go Pricing

No monthly fees. Pay only for the tokens you use.

Simple Pricing: $0.05 to $0.25 per 1M tokens (varies by model)

  • No minimum commitment
  • Pay only for tokens used
  • Usage dashboard included
  • Rate limits scale with usage
  • $5 free credits to start
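As a rough worked example of per-token billing, the sketch below estimates a bill from a token count using prices listed on this page. It assumes a single rate applies to all tokens, since the page does not say whether input and output tokens are priced differently. `estimateCostUsd` is a hypothetical helper.

```typescript
// Per-1M-token prices in USD, copied from the model table on this page.
const pricePerMillion: Record<string, number> = {
  'Llama 3.1 8B Instruct': 0.05,
  'Llama 3.3 70B Instruct': 0.20,
  'DeepSeek V3': 0.25,
};

// Assumption: one flat rate covers both prompt and completion tokens.
function estimateCostUsd(model: string, totalTokens: number): number {
  const price = pricePerMillion[model];
  if (price === undefined) throw new Error(`No price listed for ${model}`);
  return (totalTokens / 1_000_000) * price;
}

// 50M tokens on Llama 3.3 70B Instruct: 50 x $0.20
console.log(estimateCostUsd('Llama 3.3 70B Instruct', 50_000_000).toFixed(2)); // prints "10.00"
```

At these rates, the $5 in free credits would cover about 100M tokens on Llama 3.1 8B Instruct or about 20M tokens on DeepSeek V3, under the same flat-rate assumption.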

Start Building Today

Get your API key and make your first request in under a minute.