Build AI Apps. Faster.

Access powerful AI models through a simple API. Low latency inference, smart model routing, and enterprise-grade infrastructure.


Why use DOS?

Build faster, scale easier, and focus on what matters - your product.

- Blazing Fast
  Industry-leading inference speeds with optimized model serving. Get responses in milliseconds, not seconds.
- Cost Effective
  Pay only for what you use with transparent pricing. No hidden fees, no minimum commitments.
- Easy to Use
  Simple REST APIs with SDKs for Python, Node.js, and more. Get started in minutes with our comprehensive docs.
- Enterprise Ready
  SOC2 compliant and GDPR ready, with a 99.9% uptime SLA. Dedicated support and custom deployment options.

Live Inference Metrics

- Time to First Token: 12 ms
- Throughput: 847 tok/s
- P99 Latency: 0.8 s

[Chart: Request Latency (last 60s)]

67% faster than leading alternatives

Run the best AI models with a single line of code

Access leading open-source and proprietary models for chat, images, code, and more - with smart routing built in.

Model Library

Access open-source and proprietary models for chat, images, video, code, embeddings, and audio.

Drop-in replacement for OpenAI - just change the base URL.
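
As a rough illustration of what "just change the base URL" could look like, the sketch below builds an OpenAI-style chat completion request against a configurable base URL. The base URL `https://api.dos.example/v1`, the `/chat/completions` path, and the bearer-token header are assumptions for illustration, since this page does not state the real endpoint. Only the `dos-auto` model name comes from the page.

```typescript
// Sketch only: this base URL is a placeholder, not a documented endpoint.
const ASSUMED_BASE_URL = "https://api.dos.example/v1";

interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

interface PreparedRequest {
  url: string;
  init: { method: string; headers: Record<string, string>; body: string };
}

// Build an OpenAI-style chat completion request for any compatible base URL.
function buildChatRequest(
  baseUrl: string,
  apiKey: string,
  model: string,
  messages: ChatMessage[],
): PreparedRequest {
  return {
    // Strip trailing slashes so the path joins cleanly.
    url: `${baseUrl.replace(/\/+$/, "")}/chat/completions`,
    init: {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        Authorization: `Bearer ${apiKey}`,
      },
      body: JSON.stringify({ model, messages }),
    },
  };
}

// Switching providers then means changing only the first argument.
const exampleRequest = buildChatRequest(ASSUMED_BASE_URL, "YOUR_API_KEY", "dos-auto", [
  { role: "user", content: "Hello!" },
]);
// To send it: await fetch(exampleRequest.url, exampleRequest.init)
```

Keeping the base URL as a parameter means the rest of the calling code stays the same whichever OpenAI-compatible provider is behind it.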

Start building now

- DOS.AI Auto (NEW)
  Chat, Code, Vision | Smart Router | 128K ctx | Auto-priced
- DOS.AI (NEW)
  Chat, Code | 35B MoE (3B active) | 128K ctx | $0.07 / 1M tokens
- DeepSeek V4 Flash (NEW)
  Chat, Code | MoE (fast tier) | 1M ctx | $0.15 / 1M tokens
- Gemini 3.1 Pro (NEW)
  Chat, Vision
- GPT-5.5
- Claude Opus 4.8
  Opus | 1M ctx | $5.25 / 1M tokens
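
To make the per-token prices concrete, here is a minimal sketch of the arithmetic. It assumes a single flat price per million tokens covering both input and output, which this page does not confirm. The function names are illustrative. The prices and the $10 credit figure are the ones listed on this page.

```typescript
// Estimated spend in US dollars for a token count at a per-million-token price.
function estimateCostUsd(tokens: number, pricePerMillionUsd: number): number {
  return (tokens / 1e6) * pricePerMillionUsd;
}

// Approximate number of tokens a credit balance covers at a given price.
function tokensForBudget(budgetUsd: number, pricePerMillionUsd: number): number {
  return Math.floor((budgetUsd / pricePerMillionUsd) * 1e6);
}

// Listed prices in USD per 1M tokens.
const PRICE_DOS_AI = 0.07;
const PRICE_CLAUDE_OPUS = 5.25; // listed next to Claude Opus 4.8

// The same 2.4M tokens at the cheapest and the most expensive listed price.
const dosAiCost = estimateCostUsd(2400000, PRICE_DOS_AI); // about $0.17
const opusCost = estimateCostUsd(2400000, PRICE_CLAUDE_OPUS); // about $12.60

// How far $10 in credits goes on the DOS.AI model.
const freeTokens = tokensForBudget(10, PRICE_DOS_AI); // roughly 142 million tokens
```

The 75x gap between the two listed prices is why routing simple requests to the cheaper model matters for the overall bill.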

What can you build on DOS?

From chatbots to autonomous agents, DOS powers the next generation of AI applications.

Conversational AI

Build intelligent chatbots and virtual assistants.

Create AI-powered chat experiences with multi-turn conversations, context awareness, and natural language understanding.

Support Agent (Online)

Agent: Hi! How can I help you today?
User: I need to reset my API key
Agent: I've revoked your old key and generated a new one: dos_sk_live_7f3a...x9k2
Copied to clipboard. The old key is now inactive.
User: Thanks! Can you also check my usage this month?
Agent: Here's your usage summary:
- API Calls: 12,847
- Tokens: 2.4M
- Cost: $8.42

Code Assistant

Accelerate development with AI-powered coding.

Generate code, detect bugs, review pull requests, and create documentation automatically with state-of-the-art code models.

api-handler.ts

import { streamText } from 'ai'
import { createDOS } from '@dos/sdk'

// AI-generated: streaming chat endpoint
const dos = createDOS()

export async function POST(req: Request) {
  // Read the chat history sent by the client
  const { messages } = await req.json()
  const result = streamText({
    model: dos('dos-ai'),
    messages,
    temperature: 0.7,
  })
  // Stream tokens back to the client as they are generated
  return result.toDataStreamResponse()
}

// Handle errors and rate limiting

AI: Add error handling with retry logic? (Tab to accept)
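
The suggestion above mentions error handling with retry logic. One common way to do that is exponential backoff, sketched below. This is not the assistant's actual output or a DOS SDK feature. Every name here (`RetryOptions`, `backoffDelays`, `withRetry`) is hypothetical, and treating HTTP 429 and 5xx responses as retryable is an assumption.

```typescript
// Sketch of retry with exponential backoff; names and defaults are illustrative.
interface RetryOptions {
  maxRetries: number; // retries allowed after the first attempt
  baseDelayMs: number; // delay before the first retry
  maxDelayMs: number; // upper bound on any single delay
}

// Delay schedule: base, 2x base, 4x base, ... capped at maxDelayMs.
function backoffDelays(opts: RetryOptions): number[] {
  const delays: number[] = [];
  for (let attempt = 0; attempt < opts.maxRetries; attempt++) {
    delays.push(Math.min(opts.baseDelayMs * 2 ** attempt, opts.maxDelayMs));
  }
  return delays;
}

// Treat HTTP 429 (rate limited) and 5xx responses as retryable.
function isRetryableStatus(status: number): boolean {
  return status === 429 || (status >= 500 && status <= 599);
}

const sleep = (ms: number): Promise<void> =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

// Run `call` until it returns a non-retryable status or the retries run out.
async function withRetry(
  call: () => Promise<{ status: number }>,
  opts: RetryOptions,
): Promise<{ status: number }> {
  let response = await call();
  for (const delay of backoffDelays(opts)) {
    if (!isRetryableStatus(response.status)) break;
    await sleep(delay);
    response = await call();
  }
  return response;
}
```

A production version would usually add random jitter to each delay so that many clients do not retry at the same instant.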

RAG & Search

Build powerful semantic search experiences.

Create knowledge bases with retrieval-augmented generation for accurate, contextual answers with citation support.
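
As a minimal sketch of the retrieval step in retrieval-augmented generation, the code below ranks documents by cosine similarity to a query vector and builds a prompt that asks for numbered citations. The three-dimensional vectors are toy values standing in for real embeddings, and all names are illustrative. The page does not describe how DOS implements retrieval.

```typescript
interface Doc {
  path: string;
  embedding: number[];
}

interface Hit {
  path: string;
  score: number;
}

// Cosine similarity; missing components in `b` are treated as zero.
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  a.forEach((x, i) => {
    const y = b[i] ?? 0;
    dot += x * y;
    normA += x * x;
  });
  b.forEach((y) => {
    normB += y * y;
  });
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Return the k documents most similar to the query, best first.
function retrieve(query: number[], docs: Doc[], k: number): Hit[] {
  return docs
    .map((doc) => ({ path: doc.path, score: cosineSimilarity(query, doc.embedding) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k);
}

// Build a prompt that asks the model to cite the retrieved sources by number.
function buildPrompt(question: string, hits: Hit[]): string {
  const sources = hits.map((h, i) => `[${i + 1}] ${h.path}`).join("\n");
  return `Answer using only these sources and cite them by number.\n${sources}\n\nQuestion: ${question}`;
}

// Toy corpus: in practice the vectors would come from an embeddings model.
const corpus: Doc[] = [
  { path: "docs/guides/security.md", embedding: [0.1, 0.2, 0.9] },
  { path: "docs/guides/rate-limiting.md", embedding: [0.9, 0.1, 0.0] },
  { path: "docs/api/config.md", embedding: [0.5, 0.5, 0.0] },
];
const topHits = retrieve([1, 0, 0], corpus, 2);
const prompt = buildPrompt("How do I configure rate limiting?", topHits);
```

The similarity scores are what a "sources" panel can surface as relevance percentages next to each cited document.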

How do I configure rate limiting?

AI Answer (3 sources)

To configure rate limiting, add the rateLimit option to your API configuration:

const config = {
  rateLimit: {
    requests: 100,
    window: '1m'
  }
}

This limits each API key to 100 requests per minute. You can also set per-endpoint limits.

Sources

- Rate Limiting Guide (docs/guides/rate-limiting.md): 97%
- API Configuration Reference (docs/api/config.md): 89%
- Security Best Practices (docs/guides/security.md): 82%
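
To show what a setting like `requests: 100, window: '1m'` means in practice, here is a sketch of a fixed-window limiter keyed by API key. The page does not say which algorithm DOS uses to enforce limits, so the fixed-window approach, the `FixedWindowLimiter` class, and the `parseWindowMs` helper are all assumptions for illustration.

```typescript
interface RateLimitConfig {
  requests: number; // allowed requests per window
  window: string; // for example "30s", "1m", "1h"
}

// Convert a window string such as "1m" into milliseconds.
function parseWindowMs(window: string): number {
  const match = /^(\d+)(s|m|h)$/.exec(window);
  if (!match) throw new Error(`Unsupported window: ${window}`);
  const [, amount = "0", unit = "s"] = match;
  const unitMs = unit === "h" ? 3600000 : unit === "m" ? 60000 : 1000;
  return Number(amount) * unitMs;
}

// Fixed-window counter: each API key gets `requests` calls per window.
class FixedWindowLimiter {
  private readonly maxRequests: number;
  private readonly windowMs: number;
  private readonly counters = new Map<string, { windowStart: number; count: number }>();

  constructor(config: RateLimitConfig) {
    this.maxRequests = config.requests;
    this.windowMs = parseWindowMs(config.window);
  }

  // Returns true if a request from `apiKey` is allowed at time `nowMs`.
  allow(apiKey: string, nowMs: number): boolean {
    let entry = this.counters.get(apiKey);
    if (!entry || nowMs - entry.windowStart >= this.windowMs) {
      // Start a fresh window for this key.
      entry = { windowStart: nowMs, count: 0 };
      this.counters.set(apiKey, entry);
    }
    if (entry.count >= this.maxRequests) return false;
    entry.count += 1;
    return true;
  }
}

// 100 requests per minute per key, as in the configuration above.
const limiter = new FixedWindowLimiter({ requests: 100, window: "1m" });
const firstAllowed = limiter.allow("key-a", Date.now());
```

A fixed window is simple but allows short bursts at window boundaries. Sliding-window or token-bucket limiters smooth that out at the cost of a little more state.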

AI Agents

Deploy autonomous AI agents on any channel.

Launch personal assistants, sales bots, and customer service agents on Telegram, WhatsApp, and more - with built-in memory, skills, and tool use.

Sales Assistant (Running)

Active on 3 channels
Channels: Web Chat, Zalo

- Conversations: 1,247 (+12%)
- Resolved: 94.2% (+3%)
- Avg Response: 1.2s (-0.3s)

Recent Conversations

- Minh T. (2m ago, Active): I want to order 50 units of SKU-4421
- Sarah L. (5m ago, Resolved): Do you ship internationally?
- Nguyen H. (8m ago, Resolved): Can I get a bulk discount?

Skills: Product Catalog, Order Tracking, Discount Rules, FAQ

Get started today

Start building with DOS in minutes. No credit card required. Get $10 in free credits to explore our API.

Start building for free

Loved by developers worldwide.

Teams of all sizes trust DOS to power their AI applications. Here’s what they have to say.

- DOS has completely transformed how we build AI features. What used to take weeks now takes hours. The inference speed is incredible.
  Sarah Chen
  CTO at TechFlow
- We switched from OpenAI to DOS and cut our AI costs by 60%. The API is drop-in compatible, so migration was seamless.
  Michael Torres
  Lead Engineer at DataPipe
- Having every top model behind one API key is a game-changer. We route simple calls to a low-cost model and hard ones to frontier models, all with a single line of code.
  James Wilson
  Founder of LegalAI
- DOS support team is exceptional. They helped us optimize our prompts and reduced latency by 40%. Best developer experience in the industry.
  Emily Zhang
  VP Engineering at Nexus
- We built our entire RAG pipeline on DOS. The embeddings API is fast, accurate, and the pricing makes it viable at scale.
  David Park
  Founder of SearchBot
- Enterprise-grade reliability with startup-friendly pricing. DOS powers our chatbot serving 100k+ users daily without breaking a sweat.
  Lisa Wang
  Head of AI at CloudScale

Frequently asked questions

Can’t find what you’re looking for? Reach out to our support team at support@dos.ai and we’ll get back to you within 24 hours.

- What models does DOS support?
  DOS provides access to leading open-source and proprietary models including Llama 4, DeepSeek, Qwen, Gemma, Mistral, and more. Use dos-auto to let our smart router pick the best model for each request automatically.
- How does pricing work?
  We use simple pay-as-you-go pricing based on tokens processed. No hidden fees, no minimum commitments. You only pay for what you use, and we offer volume discounts for high-usage customers.
- Is there a free tier?
  Yes! New users get $10 in free credits to explore our API. This is enough to make thousands of API calls and test our platform thoroughly before committing.
- How fast is the inference?
  Our optimized infrastructure delivers industry-leading inference speeds. Most requests complete in under 100ms for the first token, with streaming support for real-time applications.
- Which AI models can I access?
  You get a wide range of models through one API - our own low-cost DOS.AI model plus frontier models from OpenAI, Anthropic, Google, DeepSeek, Qwen, and more. Use dos-auto to automatically route each request to the best model for the job.
- Is my data secure?
  Security is our top priority. We are SOC2 Type II certified, GDPR compliant, and encrypt all data in transit and at rest. We never train on your data or share it with third parties.
- Do you offer enterprise plans?
  Yes, we offer enterprise plans with dedicated resources, custom SLAs, priority support, and volume discounts. Contact our sales team to discuss your requirements.
- What SDKs do you provide?
  We offer official SDKs for Python, Node.js, Go, and Rust. Our REST API is also compatible with OpenAI client libraries, making migration seamless.
- How do I get support?
  All users have access to our documentation and community Discord. Paid plans include email support, and enterprise customers get dedicated Slack channels and account managers.
