API Documentation

AI Router is fully compatible with the OpenAI API format. Use any existing SDK or HTTP client — just change the base URL and API key.

Quick Start

Get up and running with AI Router in under 2 minutes. If you already use the OpenAI SDK, you only need to change two parameters.

Python
from openai import OpenAI

client = OpenAI(
    base_url="https://api.airouter.kz/api/v1",
    api_key="air_live_your_key_here"
)

response = client.chat.completions.create(
    model="openai/gpt-5.4",
    messages=[
        {"role": "user", "content": "Summarize the OpenAI Chat Completions schema in one sentence."}
    ]
)

print(response.choices[0].message.content)
JavaScript / TypeScript
import OpenAI from "openai";

const client = new OpenAI({
    baseURL: "https://api.airouter.kz/api/v1",
    apiKey: "air_live_your_key_here"
});

const response = await client.chat.completions.create({
    model: "anthropic/claude-opus-4-7",
    messages: [
        { role: "user", content: "Summarize the OpenAI Chat Completions schema in one sentence." }
    ]
});

console.log(response.choices[0].message.content);
cURL
curl https://api.airouter.kz/api/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer air_live_your_key_here" \
  -d '{
    "model": "google/gemini-3-pro",
    "messages": [
      {"role": "user", "content": "Summarize the OpenAI Chat Completions schema in one sentence."}
    ]
  }'

Authentication

All API requests require an API key passed in the Authorization header as a Bearer token.

API key format

AI Router API keys use the prefix air_live_ followed by 43 base64url characters. Example:

air_live_abc123def456ghi789jkl012mno345pqr678stu
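If you want to catch obviously malformed keys before making a request, the documented shape can be checked client-side. A minimal sketch (`KEY_RE` and `is_valid_key` are illustrative names, not part of any SDK):

```python
import re

# Documented key shape: "air_live_" followed by 43 base64url
# characters (A-Z, a-z, 0-9, "-", "_").
KEY_RE = re.compile(r"^air_live_[A-Za-z0-9_-]{43}$")

def is_valid_key(key: str) -> bool:
    """Check that an API key matches the documented format."""
    return KEY_RE.fullmatch(key) is not None

is_valid_key("air_live_" + "a" * 43)  # → True (hypothetical well-formed key)
is_valid_key("sk-wrong-prefix")       # → False
```

This only validates the shape; whether a key is actually active is determined by the server on first use.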

Header format

Authorization: Bearer air_live_your_key_here

Generate and manage API keys from your dashboard or via the API key management endpoints.
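If you are not using an SDK, the header can be set directly with the standard library. A sketch that builds (but does not send) an authenticated chat-completions request; `build_chat_request` is an illustrative helper:

```python
import json
import urllib.request

def build_chat_request(api_key: str, model: str, messages: list) -> urllib.request.Request:
    """Build an authenticated chat-completions request (not yet sent)."""
    body = json.dumps({"model": model, "messages": messages}).encode("utf-8")
    return urllib.request.Request(
        "https://api.airouter.kz/api/v1/chat/completions",
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            # The API key rides in the Authorization header as a Bearer token.
            "Authorization": f"Bearer {api_key}",
        },
    )

req = build_chat_request(
    "air_live_your_key_here",
    "openai/gpt-5.4",
    [{"role": "user", "content": "Hello"}],
)
# urllib.request.urlopen(req) would send it with a real key.
```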

Base URL

https://api.airouter.kz/api/v1

All endpoints are relative to this base URL. The API is fully compatible with OpenRouter and OpenAI endpoint paths.

SDKs & Libraries

AI Router works with any SDK that supports custom base URLs. No special SDK needed — use what you already have.

OpenAI Python SDK

pip install openai

Set base_url to https://api.airouter.kz/api/v1

OpenAI Node.js SDK

npm install openai

Set baseURL to https://api.airouter.kz/api/v1

LangChain

pip install langchain-openai

Use ChatOpenAI with the openai_api_base parameter

LlamaIndex

pip install llama-index-llms-openai

Set api_base on the OpenAI LLM class

Endpoints

AI Router implements the same endpoint paths as OpenRouter and OpenAI. All request and response schemas are identical.

Method   Path                            Description
POST     /api/v1/chat/completions        Create a chat completion
POST     /api/v1/images/generations      Generate images
POST     /api/v1/audio/speech            Text-to-speech synthesis
POST     /api/v1/audio/transcriptions    Speech-to-text transcription
GET      /api/v1/models                  List available models
GET      /api/v1/generation?id=          Get generation details
GET      /api/v1/credits                 Check credit balance
GET      /api/v1/keys                    List API keys
POST     /api/v1/keys                    Create a new API key
PATCH    /api/v1/keys/:id                Update an API key
DELETE   /api/v1/keys/:id                Delete an API key

POST /api/v1/chat/completions

Create a chat completion. This is the primary endpoint for interacting with any AI model through AI Router. The request and response format is identical to the OpenAI Chat Completions API.

Request body
{
  "model": "openai/gpt-5.4",
  "messages": [
    {
      "role": "system",
      "content": "You are a helpful assistant."
    },
    {
      "role": "user",
      "content": "Explain quantum computing in simple terms."
    }
  ],
  "temperature": 0.7,
  "max_tokens": 1000,
  "stream": false
}
Response
{
  "id": "gen-abc123",
  "object": "chat.completion",
  "created": 1713200000,
  "model": "openai/gpt-5.4",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Quantum computing uses quantum mechanics..."
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 25,
    "completion_tokens": 150,
    "total_tokens": 175
  }
}

Model ID format

Model IDs follow the provider/model-name format. Examples:

  • openai/gpt-5.4
  • anthropic/claude-opus-4-7
  • google/gemini-3-pro
  • deepseek/deepseek-v3.2
  • mistral/mistral-large-3
  • xai/grok-4
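If your application needs to route or log by provider, the ID splits on the first slash. A small illustrative helper (`split_model_id` is not an SDK function):

```python
def split_model_id(model_id: str) -> tuple[str, str]:
    """Split a provider/model-name ID into (provider, model_name)."""
    provider, sep, name = model_id.partition("/")
    if not sep or not provider or not name:
        raise ValueError(f"expected provider/model-name, got {model_id!r}")
    return provider, name

split_model_id("anthropic/claude-opus-4-7")  # → ("anthropic", "claude-opus-4-7")
```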

Streaming

Set "stream": true to receive Server-Sent Events (SSE). The response format follows the OpenAI streaming specification with data: [DONE] as the terminal event.
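With the official SDKs you simply iterate the returned stream object, but if you consume the SSE feed yourself, each event line carries a JSON chunk with the delta under `choices[0].delta`. A parsing sketch, assuming that standard OpenAI chunk shape:

```python
import json

def iter_stream_content(lines):
    """Yield content deltas from OpenAI-style SSE lines, stopping at [DONE]."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data: "):
            continue  # skip blank keep-alives and comment lines
        payload = line[len("data: "):]
        if payload == "[DONE]":
            return  # terminal event
        chunk = json.loads(payload)
        delta = chunk["choices"][0]["delta"]
        if delta.get("content"):
            yield delta["content"]

# Hypothetical event lines as they arrive over the wire:
sample = [
    'data: {"choices": [{"delta": {"role": "assistant"}}]}',
    'data: {"choices": [{"delta": {"content": "Hel"}}]}',
    'data: {"choices": [{"delta": {"content": "lo"}}]}',
    "data: [DONE]",
]
"".join(iter_stream_content(sample))  # → "Hello"
```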

POST /api/v1/images/generations

Generate images using GPT-Image-1, Imagen 4, FLUX 1.1 Pro and other frontier image models. The request format follows the OpenAI Images API specification.

cURL
curl https://api.airouter.kz/api/v1/images/generations \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer air_live_your_key_here" \
  -d '{
    "model": "openai/gpt-image-1",
    "prompt": "a white siamese cat",
    "n": 1,
    "size": "1024x1024"
  }'
Response
{
  "created": 1713200000,
  "data": [
    {
      "url": "https://...",
      "revised_prompt": "A white Siamese cat with blue eyes..."
    }
  ]
}

Supported sizes

  • 1024x1024
  • 1792x1024
  • 1024x1792

POST /api/v1/audio/speech

Convert text to natural-sounding speech. Returns raw audio bytes with Content-Type: audio/mpeg by default.

cURL
curl https://api.airouter.kz/api/v1/audio/speech \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer air_live_your_key_here" \
  -d '{
    "model": "openai/gpt-4o-mini-tts",
    "input": "Hello, welcome to AI Router!",
    "voice": "alloy"
  }' \
  --output speech.mp3

Available voices

  • alloy
  • echo
  • fable
  • onyx
  • nova
  • shimmer

Response format

The response is raw audio bytes (not JSON). The default content type is audio/mpeg. You can request other formats by setting response_format to opus, aac, or flac.

POST /api/v1/audio/transcriptions

Transcribe audio files to text using GPT-4o Transcribe and other speech-to-text models. Accepts multipart/form-data requests.

cURL
curl https://api.airouter.kz/api/v1/audio/transcriptions \
  -H "Authorization: Bearer air_live_your_key_here" \
  -F file="@audio.mp3" \
  -F model="openai/gpt-4o-transcribe"
Response
{
  "text": "Hello, welcome to AI Router!"
}

Supported audio formats

  • .mp3
  • .mp4
  • .mpeg
  • .mpga
  • .m4a
  • .wav
  • .webm

Maximum file size: 25 MB
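A quick pre-upload check saves a round trip for files the API would reject anyway. A sketch assuming the limit is binary megabytes; `check_audio_file` is an illustrative helper, not part of any SDK:

```python
import os

ALLOWED_EXTENSIONS = {".mp3", ".mp4", ".mpeg", ".mpga", ".m4a", ".wav", ".webm"}
MAX_BYTES = 25 * 1024 * 1024  # 25 MB, assuming binary megabytes

def check_audio_file(path: str) -> None:
    """Raise ValueError for files the transcription endpoint would reject."""
    ext = os.path.splitext(path)[1].lower()
    if ext not in ALLOWED_EXTENSIONS:
        raise ValueError(f"unsupported format: {ext or '(no extension)'}")
    if os.path.getsize(path) > MAX_BYTES:
        raise ValueError("file exceeds the 25 MB limit")
```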

GET /api/v1/models

List all available models. Returns model IDs, pricing, context window sizes, and capabilities. No authentication required.

cURL
curl https://api.airouter.kz/api/v1/models
Response (truncated)
{
  "object": "list",
  "data": [
    {
      "id": "openai/gpt-5.4",
      "object": "model",
      "created": 1713200000,
      "owned_by": "openai",
      "pricing": {
        "prompt": "0.0000025",
        "completion": "0.000020"
      },
      "context_length": 1048576,
      "top_provider": {
        "max_completion_tokens": 131072
      }
    }
  ]
}
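The `pricing` fields appear to be per-token USD rates encoded as strings (matching OpenRouter's schema), so an estimated cost is just tokens times rate; with the example rates above, 25 prompt and 150 completion tokens work out to about $0.003062, consistent with the `total_cost` reported by the generation endpoint. A sketch using `Decimal` to avoid float drift (`estimate_cost` is an illustrative helper):

```python
from decimal import Decimal

def estimate_cost(pricing: dict, prompt_tokens: int, completion_tokens: int) -> Decimal:
    """Estimate USD cost from per-token pricing strings and token counts."""
    return (Decimal(pricing["prompt"]) * prompt_tokens
            + Decimal(pricing["completion"]) * completion_tokens)

pricing = {"prompt": "0.0000025", "completion": "0.000020"}
estimate_cost(pricing, 25, 150)  # → Decimal('0.0030625')
```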

GET /api/v1/generation?id=

Retrieve details about a specific generation, including token counts, cost, latency, and which provider handled the request.

cURL
curl "https://api.airouter.kz/api/v1/generation?id=gen-abc123" \
  -H "Authorization: Bearer air_live_your_key_here"
Response
{
  "id": "gen-abc123",
  "model": "openai/gpt-5.4",
  "created_at": "2026-04-15T10:30:00Z",
  "tokens_prompt": 25,
  "tokens_completion": 150,
  "total_cost": 0.003062,
  "latency_ms": 1250,
  "provider": "openai",
  "status": "completed"
}

GET /api/v1/credits

Check your current credit balance. Returns the balance as a floating-point USD value for OpenRouter compatibility.

cURL
curl https://api.airouter.kz/api/v1/credits \
  -H "Authorization: Bearer air_live_your_key_here"
Response
{
  "data": {
    "total_credits": 100.00,
    "total_usage": 23.45,
    "remaining": 76.55
  }
}

API Key Management

Create, list, update, and delete API keys programmatically. Requires session authentication (dashboard login) or a management API key.

Create a new API key
curl -X POST https://api.airouter.kz/api/v1/keys \
  -H "Authorization: Bearer air_live_your_key_here" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "production-backend",
    "rate_limit": 100
  }'
List all keys
curl https://api.airouter.kz/api/v1/keys \
  -H "Authorization: Bearer air_live_your_key_here"
Delete a key
curl -X DELETE https://api.airouter.kz/api/v1/keys/key_id_here \
  -H "Authorization: Bearer air_live_your_key_here"

Self-Hosted Models

AI Router runs 20 popular open-weight models on our own GPU infrastructure with an optimized inference stack. These models use the same API as any other model — no special configuration needed.

Model ID format

Self-hosted model IDs use the airouter-cloud/ prefix. Examples:

  • airouter-cloud/llama-4-maverick
  • airouter-cloud/qwen-3-235b
  • airouter-cloud/deepseek-v3.2
  • airouter-cloud/gemma-3-27b
  • airouter-cloud/mistral-large-3
Python — using a self-hosted model
from openai import OpenAI

client = OpenAI(
    base_url="https://api.airouter.kz/api/v1",
    api_key="air_live_your_key_here"
)

response = client.chat.completions.create(
    model="airouter-cloud/llama-4-maverick",
    messages=[
        {"role": "user", "content": "Write a quicksort in Python"}
    ]
)

print(response.choices[0].message.content)

Data stays local

Your data never leaves our servers. No third-party routing for self-hosted models.

Dedicated GPUs

Models run on dedicated NVIDIA GPUs with guaranteed compute capacity.

Custom deployment

Need a specific model? We deploy any HuggingFace model within 24 hours.

See all available self-hosted models on the Self-Hosted Models page.

Error Handling

AI Router returns standard HTTP status codes and JSON error responses compatible with the OpenAI error format.

Status   Meaning
400      Bad request — malformed JSON or missing required fields
401      Unauthorized — invalid or missing API key
402      Payment required — insufficient credit balance
404      Not found — unknown model or endpoint
429      Rate limited — too many requests per second
500      Internal error — unexpected server failure
502      Provider error — upstream provider returned an error
503      Provider unavailable — upstream provider is down
Error response format
{
  "error": {
    "message": "Insufficient credits. Please add funds to your account.",
    "type": "insufficient_credits",
    "code": 402
  }
}
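In application code it is convenient to turn that envelope into a typed exception. A sketch built on the documented format; `AIRouterError` and `raise_for_response` are illustrative names, not SDK classes:

```python
import json

class AIRouterError(Exception):
    """Wraps the documented error envelope (illustrative, not an SDK class)."""
    def __init__(self, message: str, error_type, code: int):
        super().__init__(message)
        self.type = error_type
        self.code = code

def raise_for_response(status: int, body: str) -> None:
    """Raise AIRouterError for non-2xx responses; pass through on success."""
    if 200 <= status < 300:
        return
    try:
        err = json.loads(body).get("error", {})
    except json.JSONDecodeError:
        err = {}  # tolerate non-JSON error bodies
    raise AIRouterError(err.get("message", "unknown error"),
                        err.get("type"), err.get("code", status))
```

Catching `AIRouterError` and branching on `code` (for example, topping up on 402 or backing off on 429) keeps provider failures out of your main request path.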

Rate Limits

Rate limits are applied per API key. Default limits can be customized per key through the dashboard or API.

Default limits

  • 60 requests/minute per API key (configurable)
  • Rate limit headers included in every response: X-RateLimit-Limit, X-RateLimit-Remaining, X-RateLimit-Reset
  • 429 status returned when limit is exceeded, with Retry-After header
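A common client-side pattern is to honor `Retry-After` when the server sends it and otherwise fall back to exponential backoff with jitter. A sketch (`retry_delay` is an illustrative helper; the header names are the ones documented above):

```python
import random

def retry_delay(headers: dict, attempt: int, base: float = 1.0, cap: float = 30.0) -> float:
    """Seconds to wait before retrying a 429 response."""
    retry_after = headers.get("Retry-After")
    if retry_after is not None:
        return float(retry_after)  # server-specified wait wins
    # Exponential backoff with jitter, capped at `cap` seconds.
    return min(cap, base * 2 ** attempt) * random.uniform(0.5, 1.0)

retry_delay({"Retry-After": "7"}, 0)  # → 7.0
retry_delay({}, 3)                    # somewhere in [4.0, 8.0]
```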

Need higher limits? Open the contact form for custom rate limit configurations.

Ready to integrate?

Create your account and start making API calls in minutes.