API Documentation

AI Router is fully compatible with the OpenAI API format. Use any existing SDK or HTTP client — just change the base URL and API key.

Quick Start

Get up and running with AI Router in under 2 minutes. If you already use the OpenAI SDK, you only need to change two parameters.

Python
from openai import OpenAI

client = OpenAI(
    base_url="https://api.airouter.kz/api/v1",
    api_key="air_live_your_key_here"
)

response = client.chat.completions.create(
    model="openai/gpt-5.4",
    messages=[
        {"role": "user", "content": "Summarize the OpenAI Chat Completions schema in one sentence."}
    ]
)

print(response.choices[0].message.content)
JavaScript / TypeScript
import OpenAI from "openai";

const client = new OpenAI({
    baseURL: "https://api.airouter.kz/api/v1",
    apiKey: "air_live_your_key_here"
});

const response = await client.chat.completions.create({
    model: "anthropic/claude-opus-4-7",
    messages: [
        { role: "user", content: "Summarize the OpenAI Chat Completions schema in one sentence." }
    ]
});

console.log(response.choices[0].message.content);
cURL
curl https://api.airouter.kz/api/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer air_live_your_key_here" \
  -d '{
    "model": "google/gemini-3-pro",
    "messages": [
      {"role": "user", "content": "Summarize the OpenAI Chat Completions schema in one sentence."}
    ]
  }'

Authentication

All API requests require an API key passed in the Authorization header as a Bearer token.

API key format

AI Router API keys use the prefix air_live_ followed by 43 base64url characters. Example:

air_live_abc123def456ghi789jkl012mno345pqr678stu
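If you want to catch obviously malformed keys before making a request, the documented shape can be checked client-side. A minimal sketch (`KEY_RE` and `is_valid_key` are illustrative names, not part of any SDK):

```python
import re

# Documented key shape: "air_live_" followed by 43 base64url
# characters (A-Z, a-z, 0-9, "-", "_").
KEY_RE = re.compile(r"^air_live_[A-Za-z0-9_-]{43}$")

def is_valid_key(key: str) -> bool:
    """Check that an API key matches the documented format."""
    return KEY_RE.fullmatch(key) is not None

is_valid_key("air_live_" + "a" * 43)  # → True (hypothetical well-formed key)
is_valid_key("sk-wrong-prefix")       # → False
```

This only validates the shape; whether a key is actually active is determined by the server on first use.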

Header format

Authorization: Bearer air_live_your_key_here

Generate and manage API keys from your dashboard or via the API key management endpoints.
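If you are not using an SDK, the header can be set directly with the standard library. A sketch that builds (but does not send) an authenticated chat-completions request; `build_chat_request` is an illustrative helper:

```python
import json
import urllib.request

def build_chat_request(api_key: str, model: str, messages: list) -> urllib.request.Request:
    """Build an authenticated chat-completions request (not yet sent)."""
    body = json.dumps({"model": model, "messages": messages}).encode("utf-8")
    return urllib.request.Request(
        "https://api.airouter.kz/api/v1/chat/completions",
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            # The API key rides in the Authorization header as a Bearer token.
            "Authorization": f"Bearer {api_key}",
        },
    )

req = build_chat_request(
    "air_live_your_key_here",
    "openai/gpt-5.4",
    [{"role": "user", "content": "Hello"}],
)
# urllib.request.urlopen(req) would send it with a real key.
```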

Base URL

https://api.airouter.kz/api/v1

All endpoints are relative to this base URL. The API is fully compatible with OpenRouter and OpenAI endpoint paths.

SDKs & Libraries

AI Router works with any SDK that supports custom base URLs. No special SDK needed — use what you already have.

OpenAI Python SDK

pip install openai

Set base_url to https://api.airouter.kz/api/v1

OpenAI Node.js SDK

npm install openai

Set baseURL to https://api.airouter.kz/api/v1

LangChain

pip install langchain-openai

Use ChatOpenAI with the openai_api_base parameter

LlamaIndex

pip install llama-index-llms-openai

Set api_base on the OpenAI LLM class

Endpoints

AI Router implements the same endpoint paths as OpenRouter and OpenAI. All request and response schemas are identical.

Method   Path                            Description
POST     /api/v1/chat/completions        Create a chat completion
POST     /api/v1/images/generations      Generate images
POST     /api/v1/audio/speech            Text-to-speech synthesis
POST     /api/v1/audio/transcriptions    Speech-to-text transcription
GET      /api/v1/models                  List available models
GET      /api/v1/generation?id=          Get generation details
GET      /api/v1/credits                 Check credit balance
GET      /api/v1/keys                    List API keys
POST     /api/v1/keys                    Create a new API key
PATCH    /api/v1/keys/:id                Update an API key
DELETE   /api/v1/keys/:id                Delete an API key

POST /api/v1/chat/completions

Create a chat completion. This is the primary endpoint for interacting with any AI model through AI Router. The request and response format is identical to the OpenAI Chat Completions API.

Request body
{
  "model": "openai/gpt-5.4",
  "messages": [
    {
      "role": "system",
      "content": "You are a helpful assistant."
    },
    {
      "role": "user",
      "content": "Explain quantum computing in simple terms."
    }
  ],
  "temperature": 0.7,
  "max_tokens": 1000,
  "stream": false
}
Response
{
  "id": "gen-abc123",
  "object": "chat.completion",
  "created": 1713200000,
  "model": "openai/gpt-5.4",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Quantum computing uses quantum mechanics..."
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 25,
    "completion_tokens": 150,
    "total_tokens": 175
  }
}

Model ID format

Model IDs follow the provider/model-name format. Examples:

  • openai/gpt-5.4
  • anthropic/claude-opus-4-7
  • google/gemini-3-pro
  • deepseek/deepseek-v3.2
  • mistral/mistral-large-3
  • xai/grok-4
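If your application needs to route or log by provider, the ID splits on the first slash. A small illustrative helper (`split_model_id` is not an SDK function):

```python
def split_model_id(model_id: str) -> tuple[str, str]:
    """Split a provider/model-name ID into (provider, model_name)."""
    provider, sep, name = model_id.partition("/")
    if not sep or not provider or not name:
        raise ValueError(f"expected provider/model-name, got {model_id!r}")
    return provider, name

split_model_id("anthropic/claude-opus-4-7")  # → ("anthropic", "claude-opus-4-7")
```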

Streaming

Set "stream": true to receive Server-Sent Events (SSE). The response format follows the OpenAI streaming specification with data: [DONE] as the terminal event.
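With the official SDKs you simply iterate the returned stream object, but if you consume the SSE feed yourself, each event line carries a JSON chunk with the delta under `choices[0].delta`. A parsing sketch, assuming that standard OpenAI chunk shape:

```python
import json

def iter_stream_content(lines):
    """Yield content deltas from OpenAI-style SSE lines, stopping at [DONE]."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data: "):
            continue  # skip blank keep-alives and comment lines
        payload = line[len("data: "):]
        if payload == "[DONE]":
            return  # terminal event
        chunk = json.loads(payload)
        delta = chunk["choices"][0]["delta"]
        if delta.get("content"):
            yield delta["content"]

# Hypothetical event lines as they arrive over the wire:
sample = [
    'data: {"choices": [{"delta": {"role": "assistant"}}]}',
    'data: {"choices": [{"delta": {"content": "Hel"}}]}',
    'data: {"choices": [{"delta": {"content": "lo"}}]}',
    "data: [DONE]",
]
"".join(iter_stream_content(sample))  # → "Hello"
```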

POST /api/v1/images/generations

Generate images using GPT-Image-1, Imagen 4, FLUX 1.1 Pro and other frontier image models. The request format follows the OpenAI Images API specification.

cURL
curl https://api.airouter.kz/api/v1/images/generations \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer air_live_your_key_here" \
  -d '{
    "model": "openai/gpt-image-1",
    "prompt": "a white siamese cat",
    "n": 1,
    "size": "1024x1024"
  }'
Response
{
  "created": 1713200000,
  "data": [
    {
      "url": "https://...",
      "revised_prompt": "A white Siamese cat with blue eyes..."
    }
  ]
}

Supported sizes

  • 1024x1024
  • 1792x1024
  • 1024x1792

POST /api/v1/audio/speech

Convert text to natural-sounding speech. Returns raw audio bytes with Content-Type: audio/mpeg by default.

cURL
curl https://api.airouter.kz/api/v1/audio/speech \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer air_live_your_key_here" \
  -d '{
    "model": "openai/gpt-4o-mini-tts",
    "input": "Hello, welcome to AI Router!",
    "voice": "alloy"
  }' \
  --output speech.mp3

Available voices

  • alloy
  • echo
  • fable
  • onyx
  • nova
  • shimmer

Response format

The response is raw audio bytes (not JSON). The default content type is audio/mpeg. You can request other formats by setting response_format to opus, aac, or flac.

POST /api/v1/audio/transcriptions

Transcribe audio files to text using GPT-4o Transcribe and other speech-to-text models. Accepts multipart/form-data requests.

cURL
curl https://api.airouter.kz/api/v1/audio/transcriptions \
  -H "Authorization: Bearer air_live_your_key_here" \
  -F file="@audio.mp3" \
  -F model="openai/gpt-4o-transcribe"
Response
{
  "text": "Hello, welcome to AI Router!"
}

Supported audio formats

  • .mp3
  • .mp4
  • .mpeg
  • .mpga
  • .m4a
  • .wav
  • .webm

Maximum file size: 25 MB
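A quick pre-upload check saves a round trip for files the API would reject anyway. A sketch assuming the limit is binary megabytes; `check_audio_file` is an illustrative helper, not part of any SDK:

```python
import os

ALLOWED_EXTENSIONS = {".mp3", ".mp4", ".mpeg", ".mpga", ".m4a", ".wav", ".webm"}
MAX_BYTES = 25 * 1024 * 1024  # 25 MB, assuming binary megabytes

def check_audio_file(path: str) -> None:
    """Raise ValueError for files the transcription endpoint would reject."""
    ext = os.path.splitext(path)[1].lower()
    if ext not in ALLOWED_EXTENSIONS:
        raise ValueError(f"unsupported format: {ext or '(no extension)'}")
    if os.path.getsize(path) > MAX_BYTES:
        raise ValueError("file exceeds the 25 MB limit")
```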

GET /api/v1/models

List all available models. Returns model IDs, pricing, context window sizes, and capabilities. No authentication required.

cURL
curl https://api.airouter.kz/api/v1/models
Response (truncated)
{
  "object": "list",
  "data": [
    {
      "id": "openai/gpt-5.4",
      "object": "model",
      "created": 1713200000,
      "owned_by": "openai",
      "pricing": {
        "prompt": "0.0000025",
        "completion": "0.000020"
      },
      "context_length": 1048576,
      "top_provider": {
        "max_completion_tokens": 131072
      }
    }
  ]
}
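The `pricing` fields appear to be per-token USD rates encoded as strings (matching OpenRouter's schema), so an estimated cost is just tokens times rate; with the example rates above, 25 prompt and 150 completion tokens work out to about $0.003062, consistent with the `total_cost` reported by the generation endpoint. A sketch using `Decimal` to avoid float drift (`estimate_cost` is an illustrative helper):

```python
from decimal import Decimal

def estimate_cost(pricing: dict, prompt_tokens: int, completion_tokens: int) -> Decimal:
    """Estimate USD cost from per-token pricing strings and token counts."""
    return (Decimal(pricing["prompt"]) * prompt_tokens
            + Decimal(pricing["completion"]) * completion_tokens)

pricing = {"prompt": "0.0000025", "completion": "0.000020"}
estimate_cost(pricing, 25, 150)  # → Decimal('0.0030625')
```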

GET /api/v1/generation?id=

Retrieve details about a specific generation, including token counts, cost, latency, and which provider handled the request.

cURL
curl "https://api.airouter.kz/api/v1/generation?id=gen-abc123" \
  -H "Authorization: Bearer air_live_your_key_here"
Response
{
  "id": "gen-abc123",
  "model": "openai/gpt-5.4",
  "created_at": "2026-04-15T10:30:00Z",
  "tokens_prompt": 25,
  "tokens_completion": 150,
  "total_cost": 0.003062,
  "latency_ms": 1250,
  "provider": "openai",
  "status": "completed"
}

GET /api/v1/credits

Check your current credit balance. Returns the balance as a floating-point USD value for OpenRouter compatibility.

cURL
curl https://api.airouter.kz/api/v1/credits \
  -H "Authorization: Bearer air_live_your_key_here"
Response
{
  "data": {
    "total_credits": 100.00,
    "total_usage": 23.45,
    "remaining": 76.55
  }
}

API Key Management

Create, list, update, and delete API keys programmatically. Requires session authentication (dashboard login) or a management API key.

Create a new API key
curl -X POST https://api.airouter.kz/api/v1/keys \
  -H "Authorization: Bearer air_live_your_key_here" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "production-backend",
    "rate_limit": 100
  }'
List all keys
curl https://api.airouter.kz/api/v1/keys \
  -H "Authorization: Bearer air_live_your_key_here"
Delete a key
curl -X DELETE https://api.airouter.kz/api/v1/keys/key_id_here \
  -H "Authorization: Bearer air_live_your_key_here"

Self-Hosted Models

AI Router runs 20 popular open-weight models on our own GPU infrastructure with an optimized inference stack. These models use the same API as any other model — no special configuration needed.

Model ID format

Self-hosted model IDs use the airouter-cloud/ prefix. Examples:

  • airouter-cloud/llama-4-maverick
  • airouter-cloud/qwen-3-235b
  • airouter-cloud/deepseek-v3.2
  • airouter-cloud/gemma-3-27b
  • airouter-cloud/mistral-large-3
Python — using a self-hosted model
from openai import OpenAI

client = OpenAI(
    base_url="https://api.airouter.kz/api/v1",
    api_key="air_live_your_key_here"
)

response = client.chat.completions.create(
    model="airouter-cloud/llama-4-maverick",
    messages=[
        {"role": "user", "content": "Write a quicksort in Python"}
    ]
)

print(response.choices[0].message.content)

Data stays local

Your data never leaves our servers. No third-party routing for self-hosted models.

Dedicated GPUs

Models run on dedicated NVIDIA GPUs with guaranteed compute capacity.

Custom deployment

Need a specific model? We deploy any HuggingFace model within 24 hours.

See all available self-hosted models on the Self-Hosted Models page.

Error Handling

AI Router returns standard HTTP status codes and JSON error responses compatible with the OpenAI error format.

Status   Meaning
400      Bad request — malformed JSON or missing required fields
401      Unauthorized — invalid or missing API key
402      Payment required — insufficient credit balance
404      Not found — unknown model or endpoint
429      Rate limited — too many requests per second
500      Internal error — unexpected server failure
502      Provider error — upstream provider returned an error
503      Provider unavailable — upstream provider is down
Error response format
{
  "error": {
    "message": "Insufficient credits. Please add funds to your account.",
    "type": "insufficient_credits",
    "code": 402
  }
}
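In application code it is convenient to turn that envelope into a typed exception. A sketch built on the documented format; `AIRouterError` and `raise_for_response` are illustrative names, not SDK classes:

```python
import json

class AIRouterError(Exception):
    """Wraps the documented error envelope (illustrative, not an SDK class)."""
    def __init__(self, message: str, error_type, code: int):
        super().__init__(message)
        self.type = error_type
        self.code = code

def raise_for_response(status: int, body: str) -> None:
    """Raise AIRouterError for non-2xx responses; pass through on success."""
    if 200 <= status < 300:
        return
    try:
        err = json.loads(body).get("error", {})
    except json.JSONDecodeError:
        err = {}  # tolerate non-JSON error bodies
    raise AIRouterError(err.get("message", "unknown error"),
                        err.get("type"), err.get("code", status))
```

Catching `AIRouterError` and branching on `code` (for example, topping up on 402 or backing off on 429) keeps provider failures out of your main request path.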

Rate Limits

Rate limits are applied per API key. Default limits can be customized per key through the dashboard or API.

Default limits

  • 60 requests/minute per API key (configurable)
  • Rate limit headers included in every response: X-RateLimit-Limit, X-RateLimit-Remaining, X-RateLimit-Reset
  • 429 status returned when limit is exceeded, with Retry-After header
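A common client-side pattern is to honor `Retry-After` when the server sends it and otherwise fall back to exponential backoff with jitter. A sketch (`retry_delay` is an illustrative helper; the header names are the ones documented above):

```python
import random

def retry_delay(headers: dict, attempt: int, base: float = 1.0, cap: float = 30.0) -> float:
    """Seconds to wait before retrying a 429 response."""
    retry_after = headers.get("Retry-After")
    if retry_after is not None:
        return float(retry_after)  # server-specified wait wins
    # Exponential backoff with jitter, capped at `cap` seconds.
    return min(cap, base * 2 ** attempt) * random.uniform(0.5, 1.0)

retry_delay({"Retry-After": "7"}, 0)  # → 7.0
retry_delay({}, 3)                    # somewhere in [4.0, 8.0]
```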

Need higher limits? Open the contact form for custom rate limit configurations.

Ready to integrate?

Create your account and start making API calls in minutes.