API Reference
A complete OpenAI-compatible REST API for LLM chat, image generation, and more. All endpoints accept JSON and return JSON.
Quick Start
Make your first API call in seconds. Replace YOUR_API_KEY with the token from your account dashboard.
```bash
curl https://your-domain.com/api/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -d '{
    "model": "gpt-4o",
    "messages": [
      { "role": "user", "content": "Hello!" }
    ]
  }'
```

The base URL for all API calls is https://your-domain.com/api/v1. The API is fully compatible with the OpenAI SDK: just swap in this baseURL and your API key.
Authentication
Every request must include an Authorization header with a Bearer token. Obtain your token from the API Keys section of the console.
```bash
# Include your API key in the Authorization header
curl https://your-domain.com/api/v1/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{ ... }'
```

Keep your API key secret. Do not include it in client-side code or public repositories. Rotate it immediately if it is ever compromised.
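One common pattern is to load the key from an environment variable and mask it in anything you log. A minimal sketch (`PLATFORM_API_KEY` and `mask_key` are illustrative names, not part of the API):

```python
import os

def mask_key(key: str) -> str:
    """Show only the first and last 4 characters of a key when logging."""
    if len(key) <= 8:
        return "*" * len(key)
    return key[:4] + "..." + key[-4:]

# Load the key from the environment instead of hard-coding it.
api_key = os.environ.get("PLATFORM_API_KEY", "")
print("Using key:", mask_key(api_key))
```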
Gemini-Compatible Authentication
For the Gemini-compatible endpoint at /api/v1beta/models/..., pass your key in the x-goog-api-key header instead of Authorization: Bearer.
Chat Completions
POST /api/v1/chat/completions

Generate a model response for a conversation. Fully compatible with the OpenAI Chat Completions API. Supports streaming via Server-Sent Events when stream: true is set. Supported models include GPT-4o, Claude Sonnet/Opus, Gemini 2.5 Pro, and DeepSeek.
Request Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| model | string | Required | ID of the model to use (e.g. gpt-4o, claude-sonnet-4-20250514). |
| messages | array | Required | Array of message objects with role (system, user, or assistant) and content. |
| temperature | number | Optional | Sampling temperature between 0 and 2. Defaults to 1. |
| max_tokens | integer | Optional | Maximum tokens in the completion. |
| stream | boolean | Optional | If true, partial message deltas are sent as Server-Sent Events. Defaults to false. |
| top_p | number | Optional | Nucleus sampling parameter. Defaults to 1. |
| stop | string or array | Optional | Up to 4 sequences where the API will stop generating further tokens. |
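With stream: true the response arrives as Server-Sent Events, one `data:` line per chunk, terminated by a `data: [DONE]` sentinel. The official SDKs handle this for you; if you are reading the raw stream yourself, a minimal parser might look like the sketch below (the chunk shape shown follows the standard OpenAI streaming format):

```python
import json

def parse_sse_line(line: str):
    """Decode one SSE line from a streaming chat completion.

    Returns the chunk as a dict, or None for blank lines and the
    final '[DONE]' sentinel.
    """
    line = line.strip()
    if not line.startswith("data:"):
        return None
    payload = line[len("data:"):].strip()
    if payload == "[DONE]":
        return None
    return json.loads(payload)

# Each chunk carries an incremental delta of the assistant message:
chunk = parse_sse_line('data: {"choices":[{"delta":{"content":"Par"}}]}')
```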
curl
```bash
curl https://your-domain.com/api/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -d '{
    "model": "gpt-4o",
    "messages": [
      { "role": "system", "content": "You are a helpful assistant." },
      { "role": "user", "content": "What is the capital of France?" }
    ],
    "temperature": 0.7,
    "max_tokens": 512,
    "stream": false
  }'
```

Python (openai SDK)
```python
import openai

client = openai.OpenAI(
    api_key="YOUR_API_KEY",
    base_url="https://your-domain.com/api/v1",
)

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "What is the capital of France?"},
    ],
    temperature=0.7,
    max_tokens=512,
)
print(response.choices[0].message.content)
```

Node.js (openai SDK)
```javascript
import OpenAI from "openai";

const client = new OpenAI({
  apiKey: "YOUR_API_KEY",
  baseURL: "https://your-domain.com/api/v1",
});

const response = await client.chat.completions.create({
  model: "gpt-4o",
  messages: [
    { role: "system", content: "You are a helpful assistant." },
    { role: "user", content: "What is the capital of France?" },
  ],
  temperature: 0.7,
  max_tokens: 512,
});
console.log(response.choices[0].message.content);
```

Response
```json
{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "created": 1709150000,
  "model": "gpt-4o",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "The capital of France is Paris."
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 24,
    "completion_tokens": 10,
    "total_tokens": 34
  }
}
```

Image Generation
POST /api/v1/images/generations

Generate images from a text prompt. OpenAI-compatible image generation endpoint. Supports DALL-E 3, DALL-E 2, and Stable Diffusion models. Returns image URLs or base64-encoded PNG data.
Request Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| model | string | Required | Image model to use (e.g. dall-e-3, dall-e-2). |
| prompt | string | Required | Text description of the desired image. Max 4000 characters. |
| size | string | Optional | Image dimensions. Options: 256x256, 512x512, 1024x1024, 1792x1024, 1024x1792. Defaults to 1024x1024. |
| quality | string | Optional | Image quality: standard or hd. Defaults to standard. (DALL-E 3 only) |
| n | integer | Optional | Number of images to generate (1–4). Defaults to 1. |
| response_format | string | Optional | Format of the response: url or b64_json. Defaults to url. |
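When response_format is b64_json, each entry in data carries a b64_json field instead of url. Decoding it to a PNG file takes only the standard library; a sketch (`save_b64_image` is an illustrative helper, not part of any SDK):

```python
import base64

def save_b64_image(b64_data: str, path: str) -> int:
    """Decode a b64_json payload and write it to disk; returns the byte count."""
    raw = base64.b64decode(b64_data)
    with open(path, "wb") as f:
        f.write(raw)
    return len(raw)
```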
curl
```bash
curl https://your-domain.com/api/v1/images/generations \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -d '{
    "model": "dall-e-3",
    "prompt": "A photorealistic cat sitting on a rooftop at sunset",
    "size": "1024x1024",
    "quality": "standard",
    "n": 1
  }'
```

Python (openai SDK)
```python
import openai

client = openai.OpenAI(
    api_key="YOUR_API_KEY",
    base_url="https://your-domain.com/api/v1",
)

response = client.images.generate(
    model="dall-e-3",
    prompt="A photorealistic cat sitting on a rooftop at sunset",
    size="1024x1024",
    quality="standard",
    n=1,
)
print(response.data[0].url)
```

Node.js (openai SDK)
```javascript
import OpenAI from "openai";

const client = new OpenAI({
  apiKey: "YOUR_API_KEY",
  baseURL: "https://your-domain.com/api/v1",
});

const response = await client.images.generate({
  model: "dall-e-3",
  prompt: "A photorealistic cat sitting on a rooftop at sunset",
  size: "1024x1024",
  quality: "standard",
  n: 1,
});
console.log(response.data[0].url);
```

Response
```json
{
  "created": 1709150000,
  "data": [
    {
      "url": "https://cdn.example.com/images/generated-abc123.png",
      "revised_prompt": "A highly detailed photorealistic cat..."
    }
  ]
}
```

List Models
GET /api/v1/models

List all available models. Returns a list of all models available on the platform. Compatible with the OpenAI client.models.list() call.
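The list response (shown below) uses the standard OpenAI shape, so it is easy to filter client-side, for example by provider (`models_by_owner` is an illustrative helper, not part of any SDK):

```python
def models_by_owner(models_response: dict, owner: str) -> list:
    """Return the IDs of all models whose owned_by field matches owner."""
    return [m["id"] for m in models_response["data"] if m["owned_by"] == owner]
```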
curl
```bash
curl https://your-domain.com/api/v1/models \
  -H "Authorization: Bearer YOUR_API_KEY"
```

Response
```json
{
  "object": "list",
  "data": [
    { "id": "gpt-4o", "object": "model", "owned_by": "openai" },
    { "id": "gpt-4o-mini", "object": "model", "owned_by": "openai" },
    { "id": "claude-sonnet-4-20250514", "object": "model", "owned_by": "anthropic" },
    { "id": "gemini-2.5-pro", "object": "model", "owned_by": "google" },
    { "id": "gemini-2.5-flash", "object": "model", "owned_by": "google" },
    { "id": "dall-e-3", "object": "model", "owned_by": "openai" }
  ]
}
```

Gemini-Compatible API
POST /api/v1beta/models/{model}:generateContent

Google Gemini-compatible generateContent endpoint. A drop-in replacement for the Google Gemini REST API. Pass your platform API key via the x-goog-api-key header. Replace {model} with the model ID (e.g. gemini-2.5-pro, gemini-2.5-flash).
This endpoint accepts the native Gemini contents format and returns a Gemini-shaped response so existing Gemini SDK code works without modification.
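In Python, the same call can be assembled with any HTTP client; the helper below just builds the URL, headers, and body from the pieces described above (`gemini_request` is an illustrative name, not part of any SDK):

```python
def gemini_request(model: str, text: str, api_key: str):
    """Build (url, headers, body) for a generateContent call."""
    url = f"https://your-domain.com/api/v1beta/models/{model}:generateContent"
    headers = {
        "Content-Type": "application/json",
        "x-goog-api-key": api_key,  # not Authorization: Bearer on this endpoint
    }
    body = {"contents": [{"parts": [{"text": text}]}]}
    return url, headers, body

# Send it with e.g. requests.post(url, headers=headers, json=body)
url, headers, body = gemini_request(
    "gemini-2.5-pro", "Explain quantum entanglement simply.", "YOUR_API_KEY"
)
```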
curl
```bash
curl "https://your-domain.com/api/v1beta/models/gemini-2.5-pro:generateContent" \
  -H "Content-Type: application/json" \
  -H "x-goog-api-key: YOUR_API_KEY" \
  -d '{
    "contents": [
      {
        "parts": [{ "text": "Explain quantum entanglement simply." }]
      }
    ]
  }'
```

Response
```json
{
  "candidates": [
    {
      "content": {
        "parts": [
          { "text": "Quantum entanglement is a phenomenon..." }
        ],
        "role": "model"
      },
      "finishReason": "STOP"
    }
  ],
  "usageMetadata": {
    "promptTokenCount": 8,
    "candidatesTokenCount": 120,
    "totalTokenCount": 128
  }
}
```

Rate Limits
Rate limits are applied per API key on a per-minute (RPM) basis. When a limit is exceeded the API returns a 429 Too Many Requests response.
| Plan | RPM (Chat) | RPM (Images) | Notes |
|---|---|---|---|
| Free | 10 | 5 | For evaluation only |
| Standard | 60 | 20 | Default for paying accounts |
| Pro | 300 | 60 | Higher throughput |
| Enterprise | Custom | Custom | Contact sales |
Rate-limit headers
Responses include the following headers to help you track your usage:
| Header | Description |
|---|---|
| X-RateLimit-Limit-Requests | Maximum requests allowed in the window |
| X-RateLimit-Remaining-Requests | Requests remaining in the current window |
| X-RateLimit-Reset-Requests | UTC epoch time when the window resets |

Error Codes
All errors follow the OpenAI error shape: { error: { message, type, code } }.
| HTTP Code | Error Type | Description |
|---|---|---|
| 400 | invalid_request_error | Malformed request body or missing required parameters. |
| 401 | authentication_error | Missing or invalid API key. Check the Authorization header. |
| 403 | permission_error | API access is disabled for your account or the requested model is not available on your plan. |
| 404 | not_found_error | The requested resource or model does not exist. |
| 429 | rate_limit_error | You have exceeded your rate limit. Slow down and retry after the window resets. |
| 500 | internal_server_error | An unexpected error occurred on the server. Retry with exponential back-off. |
| 503 | service_unavailable_error | The upstream model provider is temporarily unavailable. |
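For the retryable statuses (429, 500, 503), a simple full-jitter exponential back-off is usually enough. A sketch, where `send` is a hypothetical callable that performs one request and returns (status_code, body):

```python
import random
import time

RETRYABLE = {429, 500, 503}

def backoff_delay(attempt: int, base: float = 0.5, cap: float = 30.0) -> float:
    """Full-jitter delay: uniform over [0, min(cap, base * 2**attempt)]."""
    return random.uniform(0, min(cap, base * (2 ** attempt)))

def send_with_retry(send, max_attempts: int = 5):
    """Retry send() on retryable HTTP statuses, sleeping between attempts."""
    for attempt in range(max_attempts):
        status, body = send()
        if status not in RETRYABLE:
            return status, body
        time.sleep(backoff_delay(attempt))
    return status, body  # last attempt's result, even if still failing
```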
Example error response
```json
{
  "error": {
    "message": "Invalid API key provided.",
    "type": "authentication_error",
    "code": "invalid_api_key"
  }
}
```