API Reference
A complete OpenAI-compatible REST API for LLM chat, image generation, and more. All endpoints accept JSON and return JSON.
Quick Start
Make your first API call in seconds. Replace YOUR_API_KEY with the token from your account dashboard.
```bash
curl https://your-domain.com/api/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -d '{
    "model": "gpt-4o",
    "messages": [
      { "role": "user", "content": "Hello!" }
    ]
  }'
```

The base URL for all API calls is https://your-domain.com/api/v1. The API is fully compatible with the OpenAI SDK: just swap in this baseURL and your API key.
Authentication
Every request must include an Authorization header with a Bearer token. Obtain your token from the API Keys section of the console.
```bash
# Include your API key in the Authorization header
curl https://your-domain.com/api/v1/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{ ... }'
```

Keep your API key secret. Do not include it in client-side code or public repositories. Rotate it immediately if it is ever compromised.
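One common pattern is to load the key from an environment variable and mask it in anything you log. A minimal sketch (`PLATFORM_API_KEY` and `mask_key` are illustrative names, not part of the API):

```python
import os

def mask_key(key: str) -> str:
    """Show only the first and last 4 characters of a key when logging."""
    if len(key) <= 8:
        return "*" * len(key)
    return key[:4] + "..." + key[-4:]

# Load the key from the environment instead of hard-coding it.
api_key = os.environ.get("PLATFORM_API_KEY", "")
print("Using key:", mask_key(api_key))
```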
Gemini-Compatible Authentication
For the Gemini-compatible endpoint at /api/v1beta/models/..., pass your key in the x-goog-api-key header instead of Authorization: Bearer.
Chat Completions
POST /api/v1/chat/completions

Generate a model response for a conversation. Fully compatible with the OpenAI Chat Completions API. Supports streaming via Server-Sent Events when stream: true is set. Supported models include GPT-4o, Claude Sonnet/Opus, Gemini 2.5 Pro, and DeepSeek.
Request Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| model | string | Required | ID of the model to use (e.g. gpt-4o, claude-sonnet-4-20250514). |
| messages | array | Required | Array of message objects with role (system, user, or assistant) and content. |
| temperature | number | Optional | Sampling temperature between 0 and 2. Defaults to 1. |
| max_tokens | integer | Optional | Maximum tokens in the completion. |
| stream | boolean | Optional | If true, partial message deltas are sent as Server-Sent Events. Defaults to false. |
| top_p | number | Optional | Nucleus sampling parameter. Defaults to 1. |
| stop | string or array | Optional | Up to 4 sequences where the API will stop generating further tokens. |
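With stream: true the response arrives as Server-Sent Events, one `data:` line per chunk, terminated by a `data: [DONE]` sentinel. The official SDKs handle this for you; if you are reading the raw stream yourself, a minimal parser might look like the sketch below (the chunk shape shown follows the standard OpenAI streaming format):

```python
import json

def parse_sse_line(line: str):
    """Decode one SSE line from a streaming chat completion.

    Returns the chunk as a dict, or None for blank lines and the
    final '[DONE]' sentinel.
    """
    line = line.strip()
    if not line.startswith("data:"):
        return None
    payload = line[len("data:"):].strip()
    if payload == "[DONE]":
        return None
    return json.loads(payload)

# Each chunk carries an incremental delta of the assistant message:
chunk = parse_sse_line('data: {"choices":[{"delta":{"content":"Par"}}]}')
```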
curl
```bash
curl https://your-domain.com/api/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -d '{
    "model": "gpt-4o",
    "messages": [
      { "role": "system", "content": "You are a helpful assistant." },
      { "role": "user", "content": "What is the capital of France?" }
    ],
    "temperature": 0.7,
    "max_tokens": 512,
    "stream": false
  }'
```

Python (openai SDK)
```python
import openai

client = openai.OpenAI(
    api_key="YOUR_API_KEY",
    base_url="https://your-domain.com/api/v1",
)

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "What is the capital of France?"},
    ],
    temperature=0.7,
    max_tokens=512,
)
print(response.choices[0].message.content)
```

Node.js (openai SDK)
```javascript
import OpenAI from "openai";

const client = new OpenAI({
  apiKey: "YOUR_API_KEY",
  baseURL: "https://your-domain.com/api/v1",
});

const response = await client.chat.completions.create({
  model: "gpt-4o",
  messages: [
    { role: "system", content: "You are a helpful assistant." },
    { role: "user", content: "What is the capital of France?" },
  ],
  temperature: 0.7,
  max_tokens: 512,
});
console.log(response.choices[0].message.content);
```

Response
```json
{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "created": 1709150000,
  "model": "gpt-4o",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "The capital of France is Paris."
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 24,
    "completion_tokens": 10,
    "total_tokens": 34
  }
}
```

Image Generation
POST /api/v1/images/generations

Generate images from a text prompt. OpenAI-compatible image generation endpoint. Supports DALL-E 3, DALL-E 2, and Stable Diffusion models. Returns image URLs or base64-encoded PNG data.
Request Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| model | string | Required | Image model to use (e.g. dall-e-3, dall-e-2). |
| prompt | string | Required | Text description of the desired image. Max 4000 characters. |
| size | string | Optional | Image dimensions. Options: 256x256, 512x512, 1024x1024, 1792x1024, 1024x1792. Defaults to 1024x1024. |
| quality | string | Optional | Image quality: standard or hd. Defaults to standard. (DALL-E 3 only) |
| n | integer | Optional | Number of images to generate (1–4). Defaults to 1. |
| response_format | string | Optional | Format of the response: url or b64_json. Defaults to url. |
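When response_format is b64_json, each entry in data carries a b64_json field instead of url. Decoding it to a PNG file takes only the standard library; a sketch (`save_b64_image` is an illustrative helper, not part of any SDK):

```python
import base64

def save_b64_image(b64_data: str, path: str) -> int:
    """Decode a b64_json payload and write it to disk; returns the byte count."""
    raw = base64.b64decode(b64_data)
    with open(path, "wb") as f:
        f.write(raw)
    return len(raw)
```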
curl
```bash
curl https://your-domain.com/api/v1/images/generations \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -d '{
    "model": "dall-e-3",
    "prompt": "A photorealistic cat sitting on a rooftop at sunset",
    "size": "1024x1024",
    "quality": "standard",
    "n": 1
  }'
```

Python (openai SDK)
```python
import openai

client = openai.OpenAI(
    api_key="YOUR_API_KEY",
    base_url="https://your-domain.com/api/v1",
)

response = client.images.generate(
    model="dall-e-3",
    prompt="A photorealistic cat sitting on a rooftop at sunset",
    size="1024x1024",
    quality="standard",
    n=1,
)
print(response.data[0].url)
```

Node.js (openai SDK)
```javascript
import OpenAI from "openai";

const client = new OpenAI({
  apiKey: "YOUR_API_KEY",
  baseURL: "https://your-domain.com/api/v1",
});

const response = await client.images.generate({
  model: "dall-e-3",
  prompt: "A photorealistic cat sitting on a rooftop at sunset",
  size: "1024x1024",
  quality: "standard",
  n: 1,
});
console.log(response.data[0].url);
```

Response
```json
{
  "created": 1709150000,
  "data": [
    {
      "url": "https://cdn.example.com/images/generated-abc123.png",
      "revised_prompt": "A highly detailed photorealistic cat..."
    }
  ]
}
```

List Models
GET /api/v1/models

List all available models. Returns a list of all models available on the platform. Compatible with the OpenAI client.models.list() call.
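The list response (shown below) uses the standard OpenAI shape, so it is easy to filter client-side, for example by provider (`models_by_owner` is an illustrative helper, not part of any SDK):

```python
def models_by_owner(models_response: dict, owner: str) -> list:
    """Return the IDs of all models whose owned_by field matches owner."""
    return [m["id"] for m in models_response["data"] if m["owned_by"] == owner]
```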
curl
```bash
curl https://your-domain.com/api/v1/models \
  -H "Authorization: Bearer YOUR_API_KEY"
```

Response
```json
{
  "object": "list",
  "data": [
    { "id": "gpt-4o", "object": "model", "owned_by": "openai" },
    { "id": "gpt-4o-mini", "object": "model", "owned_by": "openai" },
    { "id": "claude-sonnet-4-20250514", "object": "model", "owned_by": "anthropic" },
    { "id": "gemini-2.5-pro", "object": "model", "owned_by": "google" },
    { "id": "gemini-2.5-flash", "object": "model", "owned_by": "google" },
    { "id": "dall-e-3", "object": "model", "owned_by": "openai" }
  ]
}
```

Gemini-Compatible API
POST /api/v1beta/models/{model}:generateContent

Google Gemini-compatible generateContent endpoint. A drop-in replacement for the Google Gemini REST API. Pass your platform API key via the x-goog-api-key header. Replace {model} with the model ID (e.g. gemini-2.5-pro, gemini-2.5-flash).
This endpoint accepts the native Gemini contents format and returns a Gemini-shaped response so existing Gemini SDK code works without modification.
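In Python, the same call can be assembled with any HTTP client; the helper below just builds the URL, headers, and body from the pieces described above (`gemini_request` is an illustrative name, not part of any SDK):

```python
def gemini_request(model: str, text: str, api_key: str):
    """Build (url, headers, body) for a generateContent call."""
    url = f"https://your-domain.com/api/v1beta/models/{model}:generateContent"
    headers = {
        "Content-Type": "application/json",
        "x-goog-api-key": api_key,  # not Authorization: Bearer on this endpoint
    }
    body = {"contents": [{"parts": [{"text": text}]}]}
    return url, headers, body

# Send it with e.g. requests.post(url, headers=headers, json=body)
url, headers, body = gemini_request(
    "gemini-2.5-pro", "Explain quantum entanglement simply.", "YOUR_API_KEY"
)
```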
curl
```bash
curl "https://your-domain.com/api/v1beta/models/gemini-2.5-pro:generateContent" \
  -H "Content-Type: application/json" \
  -H "x-goog-api-key: YOUR_API_KEY" \
  -d '{
    "contents": [
      {
        "parts": [{ "text": "Explain quantum entanglement simply." }]
      }
    ]
  }'
```

Response
```json
{
  "candidates": [
    {
      "content": {
        "parts": [
          { "text": "Quantum entanglement is a phenomenon..." }
        ],
        "role": "model"
      },
      "finishReason": "STOP"
    }
  ],
  "usageMetadata": {
    "promptTokenCount": 8,
    "candidatesTokenCount": 120,
    "totalTokenCount": 128
  }
}
```

Rate Limits
Rate limits are applied per API key on a per-minute (RPM) basis. When a limit is exceeded the API returns a 429 Too Many Requests response.
| Plan | RPM (Chat) | RPM (Images) | Notes |
|---|---|---|---|
| Free | 10 | 5 | For evaluation only |
| Standard | 60 | 20 | Default for paying accounts |
| Pro | 300 | 60 | Higher throughput |
| Enterprise | Custom | Custom | Contact sales |
Rate-limit headers
Responses include the following headers to help you track your usage:
| Header | Description |
|---|---|
| X-RateLimit-Limit-Requests | Maximum requests allowed in the window |
| X-RateLimit-Remaining-Requests | Requests remaining in the current window |
| X-RateLimit-Reset-Requests | UTC epoch time when the window resets |

Error Codes
All errors follow the OpenAI error shape: { error: { message, type, code } }.
| HTTP Code | Error Type | Description |
|---|---|---|
| 400 | invalid_request_error | Malformed request body or missing required parameters. |
| 401 | authentication_error | Missing or invalid API key. Check the Authorization header. |
| 403 | permission_error | API access is disabled for your account or the requested model is not available on your plan. |
| 404 | not_found_error | The requested resource or model does not exist. |
| 429 | rate_limit_error | You have exceeded your rate limit. Slow down and retry after the window resets. |
| 500 | internal_server_error | An unexpected error occurred on the server. Retry with exponential back-off. |
| 503 | service_unavailable_error | The upstream model provider is temporarily unavailable. |
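For the retryable statuses (429, 500, 503), a simple full-jitter exponential back-off is usually enough. A sketch, where `send` is a hypothetical callable that performs one request and returns (status_code, body):

```python
import random
import time

RETRYABLE = {429, 500, 503}

def backoff_delay(attempt: int, base: float = 0.5, cap: float = 30.0) -> float:
    """Full-jitter delay: uniform over [0, min(cap, base * 2**attempt)]."""
    return random.uniform(0, min(cap, base * (2 ** attempt)))

def send_with_retry(send, max_attempts: int = 5):
    """Retry send() on retryable HTTP statuses, sleeping between attempts."""
    for attempt in range(max_attempts):
        status, body = send()
        if status not in RETRYABLE:
            return status, body
        time.sleep(backoff_delay(attempt))
    return status, body  # last attempt's result, even if still failing
```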
Example error response
```json
{
  "error": {
    "message": "Invalid API key provided.",
    "type": "authentication_error",
    "code": "invalid_api_key"
  }
}
```