# API Reference
Rungate's API is fully compatible with OpenAI's client libraries, making it easy to use open-source models in your existing applications.
## Access methods
Rungate supports three ways to access the inference API: account-scoped API keys, x402 pay-per-request, and MPP pay-per-request. Environments can advertise one or both payment protocols when a request arrives without standard authentication.
| Method | Family | Credential | Billing model | Best fit |
|---|---|---|---|---|
| API keys | Account-based | `Authorization: Bearer <RUNGATE_API_KEY>` | Account credits | Server apps, stable account auth, account-based billing |
| x402 | Wallet-based | `payment-signature` header | Wallet-backed pay per request | Clients that already support the x402 challenge/response flow |
| MPP | Wallet-based | `Authorization: Payment ...` | Wallet-backed pay per request | MPP-capable clients and wallets |
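As a rough sketch of what each credential looks like on the wire (all values below are placeholders; real x402 and MPP payloads are produced by a wallet in response to a 402 challenge, not written by hand):

```python
# Header shapes for the three access methods (placeholder values).
api_key_headers = {
    "Authorization": "Bearer YOUR_RUNGATE_API_KEY",
}
x402_headers = {
    # Signed payment payload returned by an x402-capable wallet.
    "payment-signature": "<signed-payment-payload>",
}
mpp_headers = {
    # MPP reuses the Authorization header with a Payment scheme.
    "Authorization": "Payment <signed-payment-payload>",
}
```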
## Configuration
To start using Rungate with OpenAI's client libraries, pass your Rungate API key and change the SDK base URL to https://api.rungate.ai/v1. For raw HTTP requests, use https://api.rungate.ai with /v1/... paths. You can find your API key in your account settings.
```python
from openai import OpenAI

client = OpenAI(
    api_key="YOUR_RUNGATE_API_KEY",
    base_url="https://api.rungate.ai/v1",
)
```

Load the key from an environment variable (`RUNGATE_API_KEY`) rather than hardcoding it.

## Chat completions
Once your client is configured, you can query any of our open-source models. For example, here's a chat completion with DeepSeek V3.2.
```python
response = client.chat.completions.create(
    model="deepseek/deepseek-v3.2",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "What is the capital of France?"},
    ],
)
print(response.choices[0].message.content)
```

See the Models & Pricing page for all available model IDs.
## Streaming
You can stream responses back using OpenAI's streaming interface. Pass `stream=True` to receive server-sent events as the model generates tokens.
```python
stream = client.chat.completions.create(
    model="qwen/qwen3-coder-next",
    messages=[{"role": "user", "content": "Count to 5"}],
    stream=True,
)
for chunk in stream:
    if not chunk.choices:  # the final usage-only chunk carries no choices
        continue
    content = chunk.choices[0].delta.content or ""
    print(content, end="")
```

The final chunk before `[DONE]` includes a `usage` field with token counts.
## Multimodal input
Models that support vision accept image inputs alongside text. Pass a `content` array instead of a plain string.
```python
response = client.chat.completions.create(
    model="qwen/qwen3-vl-235b-a22b-thinking",
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "What is in this image?"},
                {
                    "type": "image_url",
                    "image_url": {"url": "https://example.com/photo.jpg"},
                },
            ],
        }
    ],
)
print(response.choices[0].message.content)
```

Supported formats: `image/jpeg`, `image/png`, `image/webp`, `image/gif`. Base64 data URIs and public URLs are both accepted. See Models & Pricing for which models support vision.
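Since base64 data URIs are accepted, a local image can be encoded and passed in place of a public URL. A minimal sketch (the helper name and example bytes are illustrative):

```python
import base64

def to_data_uri(image_bytes: bytes, mime: str = "image/jpeg") -> str:
    """Encode raw image bytes as a base64 data URI for the image_url field."""
    b64 = base64.b64encode(image_bytes).decode("ascii")
    return f"data:{mime};base64,{b64}"

# Use the result wherever a public URL would go, e.g.:
# {"type": "image_url", "image_url": {"url": to_data_uri(open("photo.jpg", "rb").read(), "image/jpeg")}}
```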
## Endpoints
| Method | Path | Description |
|---|---|---|
| POST | /v1/chat/completions | Create a chat completion |
| GET | /v1/models | List available models |
These paths are also available without the `/v1` prefix (e.g. `/chat/completions`, `/models`).
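For raw HTTP clients, assembling the same chat-completion request without the SDK looks roughly like this (the API key and prompt are placeholders; the actual send is left to whatever HTTP client you use):

```python
import json

BASE_URL = "https://api.rungate.ai"  # raw HTTP requests use /v1/... paths

def build_request(api_key: str, model: str, prompt: str):
    """Assemble the URL, headers, and JSON body for POST /v1/chat/completions."""
    url = f"{BASE_URL}/v1/chat/completions"
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
    body = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    })
    return url, headers, body

# Send with any HTTP client, e.g.:
# requests.post(url, headers=headers, data=body)
```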
## Usage tracking
Every response includes a `usage` object with token counts. For streaming responses it appears in the final chunk before `[DONE]`.
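Given a response body of this shape, the counts can be read off and cross-checked directly; a small sketch with illustrative values:

```python
import json

# Sample body matching the usage object shape (values are illustrative).
body = json.loads('{"usage": {"prompt_tokens": 12, "completion_tokens": 8, "total_tokens": 20}}')
usage = body["usage"]

# total_tokens is the sum of prompt and completion tokens.
assert usage["total_tokens"] == usage["prompt_tokens"] + usage["completion_tokens"]
```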
```json
{
  "usage": {
    "prompt_tokens": 12,
    "completion_tokens": 8,
    "total_tokens": 20
  }
}
```

## Rate limiting
Throughput limits vary per model — see the Models & Pricing page for details.
## Error codes
Errors follow the standard OpenAI error format:
```json
{
  "error": {
    "code": 401,
    "message": "Invalid API key"
  }
}
```

| Code | Meaning |
|---|---|
| 400 | Bad request — invalid or missing parameters |
| 401 | Unauthorized — missing or invalid API key |
| 402 | Payment required — satisfy an x402 or MPP challenge, or use an API key |
| 404 | Model not found |
| 429 | Rate limited — request exceeds concurrency limits |
| 500 | Internal server error |
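A client can branch on the `code` field of that error object; a minimal sketch (the grouping into retryable vs. credential errors is one reasonable policy, not part of the API):

```python
import json

RETRYABLE = {429, 500}  # transient: retry with backoff
CREDENTIAL = {401, 402}  # fix the API key or satisfy a payment challenge

def classify_error(body: str) -> str:
    """Map a Rungate/OpenAI-style error body to a coarse follow-up action."""
    code = json.loads(body)["error"]["code"]
    if code in RETRYABLE:
        return "retry"
    if code in CREDENTIAL:
        return "reauthenticate"
    return "fail"

print(classify_error('{"error": {"code": 429, "message": "Rate limited"}}'))  # → retry
```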