OpenAI-compatible proxy with configurable per-key fallback across providers.
QNP is an OpenAI-compatible API gateway. Point any OpenAI SDK at our base URL and your existing code routes across Anthropic, OpenAI, Google, Mistral, Groq, and more — with automatic fallback.
Production base URL: {url}
For chat, the model names qnp and qnp/auto resolve to the first model in your fallback chain. For other request types, QNP uses an intent-based default. Your fallback order still applies.
Sign up at qnp.ai/login using Google, Apple, Passkey, or email magic link. You'll get a free tier with 50 requests per day.
Go to Dashboard → API Keys and create a new key. Configure your provider chain. Copy the key — it starts with qnp- and is only shown once.
Use any OpenAI-compatible SDK or cURL. Just change the base URL and API key (see below).
View request logs, costs, and latency in the Dashboard → Logs page. Export data as CSV for analysis.
```bash
curl -X POST "https://qnp.ai/api/v1/chat/completions" \
  -H "Authorization: Bearer YOUR_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model":"openai/gpt-4o-mini","messages":[{"role":"user","content":"Hello!"}]}'
```

All API requests require authentication via your QNP key. Two header formats are supported:

```
Authorization: Bearer YOUR_KEY
X-API-Key: YOUR_KEY
```
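Either header form works; a minimal Python sketch (the helper name is ours, not part of the API):

```python
# Hypothetical helper: build auth headers for a QNP request.
# Both forms are equivalent; Authorization: Bearer is the conventional default.
def auth_headers(api_key: str, use_x_api_key: bool = False) -> dict:
    if use_x_api_key:
        return {"X-API-Key": api_key}
    return {"Authorization": f"Bearer {api_key}"}

print(auth_headers("qnp-demo"))  # {'Authorization': 'Bearer qnp-demo'}
```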
Per API key, you define a provider chain. QNP tries each in order until one returns a successful response.
Example: OpenAI (primary) → Anthropic (fallback 1) → Groq (fallback 2). If OpenAI returns a 5xx error, traffic fails over to Anthropic.
Configure fallback in the dashboard under Keys → [your key] → Routing.
With no fallback configured, the key uses platform credits — billed against your account.
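You can see which link in the chain actually answered: OpenAI-compatible responses echo a model field, and QNP model ids are provider-prefixed (e.g. openai/gpt-4o-mini). A small sketch, assuming that prefix convention holds for the echoed value (the helper name is ours):

```python
def provider_of(model_id: str) -> str:
    """Extract the provider prefix from a QNP model id like 'openai/gpt-4o-mini'."""
    return model_id.split("/", 1)[0] if "/" in model_id else "unknown"

# e.g. after a chat completion: provider_of(response.model)
print(provider_of("anthropic/claude-3-5-sonnet"))  # anthropic
```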
OpenAI-compatible chat endpoint. Supports streaming, function calling, and vision inputs.
```bash
curl -X POST "https://qnp.ai/api/v1/chat/completions" \
  -H "Authorization: Bearer YOUR_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model":"openai/gpt-4o-mini","messages":[{"role":"user","content":"Hello!"}]}'
```

```python
from openai import OpenAI

client = OpenAI(
    base_url="https://qnp.ai/api/v1",
    api_key="YOUR_KEY"
)

response = client.chat.completions.create(
    model="openai/gpt-4o-mini",  # routes to your fallback chain
    messages=[{"role": "user", "content": "Hello!"}]
)
print(response.choices[0].message.content)
```

```javascript
import OpenAI from "openai";

const client = new OpenAI({
  baseURL: "https://qnp.ai/api/v1",
  apiKey: "YOUR_KEY",
});

const response = await client.chat.completions.create({
  model: "openai/gpt-4o-mini", // routes to your fallback chain
  messages: [{ role: "user", content: "Hello!" }],
});
console.log(response.choices[0].message.content);
```

Create vector embeddings for text — useful for search, clustering, and RAG.
```bash
curl -X POST "https://qnp.ai/api/v1/embeddings" \
  -H "Authorization: Bearer YOUR_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model":"text-embedding-3-small","input":"The quick brown fox"}'
```

```python
response = client.embeddings.create(
    model="text-embedding-3-small",
    input="The quick brown fox"
)
print(response.data[0].embedding[:5])
```

Generate images via DALL-E and other image models.
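No request example is shown for images, so here is a hedged sketch assuming the endpoint mirrors the OpenAI Images API at /v1/images/generations; the model name dall-e-3 and the payload fields are assumptions, not QNP-confirmed:

```python
import json
import urllib.request

def image_request(api_key: str, prompt: str, size: str = "1024x1024"):
    # Assumed OpenAI-compatible payload; adjust model/size to your catalog.
    payload = {"model": "dall-e-3", "prompt": prompt, "n": 1, "size": size}
    return urllib.request.Request(
        "https://qnp.ai/api/v1/images/generations",
        data=json.dumps(payload).encode(),
        headers={"Authorization": f"Bearer {api_key}",
                 "Content-Type": "application/json"},
        method="POST",
    )

# with urllib.request.urlopen(image_request("YOUR_KEY", "A watercolor fox")) as r:
#     print(json.load(r)["data"][0]["url"])
```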
List models available for your API key.
```bash
curl "https://qnp.ai/api/v1/models" \
  -H "Authorization: Bearer YOUR_KEY"
```
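Given OpenAI compatibility, the response presumably follows the OpenAI list shape ({"object": "list", "data": [{"id": ...}]} is an assumption); a sketch for pulling out the model ids:

```python
import json

def model_ids(body: str) -> list:
    # Assumes the OpenAI list shape: {"object": "list", "data": [{"id": ...}]}
    return [m["id"] for m in json.loads(body)["data"]]

sample = '{"object":"list","data":[{"id":"openai/gpt-4o-mini"},{"id":"groq/llama-3.1-8b-instant"}]}'
print(model_ids(sample))  # ['openai/gpt-4o-mini', 'groq/llama-3.1-8b-instant']
```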
QNP supports Server-Sent Events (SSE) streaming for chat completions. Set "stream": true in your request body. The response is a stream of data: events in OpenAI format.
Each SSE event contains a JSON chunk with delta content. The stream ends with data: [DONE].
```bash
curl -X POST "https://qnp.ai/api/v1/chat/completions" \
  -H "Authorization: Bearer YOUR_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model":"openai/gpt-4o-mini","stream":true,"messages":[{"role":"user","content":"Hello!"}]}'
```

```python
stream = client.chat.completions.create(
    model="openai/gpt-4o-mini",
    messages=[{"role": "user", "content": "Tell me a story"}],
    stream=True
)
for chunk in stream:
    if chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="")
```
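Under the SDK, the wire format is plain SSE; a minimal parser sketch for the data: events described above (chunk shape per the OpenAI streaming format):

```python
import json

def deltas(lines):
    # Yield text content from "data: ..." SSE lines; stop at "data: [DONE]".
    for line in lines:
        if not line.startswith("data: "):
            continue
        payload = line[len("data: "):]
        if payload == "[DONE]":
            return
        content = json.loads(payload)["choices"][0]["delta"].get("content")
        if content:
            yield content

sample = [
    'data: {"choices":[{"delta":{"content":"Hel"}}]}',
    'data: {"choices":[{"delta":{"content":"lo"}}]}',
    'data: [DONE]',
]
print("".join(deltas(sample)))  # Hello
```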
Rate limits are enforced per API key. Default ceilings vary by tier:
| Tier | Requests per day | RPM (default) | API keys |
|---|---|---|---|
| Free | 50 RPD | 10 RPM | 5 |
| Pro | Unlimited | Unlimited | Unlimited |
Custom per-key rate limits (RPM, TPM) are configurable in the dashboard.
When rate limited, the API returns 429 Too Many Requests with a Retry-After header.
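A client-side retry sketch that honors Retry-After (helper names are ours; falling back to exponential backoff when the header is absent is a common convention, not QNP-specified):

```python
import time
import urllib.error
import urllib.request

def retry_delay(headers, attempt: int) -> float:
    # Honor Retry-After when present; otherwise back off exponentially.
    value = headers.get("Retry-After")
    return float(value) if value is not None else float(2 ** attempt)

def urlopen_with_retry(req, max_retries: int = 3):
    for attempt in range(max_retries + 1):
        try:
            return urllib.request.urlopen(req)
        except urllib.error.HTTPError as err:
            if err.code != 429 or attempt == max_retries:
                raise
            time.sleep(retry_delay(err.headers, attempt))

print(retry_delay({"Retry-After": "5"}, 0))  # 5.0
```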
Errors follow OpenAI format: { error: { message, type, code } }
| Code | Meaning | What to do |
|---|---|---|
| 400 | Bad Request | Check your request body format and required fields. |
| 401 | Unauthorized | Check your API key is valid and included in the header. |
| 402 | Insufficient Credits | Top up credits in the dashboard or upgrade to Pro. |
| 403 | Forbidden | Key lacks permission for this endpoint, or IP/referrer is blocked. |
| 404 | Not Found | Check the endpoint URL and model name. |
| 429 | Rate Limited | Wait and retry. Check the Retry-After header. Upgrade tier for higher limits. |
| 500 | Internal Server Error | Retry the request. If persistent, contact support. |
| 502 | Bad Gateway | Upstream provider error. Your fallback chain will auto-retry. |
| 503 | Service Unavailable | Provider temporarily unavailable. Fallback routing handles this. |
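Per the table, 429 and the 5xx family are worth retrying client-side, while the other 4xx errors are not; a tiny sketch:

```python
RETRYABLE = {429, 500, 502, 503}  # statuses the table above marks as retry/wait

def should_retry(status: int) -> bool:
    return status in RETRYABLE

print(should_retry(502), should_retry(401))  # True False
```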
Use any OpenAI-compatible SDK. Just change the base URL and your QNP key.
```python
from openai import OpenAI

client = OpenAI(
    base_url="https://qnp.ai/api/v1",
    api_key="YOUR_KEY"
)

response = client.chat.completions.create(
    model="openai/gpt-4o-mini",  # routes to your fallback chain
    messages=[{"role": "user", "content": "Hello!"}]
)
print(response.choices[0].message.content)
```

```javascript
import OpenAI from "openai";

const client = new OpenAI({
  baseURL: "https://qnp.ai/api/v1",
  apiKey: "YOUR_KEY",
});

const response = await client.chat.completions.create({
  model: "openai/gpt-4o-mini", // routes to your fallback chain
  messages: [{ role: "user", content: "Hello!" }],
});
console.log(response.choices[0].message.content);
```

```python
from langchain_openai import ChatOpenAI

llm = ChatOpenAI(
    base_url="https://qnp.ai/api/v1",
    api_key="YOUR_KEY",
    model="openai/gpt-4o-mini"  # routes to your fallback chain
)

response = llm.invoke("Hello!")
print(response.content)
```

Cline is an AI-powered VS Code coding assistant. QNP works as Cline's API provider.
Set the base URL to {url}, enter your QNP key (it starts with qnp-), and choose {model} (or any other supported model). Pass qnp/auto if you want QNP to pick from your fallback chain. Tip: configure your API key's provider chain in Dashboard → API Keys. The qnp/auto model uses the first model in that chain; if it fails, the next is tried automatically.

If requests fail, check that your key starts with qnp- and is active, and that the base URL ends in /v1 (no trailing slash).

The setup script auto-detects and configures OpenClaw, Claude Code, and Hermes Agent. Use QNP as the model provider for unified fallback and BYOK.
The setup script will interactively prompt you for your API key and base URL.
```bash
curl -fsSL https://qnp.ai/setup.sh | bash
```