OpenAI-compatible proxy with configurable per-key fallback across providers.
QNP is an OpenAI-compatible API gateway. Point any OpenAI SDK at our base URL and your existing code routes across Anthropic, OpenAI, Google, Mistral, Groq, and more — with automatic fallback.
Production base URL: {url}
For chat, the model names qnp and qnp/auto resolve to the first model in your fallback chain. For other request types, QNP uses an intent-based default. Your fallback order still applies.
Sign up at qnp.ai/login using Google, Apple, Passkey, or email magic link. You'll get a free tier with 50 requests per day.
Go to Dashboard → API Keys and create a new key. Configure your provider chain. Copy the key — it starts with qnp- and is only shown once.
Use any OpenAI-compatible SDK or cURL. Just change the base URL and API key (see below).
View request logs, costs, and latency in the Dashboard → Logs page. Export data as CSV for analysis.
```bash
curl -X POST "https://qnp.ai/api/v1/chat/completions" \
  -H "Authorization: Bearer YOUR_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model":"openai/gpt-4o-mini","messages":[{"role":"user","content":"Hello!"}]}'
```

All API requests require authentication via your QNP key. Two header formats are supported:

```
Authorization: Bearer YOUR_KEY
X-API-Key: YOUR_KEY
```
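Either header form works; a minimal Python sketch (the helper name is ours, not part of the API):

```python
# Hypothetical helper: build auth headers for a QNP request.
# Both forms are equivalent; Authorization: Bearer is the conventional default.
def auth_headers(api_key: str, use_x_api_key: bool = False) -> dict:
    if use_x_api_key:
        return {"X-API-Key": api_key}
    return {"Authorization": f"Bearer {api_key}"}

print(auth_headers("qnp-demo"))  # {'Authorization': 'Bearer qnp-demo'}
```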
Per API key, you define a provider chain. QNP tries each in order until one returns a successful response.
Example: OpenAI (primary) → Anthropic (fallback 1) → Groq (fallback 2). If OpenAI returns a 5xx error, traffic fails over to Anthropic.
Configure fallback in the dashboard under Keys → [your key] → Routing.
With no fallback configured, the key uses platform credits — billed against your account.
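You can see which link in the chain actually answered: OpenAI-compatible responses echo a model field, and QNP model ids are provider-prefixed (e.g. openai/gpt-4o-mini). A small sketch, assuming that prefix convention holds for the echoed value (the helper name is ours):

```python
def provider_of(model_id: str) -> str:
    """Extract the provider prefix from a QNP model id like 'openai/gpt-4o-mini'."""
    return model_id.split("/", 1)[0] if "/" in model_id else "unknown"

# e.g. after a chat completion: provider_of(response.model)
print(provider_of("anthropic/claude-3-5-sonnet"))  # anthropic
```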
OpenAI-compatible chat endpoint. Supports streaming, function calling, and vision inputs.
```bash
curl -X POST "https://qnp.ai/api/v1/chat/completions" \
  -H "Authorization: Bearer YOUR_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model":"openai/gpt-4o-mini","messages":[{"role":"user","content":"Hello!"}]}'
```

```python
from openai import OpenAI

client = OpenAI(
    base_url="https://qnp.ai/api/v1",
    api_key="YOUR_KEY"
)

response = client.chat.completions.create(
    model="openai/gpt-4o-mini",  # routes to your fallback chain
    messages=[{"role": "user", "content": "Hello!"}]
)
print(response.choices[0].message.content)
```

```javascript
import OpenAI from "openai";

const client = new OpenAI({
  baseURL: "https://qnp.ai/api/v1",
  apiKey: "YOUR_KEY",
});

const response = await client.chat.completions.create({
  model: "openai/gpt-4o-mini", // routes to your fallback chain
  messages: [{ role: "user", content: "Hello!" }],
});
console.log(response.choices[0].message.content);
```

Create vector embeddings for text — useful for search, clustering, and RAG.
```bash
curl -X POST "https://qnp.ai/api/v1/embeddings" \
  -H "Authorization: Bearer YOUR_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model":"text-embedding-3-small","input":"The quick brown fox"}'
```

```python
response = client.embeddings.create(
    model="text-embedding-3-small",
    input="The quick brown fox"
)
print(response.data[0].embedding[:5])
```

Generate images via DALL-E and other image models.
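No request example is shown for images, so here is a hedged sketch assuming the endpoint mirrors the OpenAI Images API at /v1/images/generations; the model name dall-e-3 and the payload fields are assumptions, not QNP-confirmed:

```python
import json
import urllib.request

def image_request(api_key: str, prompt: str, size: str = "1024x1024"):
    # Assumed OpenAI-compatible payload; adjust model/size to your catalog.
    payload = {"model": "dall-e-3", "prompt": prompt, "n": 1, "size": size}
    return urllib.request.Request(
        "https://qnp.ai/api/v1/images/generations",
        data=json.dumps(payload).encode(),
        headers={"Authorization": f"Bearer {api_key}",
                 "Content-Type": "application/json"},
        method="POST",
    )

# with urllib.request.urlopen(image_request("YOUR_KEY", "A watercolor fox")) as r:
#     print(json.load(r)["data"][0]["url"])
```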
List models available for your API key.
```bash
curl "https://qnp.ai/api/v1/models" \
  -H "Authorization: Bearer YOUR_KEY"
```
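Given OpenAI compatibility, the response presumably follows the OpenAI list shape ({"object": "list", "data": [{"id": ...}]} is an assumption); a sketch for pulling out the model ids:

```python
import json

def model_ids(body: str) -> list:
    # Assumes the OpenAI list shape: {"object": "list", "data": [{"id": ...}]}
    return [m["id"] for m in json.loads(body)["data"]]

sample = '{"object":"list","data":[{"id":"openai/gpt-4o-mini"},{"id":"groq/llama-3.1-8b-instant"}]}'
print(model_ids(sample))  # ['openai/gpt-4o-mini', 'groq/llama-3.1-8b-instant']
```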
QNP supports Server-Sent Events (SSE) streaming for chat completions. Set "stream": true in your request body. The response is a stream of data: events in OpenAI format.
Each SSE event contains a JSON chunk with delta content. The stream ends with data: [DONE].
```bash
curl -X POST "https://qnp.ai/api/v1/chat/completions" \
  -H "Authorization: Bearer YOUR_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model":"openai/gpt-4o-mini","stream":true,"messages":[{"role":"user","content":"Hello!"}]}'
```

```python
stream = client.chat.completions.create(
    model="openai/gpt-4o-mini",
    messages=[{"role": "user", "content": "Tell me a story"}],
    stream=True
)
for chunk in stream:
    if chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="")
```
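Under the SDK, the wire format is plain SSE; a minimal parser sketch for the data: events described above (chunk shape per the OpenAI streaming format):

```python
import json

def deltas(lines):
    # Yield text content from "data: ..." SSE lines; stop at "data: [DONE]".
    for line in lines:
        if not line.startswith("data: "):
            continue
        payload = line[len("data: "):]
        if payload == "[DONE]":
            return
        content = json.loads(payload)["choices"][0]["delta"].get("content")
        if content:
            yield content

sample = [
    'data: {"choices":[{"delta":{"content":"Hel"}}]}',
    'data: {"choices":[{"delta":{"content":"lo"}}]}',
    'data: [DONE]',
]
print("".join(deltas(sample)))  # Hello
```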
Rate limits are enforced per API key. Default ceilings vary by tier:
| Tier | Requests per day | RPM (default) | API keys |
|---|---|---|---|
| Free | 50 RPD | 10 RPM | 5 |
| Pro | Unlimited | Unlimited | Unlimited |
Custom per-key rate limits (RPM, TPM) are configurable in the dashboard.
When rate limited, the API returns 429 Too Many Requests with a Retry-After header.
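A client-side retry sketch that honors Retry-After (helper names are ours; falling back to exponential backoff when the header is absent is a common convention, not QNP-specified):

```python
import time
import urllib.error
import urllib.request

def retry_delay(headers, attempt: int) -> float:
    # Honor Retry-After when present; otherwise back off exponentially.
    value = headers.get("Retry-After")
    return float(value) if value is not None else float(2 ** attempt)

def urlopen_with_retry(req, max_retries: int = 3):
    for attempt in range(max_retries + 1):
        try:
            return urllib.request.urlopen(req)
        except urllib.error.HTTPError as err:
            if err.code != 429 or attempt == max_retries:
                raise
            time.sleep(retry_delay(err.headers, attempt))

print(retry_delay({"Retry-After": "5"}, 0))  # 5.0
```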
Errors follow OpenAI format: { error: { message, type, code } }
| Code | Meaning | What to do |
|---|---|---|
| 400 | Bad Request | Check your request body format and required fields. |
| 401 | Unauthorized | Check your API key is valid and included in the header. |
| 402 | Insufficient Credits | Top up credits in the dashboard or upgrade to Pro. |
| 403 | Forbidden | Key lacks permission for this endpoint, or IP/referrer is blocked. |
| 404 | Not Found | Check the endpoint URL and model name. |
| 429 | Rate Limited | Wait and retry. Check the Retry-After header. Upgrade tier for higher limits. |
| 500 | Internal Server Error | Retry the request. If persistent, contact support. |
| 502 | Bad Gateway | Upstream provider error. Your fallback chain will auto-retry. |
| 503 | Service Unavailable | Provider temporarily unavailable. Fallback routing handles this. |
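Per the table, 429 and the 5xx family are worth retrying client-side, while the other 4xx errors are not; a tiny sketch:

```python
RETRYABLE = {429, 500, 502, 503}  # statuses the table above marks as retry/wait

def should_retry(status: int) -> bool:
    return status in RETRYABLE

print(should_retry(502), should_retry(401))  # True False
```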
Use any OpenAI-compatible SDK. Just change the base URL and your QNP key.
```python
from openai import OpenAI

client = OpenAI(
    base_url="https://qnp.ai/api/v1",
    api_key="YOUR_KEY"
)

response = client.chat.completions.create(
    model="openai/gpt-4o-mini",  # routes to your fallback chain
    messages=[{"role": "user", "content": "Hello!"}]
)
print(response.choices[0].message.content)
```

```javascript
import OpenAI from "openai";

const client = new OpenAI({
  baseURL: "https://qnp.ai/api/v1",
  apiKey: "YOUR_KEY",
});

const response = await client.chat.completions.create({
  model: "openai/gpt-4o-mini", // routes to your fallback chain
  messages: [{ role: "user", content: "Hello!" }],
});
console.log(response.choices[0].message.content);
```

```python
from langchain_openai import ChatOpenAI

llm = ChatOpenAI(
    base_url="https://qnp.ai/api/v1",
    api_key="YOUR_KEY",
    model="openai/gpt-4o-mini"  # routes to your fallback chain
)

response = llm.invoke("Hello!")
print(response.content)
```

Cline is an AI-powered VS Code coding assistant. QNP works as Cline's API provider.
Set the base URL to {url}, enter your QNP key (it starts with qnp-), and choose {model} (or any other supported model). Pass qnp/auto if you want QNP to pick from your fallback chain. Tip: configure your API key's provider chain in Dashboard → API Keys. The qnp/auto model uses the first model in that chain; if it fails, the next is tried automatically.

If requests fail, check that your key starts with qnp- and is active, and that the base URL ends in /v1 (no trailing slash).

The setup script auto-detects and configures OpenClaw, Claude Code, and Hermes Agent. Use QNP as the model provider for unified fallback and BYOK.
The setup script will interactively prompt you for your API key and base URL.
```bash
curl -fsSL https://qnp.ai/setup.sh | bash
```