API Documentation
Drop-in OpenAI-compatible inference. Pay with an API key or let your agent pay per token.
Quick Start
With API key
curl https://api.vram.supply/v1/chat/completions \
-H "Authorization: Bearer sk-..." \
-H "Content-Type: application/json" \
-d '{
"model": "qwen/qwen3.5-9b",
"messages": [{"role": "user", "content": "Hello"}]
}'
With MPP (no account needed)
# No auth → 402 challenge → pay → retry
npx mppx https://api.vram.supply/v1/chat/completions \
-d '{
"model": "qwen/qwen3.5-9b",
"messages": [{"role": "user", "content": "Hello"}]
}'
Works with any OpenAI-compatible client
Any tool that supports a custom OpenAI base URL works out of the box. No SDK, no special integration.
Python OpenAI SDK
from openai import OpenAI
client = OpenAI(base_url="https://api.vram.supply/v1", api_key="sk-...")
resp = client.chat.completions.create(model="qwen/qwen3.5-9b",
                                      messages=[{"role": "user", "content": "Hello"}])
curl
curl https://api.vram.supply/v1/chat/completions \
-H "Authorization: Bearer sk-..." \
-d '{"model": "qwen/qwen3.5-9b", "messages": [{"role": "user", "content": "Hello"}]}'
Aider
aider --openai-api-base https://api.vram.supply/v1 --openai-api-key sk-...
Continue (VS Code)
{ "apiBase": "https://api.vram.supply/v1", "apiKey": "sk-..." }
Explore the API
Using the API
Make inference requests, control routing and cost, manage API keys.
- → Inference & Streaming
- → Routing Strategies
- → Models
Account & Payments
API keys, card billing, MPP agent payments, and settings.
- → API Keys & MPP Auth
- → Billing (Stripe & Tempo)
- → Settings
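The MPP agent-payment flow from the quick start (no auth → 402 challenge → pay → retry) can be sketched as plain control flow. This is a minimal sketch, not the real protocol: `send` and `pay` are hypothetical stand-ins for an HTTP client and an MPP wallet, and the actual wire mechanics live in the `mppx` client.

```python
def with_mpp_retry(send, pay):
    """Sketch of the MPP flow: try unauthenticated, and on a 402
    challenge pay it and retry with the resulting payment proof.

    send(proof) -> (status, body) performs the HTTP request;
    pay(challenge) -> proof settles the challenge. Both are
    assumed interfaces, not part of any real SDK.
    """
    status, body = send(None)        # first attempt, no credentials
    if status == 402:                # server answers with a payment challenge
        proof = pay(body)            # agent pays per token
        status, body = send(proof)   # retry, carrying the payment proof
    return status, body
```

A key-authenticated request never hits the 402 branch, so the same shape covers both payment modes.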
Providing
Serve models, sell quota, and get paid.
- → GPU Provider API
- → Quota Selling
- → Payouts