Providing
Earn by serving inference on the vram.supply network. Install the CLI agent, register your GPU, and start receiving requests.
Provider Lifecycle
- Set up payouts: choose one or more of the following.
  - Passkey signup: wallet auto-derived, zero extra steps.
  - Stripe Connect: fiat payouts via Stripe Express onboarding.
  - Tempo wallet: USDC payouts via manual wallet verification.

  See Payouts for details on each option.
- Install the CLI: curl -fsSL https://vram.supply/install.sh | sh
- Register: POST /v1/providers/register with your model, pricing, and endpoint URL.
- Serve: the platform routes requests to your endpoint. Send heartbeats to stay online.
- Earn: you keep 95% of revenue (5% platform fee). Payouts run daily (Stripe) or every 15 minutes (Tempo).
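As a rough illustration of the 95% / 5% split, the sketch below divides a revenue amount in cents between provider and platform. The function name and the choice to round the fee down are assumptions made for the example; the platform's actual rounding rule is not documented here.

```python
PLATFORM_FEE_PERCENT = 5  # from the lifecycle list: 5% platform fee


def split_revenue(revenue_cents: int) -> tuple[int, int]:
    """Return (provider_cents, platform_cents) for a revenue amount in cents.

    Assumption: the fee is rounded down to a whole cent and the provider
    receives the remainder.
    """
    platform_cents = revenue_cents * PLATFORM_FEE_PERCENT // 100
    provider_cents = revenue_cents - platform_cents
    return provider_cents, platform_cents


provider_cents, platform_cents = split_revenue(10_000)  # $100.00 of revenue
print(provider_cents, platform_cents)  # 9500 500
```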
Payment Rails
Each provider instance declares which buyer payment rails it accepts via two independent flags: accepts_stripe and accepts_tempo. This determines which buyers can be routed to you.
| Flag | Buyer type routed to you | Prerequisite |
|---|---|---|
| accepts_stripe | Buyers using Bearer sk-... | Stripe Connect onboarding complete |
| accepts_tempo | Buyers using Payment (MPP) | Verified Tempo wallet address |
Providers who enable both rails maximise their traffic. Toggle rails via PATCH /v1/settings/provider/rails. At least one rail must remain enabled. Passkey-signup providers default to accepts_tempo = true. See Routing for how rail filtering works during provider selection.
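The rail rules in the table can be sketched as two small checks: one that validates a provider's flag combination, and one that decides whether a buyer on a given rail may be routed to that provider. This is only an illustration; the function names, the "stripe" / "tempo" labels, and the prerequisite arguments are hypothetical and not part of the platform API.

```python
def validate_rails(
    accepts_stripe: bool,
    accepts_tempo: bool,
    stripe_onboarded: bool,
    tempo_wallet_verified: bool,
) -> list[str]:
    """Return a list of problems with a rail configuration (empty if valid)."""
    errors = []
    if not (accepts_stripe or accepts_tempo):
        errors.append("at least one rail must remain enabled")
    if accepts_stripe and not stripe_onboarded:
        errors.append("accepts_stripe requires completed Stripe Connect onboarding")
    if accepts_tempo and not tempo_wallet_verified:
        errors.append("accepts_tempo requires a verified Tempo wallet address")
    return errors


def can_route(buyer_rail: str, accepts_stripe: bool, accepts_tempo: bool) -> bool:
    """Check whether a buyer on 'stripe' or 'tempo' can be routed to a provider."""
    if buyer_rail == "stripe":
        return accepts_stripe
    if buyer_rail == "tempo":
        return accepts_tempo
    raise ValueError(f"unknown rail: {buyer_rail!r}")


# A Tempo-only provider is valid but receives no Stripe buyers.
print(validate_rails(False, True, False, True))  # []
print(can_route("stripe", accepts_stripe=False, accepts_tempo=True))  # False
```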
Model Resolution
When you run vramsupply serve --model "org/model" --quant Q4_K_M, the CLI automatically finds a GGUF repository on HuggingFace, downloads the matching quantization file, and starts serving. The canonical model ID (e.g., qwen/qwen3.5-9b) is used for marketplace identity, while the resolved GGUF repo is used for file verification. You can also pass a local GGUF path directly with --model ./my-model.gguf.
Heartbeat & Health
The platform runs health checks every 60 seconds. Your agent should also send heartbeats via POST /v1/providers/heartbeat. If a provider fails a health check, it's marked offline and stops receiving requests.
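A heartbeat sender might look roughly like the sketch below. Only the POST /v1/providers/heartbeat path comes from this page; the base URL, the Bearer-style Authorization header, the empty JSON body, and the 30-second interval are all assumptions for illustration. The loop takes the send function as a parameter so it can be exercised without network access.

```python
import json
import time
import urllib.request
from typing import Callable


def build_heartbeat_request(base_url: str, api_key: str) -> urllib.request.Request:
    """Build (but do not send) a heartbeat request.

    Assumptions: Bearer-style API key header and an empty JSON body.
    """
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/v1/providers/heartbeat",
        data=json.dumps({}).encode("utf-8"),
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )


def heartbeat_loop(
    send: Callable[[], None],
    interval_s: float,
    beats: int,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Call send() `beats` times, pausing interval_s between calls.

    Network errors (OSError, which urllib.error.URLError subclasses) are
    swallowed so one failed heartbeat does not stop the loop.
    Returns the number of heartbeats that succeeded.
    """
    sent = 0
    for i in range(beats):
        try:
            send()
            sent += 1
        except OSError:
            pass
        if i < beats - 1:
            sleep(interval_s)
    return sent


# Hypothetical wiring (not executed here):
# req = build_heartbeat_request("https://api.example.invalid", "YOUR_API_KEY")
# heartbeat_loop(lambda: urllib.request.urlopen(req, timeout=10).close(), 30, 10)
```

Choosing an interval shorter than the 60-second health check is a design guess, not a documented requirement.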
Pricing
All providers set their own token pricing at registration time. Prices are specified as input_price_per_million and output_price_per_million in cents per million tokens.
- CLI agent: set via the --input-price / --output-price flags, or the VRAM_SUPPLY_INPUT_PRICE / VRAM_SUPPLY_OUTPUT_PRICE env vars. Defaults: 100 / 200.
- Browser provider: set in the pricing card on the provide page before starting.
- Mobile (iOS / Android): set via the pricing card on the providing screen. Suggested defaults come from the model catalog.
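Because prices are quoted in cents per million tokens, the gross price of a single request is a simple proportion. The sketch below works one example at the CLI default prices (100 input, 200 output); the function name is hypothetical and the result is the buyer-side price before the platform fee.

```python
def request_cost_cents(
    input_tokens: int,
    output_tokens: int,
    input_price_per_million: int = 100,   # CLI default, cents per 1M input tokens
    output_price_per_million: int = 200,  # CLI default, cents per 1M output tokens
) -> float:
    """Gross price of one request in cents, before the platform fee."""
    total = (
        input_tokens * input_price_per_million
        + output_tokens * output_price_per_million
    )
    return total / 1_000_000


# 1,500 input tokens and 500 output tokens at the default prices:
# (1500 * 100 + 500 * 200) / 1,000,000 = 0.25 cents
print(request_cost_cents(1_500, 500))  # 0.25
```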
Browser Provider
Chrome users can serve inference directly from a browser tab using Chrome's built-in Prompt API. No GPU or CLI install required. Visit the Provide page, select the browser lane, set your pricing, and click Start Serving. Keep the tab open — closing it stops serving.
Market Demand
Use GET /v1/providers/demand to see which models have demand: requests in the last 24h, online provider count, and price ranges.
Endpoint Reference
POST /v1/providers/register (API Key): Register provider instance.
Parameters
| Name | Type | Req | Description |
|---|---|---|---|
| provider_type | string | No | 'agent' or 'browser'. Default: 'agent'. |
| endpoint_url | string | Yes | Public endpoint. Required for agent providers. |
| model | string | Yes | Model ID. |
| input_price_per_million | integer | Yes | Input price (cents/M tokens). |
| output_price_per_million | integer | Yes | Output price (cents/M tokens). |
| context_length_offered | integer | Yes | Context window. Required for agent providers. |
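The parameter table can be turned into a small payload builder, sketched below. Field names come from the table; the function itself, the rejection of negative prices, and the reading that endpoint_url and context_length_offered are mandatory only for agent providers are assumptions. The endpoint URL and context length in the example are placeholders.

```python
import json
from typing import Optional


def build_register_payload(
    model: str,
    input_price_per_million: int,
    output_price_per_million: int,
    provider_type: str = "agent",
    endpoint_url: Optional[str] = None,
    context_length_offered: Optional[int] = None,
) -> dict:
    """Assemble a JSON-serialisable body for the register endpoint."""
    if provider_type not in ("agent", "browser"):
        raise ValueError("provider_type must be 'agent' or 'browser'")
    for name, value in (
        ("input_price_per_million", input_price_per_million),
        ("output_price_per_million", output_price_per_million),
    ):
        # Assumption: prices are non-negative integers (cents per 1M tokens).
        if not isinstance(value, int) or value < 0:
            raise ValueError(f"{name} must be a non-negative integer")
    if provider_type == "agent" and (not endpoint_url or context_length_offered is None):
        raise ValueError("agent providers need endpoint_url and context_length_offered")

    payload = {
        "provider_type": provider_type,
        "model": model,
        "input_price_per_million": input_price_per_million,
        "output_price_per_million": output_price_per_million,
    }
    if endpoint_url is not None:
        payload["endpoint_url"] = endpoint_url
    if context_length_offered is not None:
        payload["context_length_offered"] = context_length_offered
    return payload


body = build_register_payload(
    model="qwen/qwen3.5-9b",
    input_price_per_million=100,
    output_price_per_million=200,
    endpoint_url="https://gpu.example.com",  # placeholder endpoint
    context_length_offered=8192,             # placeholder value
)
print(json.dumps(body, indent=2))
```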
POST /v1/providers/heartbeat (API Key): Provider heartbeat.
/v1/providers/:id (API Key): Deregister provider.
GET /v1/providers/demand (API Key): Market demand data.