vram.supply

Providing

Earn by serving inference on the vram.supply network. Install the CLI agent, register your GPU, and start receiving requests.

Provider Lifecycle

  1. Set up payouts — choose at least one:
    • Passkey signup — wallet auto-derived, zero extra steps.
    • Stripe Connect — fiat payouts via Stripe Express onboarding.
    • Tempo wallet — USDC payouts via manual wallet verification.

    See Payouts for details on each option.

  2. Install the CLI — curl -fsSL https://vram.supply/install.sh | sh
  3. Register — POST /v1/providers/register with your model, pricing, and endpoint URL.
  4. Serve — the platform routes requests to your endpoint. Send heartbeats to stay online.
  5. Earn — 95% of revenue (5% platform fee). Payouts: daily (Stripe) or every 15 minutes (Tempo).
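The revenue split in step 5 can be sketched as a small helper — a hypothetical function, assuming the 5% fee is applied per settlement in integer cents (the platform's actual rounding rules may differ):

```python
def provider_payout(gross_cents: int, fee_bps: int = 500) -> int:
    """Provider share after the platform fee (5% = 500 basis points).

    Hypothetical sketch; not the platform's actual settlement code.
    """
    fee = gross_cents * fee_bps // 10_000  # floor the fee in cents
    return gross_cents - fee

# A $10.00 settlement pays out $9.50 to the provider.
print(provider_payout(1000))  # 950
```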

Payment Rails

Each provider instance declares which buyer payment rails it accepts via two independent flags: accepts_stripe and accepts_tempo. This determines which buyers can be routed to you.

Flag           | Buyer type routed to you   | Prerequisite
accepts_stripe | Buyers using Bearer sk-... | Stripe Connect onboarding complete
accepts_tempo  | Buyers using Payment (MPP) | Verified Tempo wallet address

Providers who enable both rails maximise their traffic. Toggle rails via PATCH /v1/settings/provider/rails. At least one rail must remain enabled. Passkey-signup providers default to accepts_tempo = true. See Routing for how rail filtering works during provider selection.
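Rail filtering during routing amounts to a simple flag check. A minimal sketch, assuming the routing layer sees each instance's two flags (the `Provider` shape and `eligible` helper here are illustrative, not the platform's internals):

```python
from dataclasses import dataclass

@dataclass
class Provider:
    name: str
    accepts_stripe: bool
    accepts_tempo: bool

def eligible(providers: list[Provider], buyer_rail: str) -> list[Provider]:
    """Keep only providers that accept the buyer's payment rail.

    buyer_rail is 'stripe' for Bearer sk-... buyers and 'tempo' for
    Payment (MPP) buyers. Hypothetical sketch of the routing filter.
    """
    flag = {"stripe": "accepts_stripe", "tempo": "accepts_tempo"}[buyer_rail]
    return [p for p in providers if getattr(p, flag)]

pool = [
    Provider("a", accepts_stripe=True, accepts_tempo=False),
    Provider("b", accepts_stripe=True, accepts_tempo=True),
]
print([p.name for p in eligible(pool, "tempo")])   # ['b']
print([p.name for p in eligible(pool, "stripe")])  # ['a', 'b']
```

Enabling both flags makes an instance eligible for every buyer, which is why dual-rail providers see the most traffic.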

Model Resolution

When you run vramsupply serve --model "org/model" --quant Q4_K_M, the CLI automatically finds a GGUF repository on HuggingFace, downloads the matching quantization file, and starts serving. The canonical model ID (e.g., qwen/qwen3.5-9b) is used for marketplace identity, while the resolved GGUF repo is used for file verification. You can also pass a local GGUF path directly with --model ./my-model.gguf.

Heartbeat & Health

The platform runs health checks every 60 seconds. Your agent should also send heartbeats via POST /v1/providers/heartbeat. If a provider fails a health check, it's marked offline and stops receiving requests.
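The online/offline decision can be modeled as a last-heartbeat window check. A minimal sketch, assuming a provider is offline once no heartbeat has arrived within the 60-second window (the class and its method names are illustrative; the real check runs server-side):

```python
class HealthTracker:
    """Marks a provider offline if no heartbeat arrives in the window.

    Hypothetical sketch of the platform's server-side health check.
    """
    def __init__(self, window_s: float = 60.0):
        self.window_s = window_s
        self.last_beat: float | None = None

    def heartbeat(self, now: float) -> None:
        self.last_beat = now

    def is_online(self, now: float) -> bool:
        # Offline until the first heartbeat, or after the window lapses.
        return (self.last_beat is not None
                and now - self.last_beat <= self.window_s)

t = HealthTracker()
t.heartbeat(0.0)
print(t.is_online(30.0), t.is_online(120.0))  # True False
```

Your agent should therefore post heartbeats comfortably more often than once per minute.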

Pricing

All providers set their own token pricing at registration time. Prices are specified as input_price_per_million and output_price_per_million in cents per million tokens.

  • CLI agent — set via --input-price / --output-price flags, or VRAM_SUPPLY_INPUT_PRICE / VRAM_SUPPLY_OUTPUT_PRICE env vars. Defaults: 100 / 200.
  • Browser provider — set in the pricing card on the provide page before starting.
  • Mobile (iOS / Android) — set via the pricing card on the providing screen. Suggested defaults come from the model catalog.
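With prices in cents per million tokens, the cost of a single request works out as below — a sketch using the CLI agent's defaults of 100/200 (rounding behavior is an assumption):

```python
def request_cost_cents(input_tokens: int, output_tokens: int,
                       input_price_per_million: int = 100,
                       output_price_per_million: int = 200) -> float:
    """Cost of one request in cents at the CLI defaults.

    Hypothetical helper; the platform's actual metering may round
    differently.
    """
    return (input_tokens * input_price_per_million
            + output_tokens * output_price_per_million) / 1_000_000

# 2,000 input + 500 output tokens at the defaults costs 0.3 cents.
print(request_cost_cents(2000, 500))  # 0.3
```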

Browser Provider

Chrome users can serve inference directly from a browser tab using Chrome's built-in Prompt API. No GPU or CLI install required. Visit the Provide page, select the browser lane, set your pricing, and click Start Serving. Keep the tab open — closing it stops serving.

Market Demand

Use GET /v1/providers/demand to see which models have demand: requests in the last 24h, online provider count, and price ranges.
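One way to act on this data is to rank models by demand per online provider. A sketch, assuming the response rows carry fields like `requests_24h` and `online_providers` (these field names are assumptions, not the documented schema):

```python
def underserved(demand: list[dict], top: int = 1) -> list[str]:
    """Rank models by requests per online provider, highest first.

    Hypothetical row shape for GET /v1/providers/demand; the field
    names here are assumptions.
    """
    def ratio(row: dict) -> float:
        return row["requests_24h"] / max(row["online_providers"], 1)
    return [r["model"] for r in sorted(demand, key=ratio, reverse=True)][:top]

rows = [
    {"model": "a", "requests_24h": 900, "online_providers": 9},
    {"model": "b", "requests_24h": 400, "online_providers": 1},
]
print(underserved(rows))  # ['b']
```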

Endpoint Reference

POST /v1/providers/register — API Key

Register provider instance.

Parameters

Name                     | Type    | Req | Description
provider_type            | string  | No  | 'agent' (default) or 'browser'.
endpoint_url             | string  | Yes | Public endpoint. Required for agent providers.
model                    | string  | Yes | Model ID.
input_price_per_million  | integer | Yes | Input price (cents/M tokens).
output_price_per_million | integer | Yes | Output price (cents/M tokens).
context_length_offered   | integer | Yes | Context window. Required for agent providers.
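A registration body built from the parameters above might look like this — all values are illustrative, and the endpoint URL is a placeholder:

```python
import json

# Hypothetical payload for POST /v1/providers/register, following the
# parameter table above. Values are examples, not real endpoints.
payload = {
    "provider_type": "agent",
    "endpoint_url": "https://gpu.example.com:8080",
    "model": "qwen/qwen3.5-9b",
    "input_price_per_million": 100,
    "output_price_per_million": 200,
    "context_length_offered": 32768,
}
body = json.dumps(payload)
print(json.loads(body)["model"])  # qwen/qwen3.5-9b
```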
POST /v1/providers/heartbeat — API Key

Provider heartbeat.

DELETE /v1/providers/:id — API Key

Deregister provider.

GET /v1/providers/demand — API Key

Market demand data.