Volt documentation

Run 70B models in your customer's metro

The Sovereign Inference Cloud. OpenAI-compatible, zero egress, served in-metro. Start here.

Install the SDK, get a key, make your first request in under five minutes.

Zero egress, the sovereign tier, tiers and catalogs — how Volt works.

Streaming chat, batch embeddings, sovereign isolation, and SDK quickstarts.

The full control-plane API for Spark, Forge, and Vault.

Change the base URL and key. Your existing OpenAI client streams Llama 70B from a pod in your metro.

Zero ingress, zero egress, zero inter-pod transfer. Pin a metro, enforce the sovereign tier, and verify it client-side.

Ready to run frontier models in your metro?

Get an API key, point your OpenAI client at Volt, and serve Llama 70B in-metro with zero egress.