Vivgrid provides access to a range of powerful AI models for building enterprise-grade AI agents. We select models based on their performance, cost-effectiveness, and suitability for various tasks. Pricing is transparent and matches the rates of the original providers.

Supported Models

Coding Models

  • gpt-5.2-codex
  • gpt-5.1-codex-max
  • gpt-5.1-codex

Agent Model

  • gpt-5.2
  • gpt-5.1
  • gpt-5
  • gpt-5-mini
  • deepseek-v3.2
  • gemini-3-pro-preview
  • gemini-3-flash-preview

Model API

  • gpt-4.1
  • gpt-4o
  • gemini-2.5-pro
  • gemini-2.5-flash
  • deepseek-r1-0528
  • deepseek-v3.1

How to Set Models

You don't need to specify a model name in your API calls. The model for your agent is managed on the backend, so switching models doesn't require any code changes. To change your agent's model, go to the Agent Settings page in the Vivgrid Console.
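As a minimal sketch of what this looks like in practice, the request below builds a chat payload with no `model` field. The endpoint URL and header names are assumptions (an OpenAI-compatible style); check the Vivgrid Console for your actual values.

```python
import json

def build_chat_request(api_key: str, messages: list[dict]) -> dict:
    """Describe an HTTP request to a Vivgrid-managed agent.

    Note: no "model" field is included. The model is selected by the
    agent's backend configuration, so switching models needs no code change.
    """
    return {
        "url": "https://api.vivgrid.com/v1/chat/completions",  # assumed endpoint
        "headers": {
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        "body": json.dumps({"messages": messages}),
    }

req = build_chat_request("YOUR_API_KEY", [{"role": "user", "content": "Hello"}])
print("model" in json.loads(req["body"]))  # False: the payload carries no model name
```

Because the model is bound to the agent rather than the request, the same client code keeps working when you change models in Agent Settings.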

Pricing

Pricing is calculated in USD per 1 million tokens. The table below details the cost for input, cached, and output tokens for each model.
| Model | Input Token | Cached Token | Output Token |
| --- | --- | --- | --- |
| gpt-5.2-codex | $1.75 | $0.175 | $14.00 |
| gpt-5.1-codex-max | $1.25 | $0.125 | $10.00 |
| gpt-5.1-codex | $1.25 | $0.125 | $10.00 |
| gpt-5.2 | $1.75 | $0.175 | $14.00 |
| gpt-5.1 | $1.25 | $0.125 | $10.00 |
| gpt-5 | $1.25 | $0.125 | $10.00 |
| gpt-5-mini | $0.25 | $0.03 | $2.00 |
| gemini-3-pro-preview | $2.00 | $0.40 | $12.00 |
| gemini-3-flash-preview | $0.50 | $0.005 | $3.00 |
| deepseek-v3.2 | $0.28 | $0.03 | $0.42 |
| gpt-4.1 | $2.00 | $0.50 | $8.00 |
| gpt-4o | $2.50 | $1.25 | $10.00 |
| gemini-2.5-pro | $1.25 | $0.31 | $10.00 |
| gemini-2.5-flash | $0.30 | $0.08 | $2.50 |
| deepseek-r1-0528 | $1.35 | - | $5.40 |
| deepseek-v3.1 | $1.14 | - | $4.56 |
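To make the per-million-token arithmetic concrete, here is a small estimator seeded with a few rates from the table above. The token split between input, cached, and output in the example call is an assumption for illustration.

```python
# $/1M tokens as (input, cached, output), taken from the pricing table above.
RATES = {
    "gpt-5": (1.25, 0.125, 10.00),
    "gpt-5-mini": (0.25, 0.03, 2.00),
    "deepseek-v3.2": (0.28, 0.03, 0.42),
}

def estimate_cost(model: str, input_toks: int, cached_toks: int, output_toks: int) -> float:
    """Return the estimated charge in USD for one request's token counts."""
    inp, cached, out = RATES[model]
    return (input_toks * inp + cached_toks * cached + output_toks * out) / 1_000_000

# 1M fresh input tokens plus 500k output tokens on gpt-5:
# 1.0 * $1.25 + 0.5 * $10.00 = $6.25
print(round(estimate_cost("gpt-5", 1_000_000, 0, 500_000), 2))  # → 6.25
```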

Capabilities

| Model | Context Window | Max Output Tokens | Tool Call |
| --- | --- | --- | --- |
| gpt-5.2-codex | 400,000 | 128,000 | Yes |
| gpt-5.1-codex-max | 400,000 | 128,000 | Yes |
| gpt-5.1-codex | 400,000 | 128,000 | Yes |
| gpt-5.2 | 400,000 | 128,000 | Yes |
| gpt-5.1 | 400,000 | 128,000 | Yes |
| gpt-5 | 400,000 | 128,000 | Yes |
| gpt-5-mini | 272,000 | 128,000 | Yes |
| deepseek-v3.2 | 128,000 | 128,000 | Yes |
| gemini-3-pro-preview | 1,000,000 | 64,000 | Yes |
| gemini-3-flash-preview | 1,000,000 | 64,000 | Yes |
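A common use of these limits is a pre-flight check before sending a request. The numbers below come from the capabilities table; the helper itself is an illustrative sketch, not a Vivgrid API.

```python
# model: (context_window, max_output_tokens), from the capabilities table above.
CAPABILITIES = {
    "gpt-5": (400_000, 128_000),
    "gpt-5-mini": (272_000, 128_000),
    "gemini-3-pro-preview": (1_000_000, 64_000),
}

def fits(model: str, prompt_tokens: int, requested_output: int) -> bool:
    """True if the prompt plus requested completion fit the model's limits."""
    context, max_out = CAPABILITIES[model]
    return requested_output <= max_out and prompt_tokens + requested_output <= context

print(fits("gpt-5-mini", 200_000, 64_000))   # 264,000 <= 272,000 → True
print(fits("gpt-5-mini", 200_000, 100_000))  # 300,000 > 272,000 → False
```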

Service Regions & Geo-Distributed Acceleration

Vivgrid accelerates model inference by automatically routing API requests to the nearest available compute region, minimizing latency and maximizing throughput while maintaining data-residency compliance for enterprise workloads. Unlike conventional global deployments on public clouds, which rely on a single centralized endpoint, Vivgrid's geo-distributed architecture continuously synchronizes AI Tools and model state across multiple data zones, so each region delivers optimized, low-latency performance.
| Model | Acceleration Mode | Accelerated Regions |
| --- | --- | --- |
| gpt-5.2-codex | ⚡ Geo-Distributed | AMER, EMEA |
| gpt-5.1-codex-max | ⚡ Geo-Distributed | AMER, EMEA |
| gpt-5.1-codex | ⚡ Geo-Distributed | AMER, EMEA, APAC |
| gpt-5.2 | ⚡ Geo-Distributed | AMER, EMEA |
| gpt-5.1 | ⚡ Geo-Distributed | AMER, EMEA |
| gpt-5 | ⚡ Geo-Distributed | AMER, EMEA, APAC |
| gpt-5-mini | ⚡ Geo-Distributed | AMER, EMEA, APAC |
| deepseek-v3.2 | 🌐 Global (Regional Host) | - |
| gemini-3-pro-preview | 🌐 Global (Centralized) | - |
| gemini-3-flash-preview | 🌐 Global (Centralized) | - |
| gpt-4.1 | ⚡ Geo-Distributed | AMER, EMEA, APAC |
| gpt-4o | ⚡ Geo-Distributed | AMER, EMEA, APAC |
| gemini-2.5-pro | 🌐 Global (Centralized) | - |
| gemini-2.5-flash | 🌐 Global (Centralized) | - |
| deepseek-r1-0528 | 🌐 Global (Regional Host) | - |
| deepseek-v3-0324 | 🌐 Global (Regional Host) | - |

Key Highlights

  • Dynamic Routing: Vivgrid automatically detects the user's region and routes requests to the nearest accelerated node, minimizing cross-continent latency.
  • Tool Synchronization: function-calling tools and context caches are replicated across all accelerated regions for consistent behavior.
  • Adaptive Caching: frequently accessed prompts and embeddings are regionally cached to reduce cold-start delays.
  • Seamless Fallback: traffic automatically re-routes to neighboring accelerated zones during high load or outages.

Notes on Global Models

For Global-only models (e.g., gemini-2.5-pro), the model host remains centralized under the provider's Global Standard endpoint, which limits Vivgrid's ability to perform regional acceleration.
In contrast, Geo-Distributed models (e.g., gpt-4o, gpt-5-mini, gpt-5) leverage Vivgrid's orchestration layer, delivering sub-50 ms latency worldwide through intelligent regional acceleration.