Vivgrid provides access to a range of powerful AI models for building enterprise-grade AI agents. We select models based on their performance, cost-effectiveness, and suitability for various tasks. Pricing is transparent and matches the rates of the original providers.

Supported Models

Coding Models

  • gpt-5.2-codex
  • gpt-5.1-codex-max
  • gpt-5.1-codex

Agent Model

  • gpt-5.2
  • gpt-5.1
  • gpt-5
  • gpt-5-mini
  • deepseek-v3.2
  • gemini-3-pro-preview
  • gemini-3-flash-preview

Model API

  • gpt-4.1
  • gpt-4o
  • gemini-2.5-pro
  • gemini-2.5-flash
  • deepseek-r1-0528
  • deepseek-v3.1

How to Set Models

You don't need to specify a model name in your API calls. The model for your agent is managed on the backend, so switching models doesn't require any code changes. To change your agent's model, go to the Agent Settings page in the Vivgrid Console.
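As a minimal sketch of what this looks like in practice, the request below builds a chat payload with no `model` field. The endpoint URL and header names are assumptions (an OpenAI-compatible style); check the Vivgrid Console for your actual values.

```python
import json

def build_chat_request(api_key: str, messages: list[dict]) -> dict:
    """Describe an HTTP request to a Vivgrid-managed agent.

    Note: no "model" field is included. The model is selected by the
    agent's backend configuration, so switching models needs no code change.
    """
    return {
        "url": "https://api.vivgrid.com/v1/chat/completions",  # assumed endpoint
        "headers": {
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        "body": json.dumps({"messages": messages}),
    }

req = build_chat_request("YOUR_API_KEY", [{"role": "user", "content": "Hello"}])
print("model" in json.loads(req["body"]))  # False: the payload carries no model name
```

Because the model is bound to the agent rather than the request, the same client code keeps working when you change models in Agent Settings.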

Pricing

Pricing is calculated in USD per 1 million tokens. The table below details the cost for input, cached, and output tokens for each model.
| Model | Input Token | Cached Token | Output Token |
| --- | --- | --- | --- |
| gpt-5.2-codex | $1.75 | $0.175 | $14.00 |
| gpt-5.1-codex-max | $1.25 | $0.125 | $10.00 |
| gpt-5.1-codex | $1.25 | $0.125 | $10.00 |
| gpt-5.2 | $1.75 | $0.175 | $14.00 |
| gpt-5.1 | $1.25 | $0.125 | $10.00 |
| gpt-5 | $1.25 | $0.125 | $10.00 |
| gpt-5-mini | $0.25 | $0.03 | $2.00 |
| gemini-3-pro-preview | $2.00 | $0.40 | $12.00 |
| gemini-3-flash-preview | $0.50 | $0.005 | $3.00 |
| deepseek-v3.2 | $0.28 | $0.03 | $0.42 |
| gpt-4.1 | $2.00 | $0.50 | $8.00 |
| gpt-4o | $2.50 | $1.25 | $10.00 |
| gemini-2.5-pro | $1.25 | $0.31 | $10.00 |
| gemini-2.5-flash | $0.30 | $0.08 | $2.50 |
| deepseek-r1-0528 | $1.35 | - | $5.40 |
| deepseek-v3.1 | $1.14 | - | $4.56 |
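To make the per-million-token arithmetic concrete, here is a small estimator seeded with a few rates from the table above. The token split between input, cached, and output in the example call is an assumption for illustration.

```python
# $/1M tokens as (input, cached, output), taken from the pricing table above.
RATES = {
    "gpt-5": (1.25, 0.125, 10.00),
    "gpt-5-mini": (0.25, 0.03, 2.00),
    "deepseek-v3.2": (0.28, 0.03, 0.42),
}

def estimate_cost(model: str, input_toks: int, cached_toks: int, output_toks: int) -> float:
    """Return the estimated charge in USD for one request's token counts."""
    inp, cached, out = RATES[model]
    return (input_toks * inp + cached_toks * cached + output_toks * out) / 1_000_000

# 1M fresh input tokens plus 500k output tokens on gpt-5:
# 1.0 * $1.25 + 0.5 * $10.00 = $6.25
print(round(estimate_cost("gpt-5", 1_000_000, 0, 500_000), 2))  # → 6.25
```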

Capabilities

| Model | Context Window | Max Output Tokens | Tool Call |
| --- | --- | --- | --- |
| gpt-5.2-codex | 400,000 | 128,000 | Yes |
| gpt-5.1-codex-max | 400,000 | 128,000 | Yes |
| gpt-5.1-codex | 400,000 | 128,000 | Yes |
| gpt-5.2 | 400,000 | 128,000 | Yes |
| gpt-5.1 | 400,000 | 128,000 | Yes |
| gpt-5 | 400,000 | 128,000 | Yes |
| gpt-5-mini | 272,000 | 128,000 | Yes |
| deepseek-v3.2 | 128,000 | 128,000 | Yes |
| gemini-3-pro-preview | 1,000,000 | 64,000 | Yes |
| gemini-3-flash-preview | 1,000,000 | 64,000 | Yes |
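A common use of these limits is a pre-flight check before sending a request. The numbers below come from the capabilities table; the helper itself is an illustrative sketch, not a Vivgrid API.

```python
# model: (context_window, max_output_tokens), from the capabilities table above.
CAPABILITIES = {
    "gpt-5": (400_000, 128_000),
    "gpt-5-mini": (272_000, 128_000),
    "gemini-3-pro-preview": (1_000_000, 64_000),
}

def fits(model: str, prompt_tokens: int, requested_output: int) -> bool:
    """True if the prompt plus requested completion fit the model's limits."""
    context, max_out = CAPABILITIES[model]
    return requested_output <= max_out and prompt_tokens + requested_output <= context

print(fits("gpt-5-mini", 200_000, 64_000))   # 264,000 <= 272,000 → True
print(fits("gpt-5-mini", 200_000, 100_000))  # 300,000 > 272,000 → False
```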

Service Regions & Geo-Distributed Acceleration

Vivgrid accelerates model inference by automatically routing API requests to the nearest available compute region, minimizing latency and maximizing throughput while maintaining data-residency compliance for enterprise workloads. Unlike conventional global deployments on public clouds, which rely on a single centralized endpoint, Vivgrid's geo-distributed architecture continuously synchronizes AI Tools and model state across multiple data zones, so each region delivers optimized, low-latency performance.
| Model | Acceleration Mode | Accelerated Regions |
| --- | --- | --- |
| gpt-5.2-codex | ⚡ Geo-Distributed | AMER, EMEA |
| gpt-5.1-codex-max | ⚡ Geo-Distributed | AMER, EMEA |
| gpt-5.1-codex | ⚡ Geo-Distributed | AMER, EMEA, APAC |
| gpt-5.2 | ⚡ Geo-Distributed | AMER, EMEA |
| gpt-5.1 | ⚡ Geo-Distributed | AMER, EMEA |
| gpt-5 | ⚡ Geo-Distributed | AMER, EMEA, APAC |
| gpt-5-mini | ⚡ Geo-Distributed | AMER, EMEA, APAC |
| deepseek-v3.2 | 🌐 Global (Regional Host) | - |
| gemini-3-pro-preview | 🌐 Global (Centralized) | - |
| gemini-3-flash-preview | 🌐 Global (Centralized) | - |
| gpt-4.1 | ⚡ Geo-Distributed | AMER, EMEA, APAC |
| gpt-4o | ⚡ Geo-Distributed | AMER, EMEA, APAC |
| gemini-2.5-pro | 🌐 Global (Centralized) | - |
| gemini-2.5-flash | 🌐 Global (Centralized) | - |
| deepseek-r1-0528 | 🌐 Global (Regional Host) | - |
| deepseek-v3-0324 | 🌐 Global (Regional Host) | - |

Key Highlights

  • Dynamic Routing: Vivgrid automatically detects the user's region and routes requests to the nearest accelerated node, minimizing cross-continent latency.
  • Tool Synchronization: function-calling tools and context caches are replicated across all accelerated regions for consistent behavior.
  • Adaptive Caching: frequently accessed prompts and embeddings are regionally cached to reduce cold-start delays.
  • Seamless Fallback: traffic automatically re-routes to neighboring accelerated zones during high load or outages.

Notes on Global Models

For Global-only models (e.g., gemini-2.5-pro), the model host remains centralized under the provider's Global Standard endpoint, which limits Vivgrid's ability to perform regional acceleration.
In contrast, Geo-Distributed models (e.g., gpt-4o, gpt-5-mini, gpt-5) leverage Vivgrid's orchestration layer, delivering sub-50 ms latency worldwide through intelligent regional acceleration.