gemini-3-flash-preview: API, Pricing & Context Window | Vivgrid
gemini-3-flash-preview on Vivgrid: Google's fast multimodal Gemini 3 model with a 1M-token context window and very low cached pricing.
gemini-3-flash-preview is the fast, cost-efficient member of the Gemini 3 family. It accepts text, image, video, audio, and PDF within a 1,048,576-token (~1.05M) context window. Cached input costs $0.005 per 1M tokens, 1% of the standard input rate, which makes the model attractive for repeated-context workloads.
On Vivgrid it runs as a globally centralized model, reachable through the same unified API key as every other model in the catalog.
Specifications
| Specification | Value |
|---|---|
| Provider | Google |
| Model ID | gemini-3-flash-preview |
| Best for | General-purpose |
| Context window | 1,048,576 tokens |
| Max output | 64,000 tokens |
| Modalities | Text, Image, Video, Audio, PDF |
| Tool / function calling | Yes |
| Knowledge cutoff | 2025-01 |
| Acceleration | Global (Centralized) |
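The context-window and max-output limits in the table can be checked before a request is sent. The following is a minimal sketch, not part of the Vivgrid API: `check_budget` is a hypothetical helper, and it assumes (the source does not say) that the prompt and the generated output share the same context window, which is the conservative reading.

```python
CONTEXT_WINDOW = 1_048_576  # tokens, from the Specifications table
MAX_OUTPUT = 64_000         # tokens, from the Specifications table


def check_budget(prompt_tokens: int, requested_output: int) -> int:
    """Return the output-token cap to request, or raise if the prompt cannot fit.

    Assumption (not stated in the source): prompt and output share the
    context window, so the cap is also limited by the space left after
    the prompt.
    """
    if prompt_tokens < 0 or requested_output < 0:
        raise ValueError("token counts must be non-negative")
    if prompt_tokens >= CONTEXT_WINDOW:
        raise ValueError("prompt does not fit in the context window")
    remaining = CONTEXT_WINDOW - prompt_tokens
    # The smallest of the three limits wins.
    return min(requested_output, MAX_OUTPUT, remaining)
```

For example, a 1,000-token prompt asking for 100,000 output tokens would be capped at 64,000, while a 1,048,000-token prompt leaves room for only 576 output tokens under this assumption.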
Pricing
Pricing in USD per 1M tokens, matching the provider's rates.
| Input | Cached input | Output |
|---|---|---|
| $0.50 | $0.005 | $3.00 |
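To see what the cached rate means in practice, the rates above can be plugged into a small estimator. This is a rough sketch: `estimate_cost` is a hypothetical helper, and it assumes that cached prompt tokens are billed at the cached rate instead of the regular input rate, which the table implies but does not state. Actual billing may differ.

```python
from decimal import Decimal

# USD per 1M tokens, copied from the pricing table.
PRICE_INPUT = Decimal("0.50")
PRICE_CACHED_INPUT = Decimal("0.005")
PRICE_OUTPUT = Decimal("3.00")
PER_MILLION = Decimal(1_000_000)


def estimate_cost(input_tokens: int, cached_tokens: int, output_tokens: int) -> Decimal:
    """Estimate the cost of one request in USD.

    input_tokens:  prompt tokens billed at the regular input rate
    cached_tokens: prompt tokens served from cache (assumed to be billed
                   at the cached rate only)
    output_tokens: generated tokens
    """
    total = (
        input_tokens * PRICE_INPUT
        + cached_tokens * PRICE_CACHED_INPUT
        + output_tokens * PRICE_OUTPUT
    )
    return total / PER_MILLION


# A 200,000-token document plus a 1,000-token question, 500 output tokens.
uncached = estimate_cost(201_000, 0, 500)    # document sent at the full rate
cached = estimate_cost(1_000, 200_000, 500)  # document served from cache
```

Under these assumptions the uncached request comes to about $0.102 and the cached one to about $0.003, so the saving grows with how much of the prompt is reused.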
Quick start
Call gemini-3-flash-preview through Vivgrid's unified, OpenAI-compatible endpoint. Get an API key from the Vivgrid Console.
curl https://api.vivgrid.com/v1/chat/completions \
  -H "Authorization: Bearer $VIVGRID_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gemini-3-flash-preview",
    "messages": [
      { "role": "user", "content": "Say hello in English, Chinese and Spanish." }
    ],
    "stream": true
  }'

Ideal use cases
- High-volume multimodal agents
- Repeated-context workflows that benefit from cheap cached input
- Media ingestion and summarization at scale
- Fast assistants needing large context
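Fast assistants usually display text as it arrives, which is why the Quick start request sets `"stream": true`. The sketch below shows one way to read such a stream. It assumes the endpoint emits OpenAI-style server-sent events (`data: {...}` lines ending with `data: [DONE]`); the page only says the endpoint is OpenAI-compatible, so the exact chunk shape is an assumption. `iter_stream_text` and the sample lines are illustrative, not captured output.

```python
import json
from typing import Iterable, Iterator


def iter_stream_text(lines: Iterable[str]) -> Iterator[str]:
    """Yield text fragments from OpenAI-style server-sent event lines."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separator lines and SSE comments
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break  # end-of-stream sentinel
        chunk = json.loads(payload)
        for choice in chunk.get("choices", []):
            text = (choice.get("delta") or {}).get("content")
            if text:
                yield text


# Hand-written sample in the assumed chunk format.
sample = [
    'data: {"choices":[{"delta":{"role":"assistant","content":"Hello"}}]}',
    "",
    'data: {"choices":[{"delta":{"content":", 你好"}}]}',
    'data: {"choices":[{"delta":{"content":", hola"}}]}',
    "data: [DONE]",
]
print("".join(iter_stream_text(sample)))  # prints: Hello, 你好, hola
```

In real use the lines would come from the HTTP response body of the streaming request, decoded as UTF-8, and each fragment would be shown to the user as soon as it is yielded.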
Related models
- gemini-3.5-flash: the newer flash generation
- gemini-3-pro-preview: the pro-tier sibling
- gemini-2.5-flash: prior flash model