deepseek-v4-flash: API, Pricing & Context Window | Vivgrid
deepseek-v4-flash on Vivgrid: DeepSeek's fast, ultra-affordable model with a 1M-token context window and up to 384K output tokens.
deepseek-v4-flash is the fast, ultra-affordable member of the DeepSeek V4 family. It keeps the family's 1M-token context window and 384K-token maximum output while pricing input and output tokens at a fraction of what frontier models charge.
Vivgrid serves deepseek-v4-flash through its unified, OpenAI-compatible API, making it a compelling default for high-volume, cost-sensitive workloads.
Specifications
| Field | Value |
|---|---|
| Provider | DeepSeek |
| Model ID | deepseek-v4-flash |
| Best for | General-purpose |
| Context window | 1,000,000 tokens |
| Max output | 384,000 tokens |
| Modalities | Text |
| Tool / function calling | Yes |
| Knowledge cutoff | 2025-05 |
| Acceleration | Global (Centralized) |
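The context window and maximum output figures in the table interact when prompts get long. The sketch below is a rough illustration, not Vivgrid's documented behavior: it assumes that prompt and output tokens share the 1,000,000-token window, which is common for chat APIs but is not stated on this page. The function name `max_output_budget` is hypothetical.

```python
CONTEXT_WINDOW_TOKENS = 1_000_000  # from the specifications table
MAX_OUTPUT_TOKENS = 384_000        # from the specifications table


def max_output_budget(prompt_tokens: int) -> int:
    """Return the largest output-token limit a prompt of this size leaves room for.

    Assumption (not confirmed by the page): prompt and output tokens
    share the same context window.
    """
    if prompt_tokens < 0:
        raise ValueError("prompt_tokens must be non-negative")
    remaining = CONTEXT_WINDOW_TOKENS - prompt_tokens
    # The budget is capped by the model's maximum output and never negative.
    return max(0, min(MAX_OUTPUT_TOKENS, remaining))
```

Under that assumption, a prompt of up to 616,000 tokens still leaves room for the full 384,000-token output, while an 800,000-token prompt leaves room for 200,000.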
Pricing
Prices are in USD per 1M tokens and match the provider's rates.
| Input | Cached input | Output |
|---|---|---|
| $0.15 | $0.03 | $0.28 |
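To make the per-million rates concrete, here is a small cost estimator built on the three prices in the table. It is a sketch under one assumption that the page does not spell out: cached input tokens are treated as a subset of the input tokens and billed at the cached rate instead of the regular input rate. The helper name `estimate_cost_usd` is illustrative.

```python
from decimal import Decimal

# USD per 1M tokens, copied from the pricing table.
PRICE_PER_MILLION = {
    "input": Decimal("0.15"),
    "cached_input": Decimal("0.03"),
    "output": Decimal("0.28"),
}


def estimate_cost_usd(
    input_tokens: int, output_tokens: int, cached_input_tokens: int = 0
) -> Decimal:
    """Estimate the cost of one request in USD.

    Assumption: cached_input_tokens is the part of input_tokens that was
    served from cache and is billed at the cached rate instead.
    """
    if output_tokens < 0:
        raise ValueError("output_tokens must be non-negative")
    if not 0 <= cached_input_tokens <= input_tokens:
        raise ValueError("cached_input_tokens must be between 0 and input_tokens")
    uncached_tokens = input_tokens - cached_input_tokens
    total_per_million = (
        uncached_tokens * PRICE_PER_MILLION["input"]
        + cached_input_tokens * PRICE_PER_MILLION["cached_input"]
        + output_tokens * PRICE_PER_MILLION["output"]
    )
    return total_per_million / Decimal(1_000_000)
```

For example, a request with 200,000 input tokens (150,000 of them cached) and 4,000 output tokens comes to (50,000 × 0.15 + 150,000 × 0.03 + 4,000 × 0.28) / 1,000,000 = $0.01312 under this billing assumption.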
Quick start
Call deepseek-v4-flash through Vivgrid's unified, OpenAI-compatible endpoint. Get an API key from the Vivgrid Console.
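The curl request in this section sets `"stream": true`. Because the endpoint is described as OpenAI-compatible, the response would presumably arrive as server-sent events in the OpenAI format. The Python sketch below builds the same request with the standard library and extracts the streamed text. It assumes (this page does not confirm it) that each event is a `data: {json}` line carrying text in `choices[0].delta.content` and that the stream ends with `data: [DONE]`. The helper names `build_request` and `iter_stream_text` are illustrative.

```python
import json
import urllib.request
from typing import Iterable, Iterator

API_URL = "https://api.vivgrid.com/v1/chat/completions"  # same URL as the curl call


def build_request(api_key: str, prompt: str) -> urllib.request.Request:
    """Build (but do not send) the streaming request made by the curl call."""
    payload = {
        "model": "deepseek-v4-flash",
        "messages": [{"role": "user", "content": prompt}],
        "stream": True,
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def iter_stream_text(lines: Iterable[bytes]) -> Iterator[str]:
    """Yield text fragments from OpenAI-style server-sent event lines.

    Assumption: events look like 'data: {json}' and end with 'data: [DONE]'.
    """
    for raw in lines:
        line = raw.decode("utf-8").strip()
        if not line.startswith("data:"):
            continue  # skip blank separator lines and SSE comments
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            break
        choices = json.loads(data).get("choices") or []
        if choices:
            text = (choices[0].get("delta") or {}).get("content")
            if text:
                yield text


# Usage sketch (performs a real network call, so it is left commented out):
# import os
# request = build_request(os.environ["VIVGRID_API_KEY"], "Say hello in English.")
# with urllib.request.urlopen(request) as response:
#     for piece in iter_stream_text(response):
#         print(piece, end="", flush=True)
```

Keeping the parser separate from the network call means it can be exercised against recorded lines before any API key is involved.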
curl https://api.vivgrid.com/v1/chat/completions \
  -H "Authorization: Bearer $VIVGRID_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "deepseek-v4-flash",
    "messages": [
      { "role": "user", "content": "Say hello in English, Chinese and Spanish." }
    ],
    "stream": true
  }'
Ideal use cases
- Very high-volume, cost-sensitive agent traffic
- Long-context summarization and extraction
- Large-output generation at minimal cost
- First-pass steps in multi-model pipelines
Related models
- deepseek-v4-pro: the flagship V4 model
- deepseek-v3.2: prior-generation model
- gpt-5.4-nano: comparable ultra-low-cost option