deepseek-v4-flash — API, Pricing & Context Window

deepseek-v4-flash on Vivgrid: DeepSeek's fast, ultra-affordable model with a 1M-token context window and up to 384K output tokens.

deepseek-v4-flash is the fast, low-cost member of the DeepSeek V4 family. It keeps the family's 1M-token context window and 384K-token maximum output while charging $0.15 per 1M input tokens and $0.28 per 1M output tokens, a fraction of what frontier models cost.

Vivgrid serves deepseek-v4-flash through its unified, OpenAI-compatible API, making it a compelling default for high-volume, cost-sensitive workloads.

Specifications

Provider	DeepSeek
Model ID	`deepseek-v4-flash`
Best for	General-purpose
Context window	1,000,000 tokens
Max output	384,000 tokens
Modalities	Text
Tool / function calling	Yes
Knowledge cutoff	2025-05
Acceleration	🌐 Global (Centralized)

Pricing

Prices are in USD per 1M tokens and match DeepSeek's own rates.

Input	Cached input	Output
$0.15	$0.03	$0.28
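
As a rough illustration of how these rates translate into a bill, the sketch below estimates the cost of a single request. The function name `estimate_cost_usd` is hypothetical, and the billing model it uses (cached input tokens charged at the cached rate in place of the standard input rate) is an assumption rather than something taken from Vivgrid's billing documentation.

```python
# Rates in USD per 1M tokens, copied from the pricing table above.
INPUT_RATE = 0.15
CACHED_INPUT_RATE = 0.03
OUTPUT_RATE = 0.28


def estimate_cost_usd(input_tokens: int, output_tokens: int, cached_tokens: int = 0) -> float:
    """Estimate the cost of one request in USD.

    Assumption: cached_tokens is the portion of input_tokens served from
    cache, billed at the cached rate instead of the standard input rate.
    """
    if not 0 <= cached_tokens <= input_tokens:
        raise ValueError("cached_tokens must be between 0 and input_tokens")
    uncached_tokens = input_tokens - cached_tokens
    total = (
        uncached_tokens * INPUT_RATE
        + cached_tokens * CACHED_INPUT_RATE
        + output_tokens * OUTPUT_RATE
    )
    return total / 1_000_000


if __name__ == "__main__":
    # 200K prompt tokens, half of them cached, plus 10K output tokens.
    print(f"${estimate_cost_usd(200_000, 10_000, cached_tokens=100_000):.4f}")
```

Under these assumptions, a request that fills the entire 1M-token context window with uncached input costs about $0.15 before any output is generated.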

Quick start

Call deepseek-v4-flash through Vivgrid's unified, OpenAI-compatible endpoint. Get an API key from the Vivgrid Console.

curl https://api.vivgrid.com/v1/chat/completions \
  -H "Authorization: Bearer $VIVGRID_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "deepseek-v4-flash",
    "messages": [
      { "role": "user", "content": "Say hello in English, Chinese and Spanish." }
    ],
    "stream": true
  }'
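
The same request can be sketched in Python using only the standard library. This version sets `"stream": false` so the reply arrives as a single JSON body. The helper names `build_request` and `ask` are hypothetical, and the response parsing assumes the endpoint returns the usual OpenAI chat completions shape, with the reply text at `choices[0].message.content`.

```python
import json
import os
import urllib.request

# Endpoint taken from the curl example above.
API_URL = "https://api.vivgrid.com/v1/chat/completions"


def build_request(prompt: str, api_key: str) -> urllib.request.Request:
    """Build a non-streaming chat completion request for deepseek-v4-flash."""
    payload = {
        "model": "deepseek-v4-flash",
        "messages": [{"role": "user", "content": prompt}],
        "stream": False,  # ask for one JSON body instead of an event stream
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def ask(prompt: str) -> str:
    """Send the prompt and return the reply text.

    Assumption: the response follows the OpenAI chat completions shape,
    with the text at choices[0].message.content.
    """
    request = build_request(prompt, os.environ["VIVGRID_API_KEY"])
    with urllib.request.urlopen(request, timeout=60) as response:
        body = json.load(response)
    return body["choices"][0]["message"]["content"]
```

With `VIVGRID_API_KEY` set in the environment, calling `ask("Say hello in English, Chinese and Spanish.")` sends the same prompt as the curl example. Keeping request construction separate from the network call makes the payload easy to inspect or test offline.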

Ideal use cases

Very high-volume, cost-sensitive agent traffic
Long-context summarization and extraction
Large-output generation at minimal cost
First-pass steps in multi-model pipelines

Related models

deepseek-v4-pro — the flagship V4 model
deepseek-v3.2 — prior-generation model
gpt-5.4-nano — comparable ultra-low-cost option
