grok-4-1-fast-non-reasoning: API, Pricing & Context Window | Vivgrid
grok-4-1-fast-non-reasoning on Vivgrid: xAI's low-latency model with a 2M-token context window and image input, via one unified API.
grok-4-1-fast-non-reasoning is the low-latency variant of xAI's Grok 4.1 fast line, optimized for quick responses without extended reasoning. It keeps the line's 2M-token context window and image input while prioritizing speed for high-throughput workloads.
Vivgrid serves grok-4-1-fast-non-reasoning through its unified, OpenAI-compatible API and single billing surface.
Specifications
| Provider | xAI |
| Model ID | grok-4-1-fast-non-reasoning |
| Best for | General-purpose |
| Context window | 2,000,000 tokens |
| Modalities | Text, Image |
| Tool / function calling | Yes |
| Knowledge cutoff | 2025-07 |
| Acceleration | Global (Centralized) |
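The table lists image input and tool calling, and the API is described as OpenAI-compatible. As a rough sketch of what such a request body might look like, the snippet below follows the OpenAI chat-completions conventions for image parts and tool definitions. Whether Vivgrid accepts these fields exactly as shown is an assumption, and `image_message`, `build_request`, and the `get_weather` tool are hypothetical names used only for illustration.

```python
import base64

MODEL = "grok-4-1-fast-non-reasoning"  # model ID from the table above

# Hypothetical tool definition in the OpenAI "tools" format (illustration only).
WEATHER_TOOL = {
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Look up the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}


def image_message(prompt, image_bytes, mime_type="image/png"):
    """Build a user message that pairs text with an inline base64 image.

    Assumes the OpenAI-style content-part layout ("text" and "image_url").
    """
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return {
        "role": "user",
        "content": [
            {"type": "text", "text": prompt},
            {
                "type": "image_url",
                "image_url": {"url": f"data:{mime_type};base64,{encoded}"},
            },
        ],
    }


def build_request(prompt, image_bytes, tools=None):
    """Assemble a chat-completions body; 'tools' is included only if given."""
    body = {"model": MODEL, "messages": [image_message(prompt, image_bytes)]}
    if tools:
        body["tools"] = list(tools)
    return body
```

A call such as `build_request("What is in this picture?", open("photo.png", "rb").read(), tools=[WEATHER_TOOL])` yields a dictionary that can be serialized with `json.dumps` and sent to the chat-completions endpoint.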
Pricing
Pricing for grok-4-1-fast-non-reasoning matches the provider's published rates. See live pricing in the Vivgrid Console.
Quick start
Call grok-4-1-fast-non-reasoning through Vivgrid's unified, OpenAI-compatible endpoint. Get an API key from the Vivgrid Console.
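For Python callers, the request this section makes with curl could be sketched as follows, using only the standard library. The sketch assumes the endpoint streams OpenAI-style server-sent events (`data: {...}` lines with the text under `choices[0].delta.content`, ending with `data: [DONE]`), which this page does not spell out. `build_payload`, `parse_sse_line`, and `stream_chat` are illustrative names, not part of any Vivgrid SDK.

```python
import json
import os
import urllib.request

API_URL = "https://api.vivgrid.com/v1/chat/completions"  # endpoint from this section
MODEL = "grok-4-1-fast-non-reasoning"


def build_payload(prompt, stream=True):
    """Build the same JSON body that the curl request sends."""
    return {
        "model": MODEL,
        "messages": [{"role": "user", "content": prompt}],
        "stream": stream,
    }


def parse_sse_line(line):
    """Return the text fragment carried by one SSE line, or None.

    Assumption: chunks follow the OpenAI streaming format.
    """
    line = line.strip()
    if not line.startswith("data:"):
        return None  # blank lines, comments, and other SSE fields
    data = line[len("data:"):].strip()
    if not data or data == "[DONE]":
        return None  # end-of-stream sentinel
    chunk = json.loads(data)
    choices = chunk.get("choices") or []
    if not choices:
        return None
    return (choices[0].get("delta") or {}).get("content")


def stream_chat(prompt, api_key=None):
    """Yield text fragments as they arrive. Performs a network call."""
    api_key = api_key or os.environ["VIVGRID_API_KEY"]
    request = urllib.request.Request(
        API_URL,
        data=json.dumps(build_payload(prompt)).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        for raw in response:  # the response object iterates line by line
            text = parse_sse_line(raw.decode("utf-8"))
            if text:
                yield text
```

To print a reply as it streams: `for piece in stream_chat("Say hello in English, Chinese and Spanish."): print(piece, end="", flush=True)`.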
curl https://api.vivgrid.com/v1/chat/completions \
  -H "Authorization: Bearer $VIVGRID_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "grok-4-1-fast-non-reasoning",
    "messages": [
      { "role": "user", "content": "Say hello in English, Chinese and Spanish." }
    ],
    "stream": true
  }'
Ideal use cases
- Latency-sensitive, high-volume traffic
- Large-context retrieval and summarization (up to 2M tokens)
- Fast classification and extraction
- Workloads where speed beats deep reasoning
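To make the classification use case concrete, here is a minimal sketch of a non-streaming request body and a strict parser for the reply. The label set, the prompt wording, and the helper names are hypothetical, and the response shape (`choices[0].message.content`) and the `temperature` field are assumed from the OpenAI chat-completions format rather than confirmed by this page.

```python
MODEL = "grok-4-1-fast-non-reasoning"
LABELS = ("billing", "bug", "other")  # hypothetical label set


def build_classification_request(text, labels=LABELS):
    """Build a non-streaming request that asks for exactly one label."""
    system = (
        "Classify the user's message. Reply with exactly one label from: "
        + ", ".join(labels)
        + "."
    )
    return {
        "model": MODEL,
        "messages": [
            {"role": "system", "content": system},
            {"role": "user", "content": text},
        ],
        "stream": False,
        "temperature": 0,  # assumption: the endpoint honors this OpenAI field
    }


def parse_label(response_body, labels=LABELS):
    """Return the label named in the reply, or None if it is not a clean match."""
    choices = response_body.get("choices") or []
    if not choices:
        return None
    content = (choices[0].get("message") or {}).get("content") or ""
    answer = content.strip().strip(".").lower()
    return answer if answer in labels else None
```

Returning `None` for anything other than an exact label keeps malformed replies out of downstream counts; a caller might retry or route those to a fallback.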
Related models
- grok-4-1-fast-reasoning: reasoning variant
- gpt-5-mini: fast, low-cost alternative
- gemini-2.5-flash: large-context flash model