grok-4-1-fast-non-reasoning — API, Pricing & Context Window

grok-4-1-fast-non-reasoning on Vivgrid: xAI's low-latency model with a 2M-token context window and image input, via one unified API.

grok-4-1-fast-non-reasoning is the low-latency variant of xAI's Grok 4.1 fast line. It skips extended reasoning to return answers quickly, while keeping the line's 2M-token context window and image input, which suits high-throughput workloads.

Vivgrid serves grok-4-1-fast-non-reasoning through its unified, OpenAI-compatible API and single billing surface.

Specifications

Provider	xAI
Model ID	`grok-4-1-fast-non-reasoning`
Best for	General-purpose
Context window	2,000,000 tokens
Modalities	Text, Image
Tool / function calling	Yes
Knowledge cutoff	2025-07
Acceleration	🌐 Global (Centralized)
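
The Modalities row lists image input alongside text. The sketch below shows one way a request body with an inline image might be built. It assumes Vivgrid accepts the OpenAI-style `image_url` content part with a base64 data URL, which this page does not confirm; `build_image_request` is a hypothetical helper name.

```python
import base64


def build_image_request(prompt: str, image_bytes: bytes,
                        mime_type: str = "image/png") -> dict:
    """Build a chat-completions body with one text part and one inline image.

    Assumption: the endpoint accepts OpenAI-style content parts.
    """
    encoded = base64.b64encode(image_bytes).decode("ascii")
    data_url = f"data:{mime_type};base64,{encoded}"
    return {
        "model": "grok-4-1-fast-non-reasoning",
        "messages": [
            {
                "role": "user",
                "content": [
                    {"type": "text", "text": prompt},
                    # Assumed part shape, borrowed from the OpenAI chat format.
                    {"type": "image_url", "image_url": {"url": data_url}},
                ],
            }
        ],
    }
```

The returned dictionary can be serialized with `json.dumps` and sent as the request body to the chat completions endpoint.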

Pricing

Pricing for grok-4-1-fast-non-reasoning matches the provider's published rates. See live pricing in the Vivgrid Console.

Quick start

Call grok-4-1-fast-non-reasoning through Vivgrid's unified, OpenAI-compatible endpoint. Get an API key from the Vivgrid Console.

curl https://api.vivgrid.com/v1/chat/completions \
  -H "Authorization: Bearer $VIVGRID_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "grok-4-1-fast-non-reasoning",
    "messages": [
      { "role": "user", "content": "Say hello in English, Chinese and Spanish." }
    ],
    "stream": true
  }'
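
The same streaming request can be sketched in Python with only the standard library. The URL, headers, and body mirror the curl call above. The parsing step assumes the stream follows the OpenAI convention of `data: {...}` server-sent event lines ending with `data: [DONE]`, which this page implies through "OpenAI-compatible" but does not spell out. `parse_sse_line` and `stream_chat` are hypothetical helper names.

```python
import json
import os
import urllib.request

API_URL = "https://api.vivgrid.com/v1/chat/completions"  # same URL as the curl call


def parse_sse_line(raw: bytes):
    """Return the text delta carried by one SSE line, or None if it has none."""
    line = raw.decode("utf-8").strip()
    if not line.startswith("data:"):
        return None  # blank separator or non-data line
    data = line[len("data:"):].strip()
    if data == "[DONE]":
        return None  # assumed end-of-stream sentinel (OpenAI convention)
    choices = json.loads(data).get("choices") or []
    if not choices:
        return None
    return (choices[0].get("delta") or {}).get("content")


def stream_chat(prompt: str):
    """Yield text deltas from a streaming chat completion."""
    body = json.dumps({
        "model": "grok-4-1-fast-non-reasoning",
        "messages": [{"role": "user", "content": prompt}],
        "stream": True,
    }).encode("utf-8")
    request = urllib.request.Request(
        API_URL,
        data=body,
        headers={
            "Authorization": f"Bearer {os.environ['VIVGRID_API_KEY']}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        for raw in response:  # the response object iterates line by line
            text = parse_sse_line(raw)
            if text:
                yield text
```

With `VIVGRID_API_KEY` set in the environment, iterating over `stream_chat("Say hello in English, Chinese and Spanish.")` and printing each piece would show the reply as it arrives.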

Ideal use cases

Latency-sensitive, high-volume traffic
Large-context retrieval and summarization (up to 2M tokens)
Fast classification and extraction
Workloads where speed beats deep reasoning
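
For the fast classification and extraction case, one option is the function calling that the specifications list as supported. The sketch below assumes the OpenAI-style `tools` and `tool_choice` request fields and the matching `tool_calls` response shape. The `record_ticket` tool, its fields, and both helper functions are hypothetical and exist only for illustration.

```python
import json

# Hypothetical tool schema; the name and fields are illustrative only.
TICKET_TOOL = {
    "type": "function",
    "function": {
        "name": "record_ticket",
        "description": "Record the category and urgency of a support ticket.",
        "parameters": {
            "type": "object",
            "properties": {
                "category": {"type": "string", "enum": ["billing", "bug", "other"]},
                "urgent": {"type": "boolean"},
            },
            "required": ["category", "urgent"],
        },
    },
}


def build_extraction_request(ticket_text: str) -> dict:
    """Build a request body that asks the model to call record_ticket."""
    return {
        "model": "grok-4-1-fast-non-reasoning",
        "messages": [{"role": "user", "content": ticket_text}],
        "tools": [TICKET_TOOL],
        # Assumed to be honored as in the OpenAI format: force this one tool.
        "tool_choice": {"type": "function", "function": {"name": "record_ticket"}},
    }


def read_tool_arguments(response: dict) -> dict:
    """Decode the arguments of the first tool call in a non-streaming response."""
    call = response["choices"][0]["message"]["tool_calls"][0]
    return json.loads(call["function"]["arguments"])
```

Forcing a single tool keeps the output machine-readable, so each response can be parsed without free-text handling. For high-volume traffic the request would be sent without `"stream": true` so that the full tool call arrives in one response.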

Related models

grok-4-1-fast-reasoning — reasoning variant
gpt-5-mini — fast, low-cost alternative
gemini-2.5-flash — large-context flash model
