claude-haiku-4-5: API, Pricing & Context Window | Vivgrid
claude-haiku-4-5 on Vivgrid: Anthropic's fast, low-cost Haiku model with a 200K context window, Messages API, and geo-distributed acceleration.
claude-haiku-4-5 is Anthropic's fastest and most affordable current model, designed for high-volume, latency-sensitive workloads that still benefit from Claude's quality. It offers a 200K-token context window on the Messages API.
Vivgrid accelerates claude-haiku-4-5 across AMER and EMEA, making it an excellent default for real-time assistants and large-scale agent fleets.
Specifications
| Provider | Anthropic |
| Model ID | claude-haiku-4-5 |
| Best for | Coding |
| Context window | 200,000 tokens |
| Max output | 64,000 tokens |
| Modalities | Text, Image |
| Tool / function calling | Yes |
| Knowledge cutoff | 2025-02 |
| Acceleration | ⚡ Geo-Distributed: AMER, EMEA |
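The two token limits in the table can be checked before a request is sent. The sketch below is illustrative only: the function name `fits_budget` is hypothetical, and it assumes that the prompt and the requested output share the 200,000-token context window, which this page does not state explicitly.

```python
# Limits copied from the Specifications table above.
CONTEXT_WINDOW = 200_000  # total tokens
MAX_OUTPUT = 64_000       # maximum output tokens


def fits_budget(input_tokens: int, max_tokens: int) -> bool:
    """Return True if a request stays within both limits.

    Assumption (not confirmed by this page): input tokens and the
    requested output tokens count against the same context window.
    """
    if input_tokens < 0 or max_tokens <= 0:
        raise ValueError("input_tokens must be >= 0 and max_tokens must be > 0")
    within_output_cap = max_tokens <= MAX_OUTPUT
    within_window = input_tokens + max_tokens <= CONTEXT_WINDOW
    return within_output_cap and within_window


if __name__ == "__main__":
    print(fits_budget(150_000, 50_000))  # exactly fills the window
    print(fits_budget(150_000, 60_000))  # exceeds the window
```

Token counts would have to come from a tokenizer or from usage data returned by the API; the check itself is plain arithmetic.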
Pricing
Pricing in USD per 1M tokens, matching the provider's rates.
| Input | Cached input | Output |
|---|---|---|
| $1.00 | $0.10 | $5.00 |
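As a rough illustration of how these rates add up, the sketch below estimates the cost of a workload from token counts. The function name `estimate_cost_usd` is hypothetical. It assumes that cached input tokens are billed at the cached rate instead of the standard input rate, and it does not model any charge that is not listed in the table.

```python
# USD per 1M tokens, copied from the pricing table above.
PRICE_INPUT = 1.00
PRICE_CACHED_INPUT = 0.10
PRICE_OUTPUT = 5.00


def estimate_cost_usd(input_tokens: int, output_tokens: int,
                      cached_input_tokens: int = 0) -> float:
    """Estimate cost in USD.

    Assumption: `input_tokens` counts only uncached input, and cached
    input is billed at the cached rate rather than the input rate.
    """
    total = (input_tokens * PRICE_INPUT
             + cached_input_tokens * PRICE_CACHED_INPUT
             + output_tokens * PRICE_OUTPUT)
    return total / 1_000_000


if __name__ == "__main__":
    # 2M uncached input tokens and 500K output tokens:
    # 2 * $1.00 + 0.5 * $5.00 = $4.50
    print(f"${estimate_cost_usd(2_000_000, 500_000):.2f}")
```

Because output is priced at five times the input rate, capping `max_tokens` is usually the most direct lever on cost for short-answer workloads.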
Quick start
Call claude-haiku-4-5 through Vivgrid's unified, OpenAI-compatible endpoint. Get an API key from the Vivgrid Console.
curl https://api.vivgrid.com/v1/messages \
  -H "Authorization: Bearer $VIVGRID_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "claude-haiku-4-5",
    "max_tokens": 1024,
    "messages": [
      { "role": "user", "content": "Say hello in English, Chinese and Spanish." }
    ],
    "stream": true
  }'
Ideal use cases
- Real-time chat and assistant experiences
- High-throughput classification and extraction
- Cost-sensitive tool-calling agents
- Fast first-pass steps in larger agent pipelines
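For the classification use case, a request body could be assembled as in the following sketch. It reuses only the fields shown in the Quick start request (`model`, `max_tokens`, `messages`). The helper name `build_classification_request`, the prompt wording, and the default `max_tokens` value are illustrative assumptions, not part of Vivgrid's or Anthropic's API.

```python
import json


def build_classification_request(text: str, labels: list[str],
                                 max_tokens: int = 16) -> dict:
    """Build a request body asking the model to pick one label.

    Hypothetical helper: the prompt format is an assumption chosen
    for illustration, not a documented convention.
    """
    if not labels:
        raise ValueError("labels must not be empty")
    prompt = (
        "Classify the text into exactly one of these labels: "
        + ", ".join(labels)
        + ". Reply with the label only.\n\nText: "
        + text
    )
    return {
        "model": "claude-haiku-4-5",
        "max_tokens": max_tokens,  # kept small: only a label is expected
        "messages": [{"role": "user", "content": prompt}],
    }


if __name__ == "__main__":
    body = build_classification_request(
        "The app crashes on login.", ["bug", "feature", "question"]
    )
    # The JSON string can be sent as the -d payload of the curl command.
    print(json.dumps(body, indent=2))
```

Keeping `max_tokens` low bounds both latency and output cost per item, which matters most in high-throughput runs.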
Related models
- claude-sonnet-4-6: a step up in capability
- claude-opus-4-7: the frontier Opus model
- gpt-5-mini: a comparable fast, low-cost option