LLM API pricing
Hosted model inference priced per million tokens.
9 models
| Model | Input ($ per 1M tokens) | Output ($ per 1M tokens) |
|---|---|---|
| | $0.075 | $0.3 |
| | $0.15 | $0.6 |
| mistral-small · Mistral AI | $0.2 | $0.6 |
| deepseek-chat · DeepSeek | $0.27 | $1.1 |
| | $0.8 | $4 |
| | $1.25 | $5 |
| mistral-large · Mistral AI | $2 | $6 |
| | $2.5 | $10 |
| | $3 | $15 |
Frequently asked questions
How is LLM API pricing calculated?
Models are priced per million tokens, billed separately for input (your prompt) and output (the response). Output usually costs 3–5× the input price, and every row in the table above falls in that range, so the blended cost depends on your prompt-to-response ratio.
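As a rough sketch of that arithmetic, the snippet below computes the cost of a single request from per-million-token prices. The function name `request_cost` and the token counts are illustrative assumptions; the prices are taken from the deepseek-chat row of the table above.

```python
def request_cost(input_tokens: int, output_tokens: int,
                 input_price_per_m: float, output_price_per_m: float) -> float:
    """Return the dollar cost of one request given per-million-token prices."""
    return (input_tokens * input_price_per_m
            + output_tokens * output_price_per_m) / 1_000_000

# Hypothetical workload: 2,000 prompt tokens and 500 response tokens,
# priced at $0.27 input / $1.1 output per million tokens (deepseek-chat row).
cost = request_cost(2_000, 500, 0.27, 1.1)
print(f"${cost:.6f} per request")
```

With these assumed token counts the request costs about $0.00109, and the 500 output tokens account for slightly more of that than the 2,000 input tokens.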
What is a context window?
The context window is the maximum number of tokens (input plus output) a model can consider at once. Larger windows (200K–2M tokens) let you pass more documents or conversation history, but every token you send is billed as input, so filling a large window raises the cost of each call.
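A minimal sketch of a pre-flight check, following the definition above in which input and output share one window. The helper name `fits_context` and all token counts are hypothetical; a real application would count tokens with the tokenizer for its chosen model.

```python
def fits_context(input_tokens: int, max_output_tokens: int,
                 context_window: int) -> bool:
    """True if the prompt plus the reserved response budget fits in the window."""
    return input_tokens + max_output_tokens <= context_window

# Hypothetical numbers: a 200K-token window and a 190K-token prompt.
small_reply = fits_context(190_000, 4_000, 200_000)   # 194,000 tokens needed
large_reply = fits_context(190_000, 16_000, 200_000)  # 206,000 tokens needed
print(small_reply, large_reply)
```

Reserving the output budget up front is a design choice: it rejects a request that would overflow the window before any tokens are billed.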
Which LLM API is cheapest?
The table above is sorted by input price, lowest first. Budget models such as GPT-4o-mini, Gemini Flash and DeepSeek are the cheapest per token; frontier models such as GPT-4o and Claude Sonnet cost more but are generally more capable.
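Because input and output are priced differently, the cheapest model for a given workload depends on its token mix. The sketch below ranks the three rows of the table above that have model names by a blended price; the `blended_price` helper and the 20% output share are assumptions for illustration only.

```python
# Prices in $ per million tokens (input, output), copied from the named rows
# of the table above.
PRICES = {
    "mistral-small": (0.2, 0.6),
    "deepseek-chat": (0.27, 1.1),
    "mistral-large": (2.0, 6.0),
}

def blended_price(input_price: float, output_price: float,
                  output_share: float) -> float:
    """Weighted $ per million tokens when `output_share` of all tokens are output."""
    return (1 - output_share) * input_price + output_share * output_price

# Hypothetical workload: 20% of all tokens are output.
ranked = sorted(PRICES, key=lambda name: blended_price(*PRICES[name], 0.2))
print(ranked)
```

Changing `output_share` to match your own traffic can reorder models whose input and output prices are close.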