POST /v1/chat/completions
Chat response generation with any model
Request body
The namespaced model ID, e.g. anthropic/claude-sonnet-4-6. See Chat models.
Conversation messages.
Between 0 and 2. Higher = more creative, lower = more deterministic.
Between 0 and 1. Nucleus sampling. Alternative to temperature.
Maximum tokens to generate. Default varies by model.
If true, responds with Server-Sent Events. See the Streaming section below.
Sequences that end generation.
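As a rough illustration of the request body, the sketch below types the fields and checks the documented ranges locally before sending. The field names (model, messages, temperature, top_p, max_tokens, stream, stop) and the message roles follow the common OpenAI-style chat schema and are assumptions here; only stream and temperature are named on this page. validateChatRequest is a hypothetical helper, not part of the API.

```typescript
// Assumed field names (OpenAI-style); this page only describes what each field means.
interface ChatMessage {
  role: "system" | "user" | "assistant"; // assumed role set
  content: string;
}

interface ChatRequest {
  model: string;          // namespaced model ID, e.g. anthropic/claude-sonnet-4-6
  messages: ChatMessage[]; // conversation messages
  temperature?: number;   // between 0 and 2
  top_p?: number;         // between 0 and 1 (nucleus sampling)
  max_tokens?: number;    // maximum tokens to generate
  stream?: boolean;       // true = Server-Sent Events
  stop?: string[];        // sequences that end generation
}

// Hypothetical local check: returns a list of problems, empty when the body looks fine.
function validateChatRequest(req: ChatRequest): string[] {
  const problems: string[] = [];
  if (req.model.indexOf("/") === -1) {
    problems.push("model should be namespaced, e.g. anthropic/claude-sonnet-4-6");
  }
  if (req.temperature !== undefined && (req.temperature < 0 || req.temperature > 2)) {
    problems.push("temperature must be between 0 and 2");
  }
  if (req.top_p !== undefined && (req.top_p < 0 || req.top_p > 1)) {
    problems.push("top_p must be between 0 and 1");
  }
  return problems;
}

const exampleRequest: ChatRequest = {
  model: "anthropic/claude-sonnet-4-6",
  messages: [{ role: "user", content: "Say hello in one sentence." }],
  temperature: 0.7,
  max_tokens: 64,
};
```

Serializing exampleRequest with JSON.stringify gives a body that could be sent to POST /v1/chat/completions. The server remains the source of truth for validation; the local check only catches obvious range mistakes early.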
Response (non-streaming)
Your request_id (format req_<24hex>). Useful for tracing.
Always "chat.completion".
Unix timestamp.
The namespaced model ID (e.g. anthropic/claude-sonnet-4-6).
Array with a single element (n > 1 not supported yet).
Tokens consumed. Charged at request completion.
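The response fields above could be modeled roughly as follows. The property names (id, object, created, model, choices, usage), the shape of each choice, and the assumption that the timestamp is in seconds are borrowed from the OpenAI-style schema and are not confirmed by this page; isRequestId and firstChoiceText are hypothetical helpers.

```typescript
// Assumed response shape (OpenAI-style names); only the meanings come from this page.
interface ChatCompletion {
  id: string;                    // request_id, format req_<24hex>
  object: "chat.completion";     // always this literal
  created: number;               // Unix timestamp (assumed to be in seconds)
  model: string;                 // namespaced model ID
  choices: { message: { role: string; content: string } }[]; // single element
  usage: Record<string, number>; // tokens consumed; exact keys are not documented here
}

// Checks the documented req_<24hex> format (case-insensitive hex is an assumption).
function isRequestId(id: string): boolean {
  return /^req_[0-9a-f]{24}$/i.test(id);
}

// Returns the text of the only choice, or undefined if the array is empty.
function firstChoiceText(resp: ChatCompletion): string | undefined {
  const first = resp.choices[0];
  return first === undefined ? undefined : first.message.content;
}

// Hand-written sample for illustration; not captured from a real call.
const sampleCompletion: ChatCompletion = {
  id: "req_0123456789abcdef01234567",
  object: "chat.completion",
  created: 1700000000,
  model: "anthropic/claude-sonnet-4-6",
  choices: [{ message: { role: "assistant", content: "Hello!" } }],
  usage: { total_tokens: 12 },
};
```

Logging the id alongside your own request logs makes it easier to trace a call later.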
Streaming
For real-time responses, send "stream": true. You’ll receive Server-Sent Events.
Each data: line carries a JSON object whose delta.content is the next text fragment (it can be a word, a syllable, or even a single character).
TypeScript parser
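A minimal sketch of such a parser, written as a pure function so it can be fed text from any source. It assumes each event's JSON nests the fragment at choices[0].delta.content and that the stream ends with a data: [DONE] sentinel; both follow the OpenAI-style convention and are not confirmed by this page. parseSseBuffer is a hypothetical name.

```typescript
interface ParsedSse {
  fragments: string[]; // text fragments found in complete lines
  rest: string;        // trailing incomplete line, to prepend to the next chunk
  done: boolean;       // true once the assumed [DONE] sentinel is seen
}

function parseSseBuffer(buffer: string): ParsedSse {
  const fragments: string[] = [];
  let done = false;
  const lines = buffer.split("\n");
  // The last element is either "" or a partial line; keep it for the next call.
  const rest = lines.pop() ?? "";
  for (const rawLine of lines) {
    const line = rawLine.trim();
    if (!line.startsWith("data:")) continue; // skip blank lines and other SSE fields
    const payload = line.slice("data:".length).trim();
    if (payload === "[DONE]") { // assumed end-of-stream marker
      done = true;
      continue;
    }
    try {
      // Assumed event shape: { choices: [{ delta: { content } }] }
      const event = JSON.parse(payload) as {
        choices?: { delta?: { content?: string } }[];
      };
      const text = event.choices?.[0]?.delta?.content;
      if (typeof text === "string") fragments.push(text);
    } catch {
      // Ignore lines whose payload is not valid JSON.
    }
  }
  return { fragments, rest, done };
}
```

In practice you would decode each network chunk to text, call parseSseBuffer(rest + chunk), append the fragments to the output, and carry rest over to the next call until done is true.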
Examples per provider
Common errors
See Errors for the full catalog. The most frequent in chat:
400 invalid_request_error: malformed body (Zod validation names the offending field in message)
402 insufficient_balance: no balance
404 model_not_found: invalid model ID (probably a missing namespace)
502 provider_unavailable: the provider rejected or failed the request (sometimes because of a rejected prompt)
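One way a client might branch on these errors is sketched below. The four codes come from the list above; the retry guidance and hint texts are editorial judgment rather than documented behavior, and adviseOnChatError is a hypothetical helper. How the code is carried in the error body is not specified here, so the function takes the status and code as plain arguments.

```typescript
interface ErrorAdvice {
  retryable: boolean; // whether resending the same request might succeed (a judgment call)
  hint: string;       // suggested next step for the caller
}

function adviseOnChatError(status: number, code: string): ErrorAdvice {
  switch (code) {
    case "invalid_request_error": // 400
      return { retryable: false, hint: "Fix the field named in message, then resend." };
    case "insufficient_balance": // 402
      return { retryable: false, hint: "Add balance before retrying." };
    case "model_not_found": // 404
      return { retryable: false, hint: "Check the model ID, including its namespace." };
    case "provider_unavailable": // 502
      // May be transient, but a rejected prompt will fail again unchanged.
      return { retryable: true, hint: "Retry with backoff; if it persists, review the prompt." };
    default:
      // Unknown codes: treat 5xx as possibly transient, everything else as final.
      return { retryable: status >= 500, hint: "See Errors for the full catalog." };
  }
}
```

Capping the number of retries for provider_unavailable avoids looping on a prompt the provider will keep rejecting.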