Syntax
The model field accepts either a string (a single model) or an array of 1 to 8 candidates, tried in order.
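As a rough client-side illustration of the two accepted shapes, the sketch below normalizes a model value into an ordered candidate list and enforces the 1 to 8 limit. The helper name normalize_model and the model identifiers are hypothetical placeholders, not part of the API; the service performs its own validation.

```python
from typing import List, Union

MAX_CANDIDATES = 8  # upper bound stated in this section


def normalize_model(model: Union[str, List[str]]) -> List[str]:
    """Return the ordered candidate list for a `model` value (hypothetical helper)."""
    # A plain string means exactly one candidate.
    candidates = [model] if isinstance(model, str) else list(model)
    if not 1 <= len(candidates) <= MAX_CANDIDATES:
        raise ValueError(f"expected 1 to {MAX_CANDIDATES} candidates, got {len(candidates)}")
    if not all(isinstance(m, str) and m for m in candidates):
        raise ValueError("every candidate must be a non-empty string")
    return candidates


# Placeholder model identifiers, for illustration only.
single = {"model": "provider-a/model-x"}
ordered = {"model": ["provider-a/model-x", "provider-b/model-y"]}

print(normalize_model(single["model"]))
print(normalize_model(ordered["model"]))
```

Order matters in the array form: the first entry is the preferred model and later entries are only tried when fallback triggers.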
When fallback triggers
| Case | Class | Fallback |
|---|---|---|
| HTTP 429 rate limit | provider | ✅ Yes |
| HTTP 500/502/503 provider down | provider | ✅ Yes |
| HTTP 408/504 timeout | provider | ✅ Yes |
| Network reset / connection refused | provider | ✅ Yes |
| Context window exceeded | provider | ✅ Yes |
| Content policy rejected the prompt | provider | ✅ Yes |
| HTTP 401/403 invalid auth | user | ❌ No |
| HTTP 400 malformed request | user | ❌ No |
| HTTP 402 insufficient balance | user | ❌ No |
| Unknown model | user | ❌ No |
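The table can be read as a small classification rule: provider-class failures move on to the next candidate, user-class failures do not. The sketch below encodes that rule for client-side reasoning or tests. The function names and the non-HTTP kind labels (network_error, context_window_exceeded, content_policy, unknown_model) are assumptions made for illustration; the service's internal identifiers may differ. Treating an unrecognized failure as "no fallback" is also an assumption, since the table does not cover it.

```python
from typing import Optional

# HTTP statuses taken from the table above.
PROVIDER_STATUSES = {408, 429, 500, 502, 503, 504}
USER_STATUSES = {400, 401, 402, 403}

# Hypothetical labels for the non-HTTP rows of the table.
PROVIDER_KINDS = {"network_error", "context_window_exceeded", "content_policy"}
USER_KINDS = {"unknown_model"}


def classify(status: Optional[int] = None, kind: Optional[str] = None) -> str:
    """Return 'provider', 'user', or 'unknown' for a failed attempt."""
    # Assumption: a recognized kind label takes precedence over the HTTP status.
    if kind in PROVIDER_KINDS:
        return "provider"
    if kind in USER_KINDS:
        return "user"
    if status in PROVIDER_STATUSES:
        return "provider"
    if status in USER_STATUSES:
        return "user"
    return "unknown"


def should_fallback(status: Optional[int] = None, kind: Optional[str] = None) -> bool:
    """Only provider-class failures move on to the next candidate."""
    return classify(status, kind) == "provider"


print(should_fallback(status=429))            # rate limit: provider class
print(should_fallback(status=401))            # invalid auth: user class
print(should_fallback(kind="content_policy"))  # policy rejection: provider class
```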
Pre-flight skip
If a candidate doesn’t support a required capability (zdr: true, response_format) or is blocked by the org’s ZDR config, it gets skipped rather than failing. Reasons appear in skipped:
- zdr_not_verified: no verified ZDR policy
- zdr_org_required: org requires ZDR and candidate isn’t verified
- structured_outputs_not_supported: no structured outputs support
- model_not_found, no_adapter: catalog or configuration issue
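To make the pre-flight step concrete, here is a minimal sketch of how candidates might be filtered before any provider call. Only the reason strings come from this page. The CatalogEntry fields, the preflight signature, the order in which checks run, and the shape of each skipped entry (a dict with model and reason) are all assumptions for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class CatalogEntry:
    """Hypothetical capability record for one model."""
    zdr_verified: bool
    structured_outputs: bool
    has_adapter: bool = True


def skip_reason(entry: Optional[CatalogEntry], zdr: bool, org_requires_zdr: bool,
                needs_structured_outputs: bool) -> Optional[str]:
    """Return a skip reason, or None if the candidate can be attempted."""
    if entry is None:
        return "model_not_found"
    if not entry.has_adapter:
        return "no_adapter"
    if zdr and not entry.zdr_verified:
        return "zdr_not_verified"
    if org_requires_zdr and not entry.zdr_verified:
        return "zdr_org_required"
    if needs_structured_outputs and not entry.structured_outputs:
        return "structured_outputs_not_supported"
    return None


def preflight(candidates: List[str], catalog: Dict[str, CatalogEntry], *,
              zdr: bool = False, org_requires_zdr: bool = False,
              needs_structured_outputs: bool = False) -> Tuple[List[str], List[dict]]:
    """Split candidates into those to attempt and those skipped, preserving order."""
    usable: List[str] = []
    skipped: List[dict] = []
    for model in candidates:
        reason = skip_reason(catalog.get(model), zdr, org_requires_zdr,
                             needs_structured_outputs)
        if reason is None:
            usable.append(model)
        else:
            skipped.append({"model": model, "reason": reason})
    return usable, skipped


# Placeholder catalog, for illustration only.
catalog = {
    "model-a": CatalogEntry(zdr_verified=True, structured_outputs=True),
    "model-b": CatalogEntry(zdr_verified=False, structured_outputs=True),
    "model-c": CatalogEntry(zdr_verified=True, structured_outputs=False),
}
usable, skipped = preflight(["model-a", "model-b", "model-c", "model-d"], catalog,
                            zdr=True, needs_structured_outputs=True)
print(usable)
print(skipped)
```

A skipped candidate never counts as a failed attempt in this sketch; it is simply left out of the list that is actually tried.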
Successful response
When all fail
Pricing
Charges go against geekhub.final_model. Failed attempts do not generate token charges to the user but appear in /dashboard/usage with statusCode ≠ 200.
Streaming
With stream: true, fallback only works if the failure occurs before the first chunk. Once your client starts receiving tokens, switching models isn’t possible; errors are emitted as SSE events and the stream aborts.
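The "before the first chunk" rule can be modeled with a small generator. This is a simplified simulation of the behavior described above, not the service's implementation: ProviderError, stream_with_fallback, and the open_stream callback are hypothetical names, and real SSE handling is left out.

```python
from typing import Callable, Iterable, Iterator, List, Optional


class ProviderError(Exception):
    """Hypothetical stand-in for a provider-class failure."""


def stream_with_fallback(candidates: List[str],
                         open_stream: Callable[[str], Iterable[str]]) -> Iterator[str]:
    """Yield chunks from the first candidate that starts streaming successfully."""
    if not candidates:
        raise ValueError("at least one candidate is required")
    last_error: Optional[ProviderError] = None
    for model in candidates:
        started = False
        try:
            for chunk in open_stream(model):
                started = True
                yield chunk
            return  # stream finished normally
        except ProviderError as exc:
            if started:
                # Tokens already reached the client: no switching, the stream aborts.
                raise
            last_error = exc  # failed before the first chunk: try the next candidate
    assert last_error is not None
    raise last_error


def fake_stream(model: str) -> Iterator[str]:
    """Toy provider: 'down' fails at once, 'flaky' fails mid-stream, others succeed."""
    if model == "down":
        raise ProviderError("503 before first chunk")
    yield "Hello"
    if model == "flaky":
        raise ProviderError("connection reset mid-stream")
    yield " world"


print(list(stream_with_fallback(["down", "healthy"], fake_stream)))
```

In the usage line, the first candidate fails before producing anything, so the second one serves the whole response. With ["flaky", "healthy"] the caller would receive "Hello" and then a ProviderError, because the failure happens after streaming has begun.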