Upstream provider errors & failover
How ToRouter retries across channels when an upstream provider has issues — and what you'll see when it can't help.
ToRouter sits in front of multiple upstream accounts per provider. When one channel fails transiently, the gateway tries the next eligible channel automatically. You only see a 5xx when every candidate has failed.
What triggers automatic failover
The gateway retries with a different channel when the upstream returns:
- `429`: rate-limited at the upstream
- `401`/`402`/`403`: upstream key revoked, banned, or out of credit
- `5xx`: upstream server error or overload
- Transient network errors before any response bytes arrive
What is not retried (returned to you immediately):
- `400`: your request is malformed; another channel will reject it the same way
- `404`: model not found at upstream; same story
- Anything after the first byte of a streaming response has already reached your client
For streaming requests, failover only works before the first byte is written to your client. If a stream is cut mid-response by the upstream, you'll see a truncated stream rather than a retry — there's no safe way to splice two providers' partial outputs.
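The rules above can be read as a small decision function. The following is a conceptual sketch only, not ToRouter's actual implementation: `should_failover`, `route`, and the `call` callback are hypothetical names, and the 503 returned when every channel fails is a stand-in for the "usually 502 or 503" response.

```python
from typing import Callable, Iterable, Optional, Tuple

# Statuses this page lists as triggering failover (5xx is handled by range).
FAILOVER_STATUSES = {401, 402, 403, 429}


def should_failover(status: Optional[int], bytes_sent: int) -> bool:
    """Return True if another channel should be tried.

    status is None for a transient network error with no response.
    bytes_sent counts response bytes already written to the client.
    """
    if bytes_sent > 0:
        return False  # the stream has started; two outputs cannot be spliced
    if status is None:
        return True  # network error before any response bytes arrived
    return status in FAILOVER_STATUSES or 500 <= status <= 599


def route(
    channels: Iterable[str],
    call: Callable[[str], Tuple[Optional[int], int]],
) -> Tuple[Optional[str], Optional[int]]:
    """Try channels in order; return the first outcome that is not retried.

    call(channel) is a hypothetical hook returning (status, bytes_sent).
    """
    for channel in channels:
        status, bytes_sent = call(channel)
        if not should_failover(status, bytes_sent):
            return channel, status
    return None, 503  # assumed stand-in: every candidate failed
```

Note that a `400` or `404` falls through `should_failover` as `False`, so it is returned to the caller from the first channel that produced it.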
What you see when failover gives up

    {
      "error": {
        "type": "upstream_unavailable",
        "message": "Service temporarily unavailable"
      }
    }

This is usually 502 or 503. It means every channel in your group's rotation either errored out or was excluded (rate-limited, cooling down after recent failures, etc.).
What to do:
- Retry with backoff — transient upstream issues clear in seconds to minutes.
- Check the Usage dashboard or Usage details for failed-request ratios and status codes. If a provider has a broad outage (e.g. an OpenAI region), their status page and news usually reflect it too.
- Try a different model: if `gpt-5` is down everywhere, `claude-opus-4-7` or a Qwen model in the same group may still work.
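The first suggestion, retrying with backoff, might look like the sketch below on the client side. It assumes you wrap your HTTP call in a `send` function that returns the status code and the parsed JSON body; `send`, `call_with_backoff`, and the default delays are illustrative choices, not part of any ToRouter SDK.

```python
import random
import time
from typing import Any, Callable, Dict, Tuple

Response = Tuple[int, Dict[str, Any]]  # (HTTP status, parsed JSON body)


def is_upstream_unavailable(status: int, body: Dict[str, Any]) -> bool:
    """True for the 'failover gave up' error shown on this page."""
    error = body.get("error")
    error_type = error.get("type") if isinstance(error, dict) else None
    return status >= 500 and error_type == "upstream_unavailable"


def call_with_backoff(
    send: Callable[[], Response],
    max_attempts: int = 5,
    base_delay: float = 1.0,   # assumed starting delay in seconds
    max_delay: float = 30.0,   # assumed cap in seconds
    sleep: Callable[[float], None] = time.sleep,
) -> Response:
    """Call send(), retrying only on upstream_unavailable responses."""
    status, body = send()
    for attempt in range(1, max_attempts):
        if not is_upstream_unavailable(status, body):
            break  # success, or an error that retrying will not fix
        # Exponential backoff with full jitter: wait up to 1s, 2s, 4s, ...
        delay = min(max_delay, base_delay * 2 ** (attempt - 1))
        sleep(random.uniform(0, delay))
        status, body = send()
    return status, body
```

The jitter spreads retries from many clients over time, so they do not all hit the gateway at the same moment once an upstream recovers. The `sleep` parameter exists so the function can be tested without waiting.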
Definitive upstream errors (no failover)
Some upstream errors are passed straight through because retrying wouldn't help:
| Upstream returns | You see | Why |
|---|---|---|
| `400 invalid_request_error` | 400 with the original message | Your request body is wrong |
| `404 model_not_found` | 404 | Model slug isn't valid at any upstream |
| `413 payload_too_large` | 413 | Prompt or attachment too big |
| `415 unsupported_media_type` | 415 | Image/audio format not supported |
These errors mean you should change the request, not retry it.
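A client can encode the table as a lookup so that definitive errors are surfaced to the caller instead of being retried. This is a minimal sketch; `DEFINITIVE_ERRORS` and `next_step` are hypothetical names, and the suggested actions simply paraphrase the "Why" column.

```python
# Status codes from the table above, mapped to what needs to change.
DEFINITIVE_ERRORS = {
    400: "fix the request body",
    404: "use a model slug that exists upstream",
    413: "shrink the prompt or attachment",
    415: "use a supported image/audio format",
}


def next_step(status: int) -> str:
    """Suggest an action for a response status returned by the gateway."""
    if 200 <= status < 300:
        return "ok"
    if status in DEFINITIVE_ERRORS:
        return "change request: " + DEFINITIVE_ERRORS[status]
    if status in (502, 503):
        return "retry with backoff"
    return "no suggestion"
```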
Diagnosing intermittent 5xx
If you see occasional 502/503 even when providers look healthy, a specific upstream account may be degraded or rate-limited. Use Usage details to compare model, channel, and error fields.
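If you have request records on hand, for example copied out of Usage details, a per-channel failure ratio makes a degraded account easier to spot. The sketch below assumes each record is a dict with `model`, `channel`, and `status` keys; those field names are an assumption about your data, not a documented export format.

```python
from collections import defaultdict
from typing import Any, Dict, Iterable, Mapping, Tuple


def failure_ratio_by_channel(
    records: Iterable[Mapping[str, Any]],
) -> Dict[Tuple[str, str], float]:
    """Return the share of 5xx responses per (model, channel) pair."""
    totals: Dict[Tuple[str, str], int] = defaultdict(int)
    failures: Dict[Tuple[str, str], int] = defaultdict(int)
    for record in records:
        key = (record["model"], record["channel"])
        totals[key] += 1
        if int(record["status"]) >= 500:
            failures[key] += 1
    return {key: failures[key] / totals[key] for key in totals}
```

A channel whose ratio is well above the others for the same model is a likely candidate for the degraded or rate-limited account.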