Fallbacks
Specify model or provider fallbacks with your Universal endpoint to handle request failures and ensure reliability.
Fallbacks are currently triggered only when a request encounters an error. We are working to expand fallback functionality to include time-based triggers, which will allow requests that exceed a predefined response time to time out and fall back.
In the following example, a request first goes to the Workers AI Inference API. If the request fails, it falls back to OpenAI. The response header cf-aig-step indicates which provider successfully processed the request.
- Sends a request to Workers AI Inference API.
- If that request fails, proceeds to OpenAI.
graph TD
  A[AI Gateway] --> B[Request to Workers AI Inference API]
  B -->|Success| C[Return Response]
  B -->|Failure| D[Request to OpenAI API]
  D --> E[Return Response]
You can add as many fallbacks as you need by adding another object to the array.
curl https://gateway.ai.cloudflare.com/v1/{account_id}/{gateway_id} \
  --header 'Content-Type: application/json' \
  --data '[
  {
    "provider": "workers-ai",
    "endpoint": "@cf/meta/llama-3.1-8b-instruct",
    "headers": {
      "Authorization": "Bearer {cloudflare_token}",
      "Content-Type": "application/json"
    },
    "query": {
      "messages": [
        {
          "role": "system",
          "content": "You are a friendly assistant"
        },
        {
          "role": "user",
          "content": "What is Cloudflare?"
        }
      ]
    }
  },
  {
    "provider": "openai",
    "endpoint": "chat/completions",
    "headers": {
      "Authorization": "Bearer {open_ai_token}",
      "Content-Type": "application/json"
    },
    "query": {
      "model": "gpt-4o-mini",
      "stream": true,
      "messages": [
        {
          "role": "user",
          "content": "What is Cloudflare?"
        }
      ]
    }
  }
]'
When using the Universal endpoint with fallbacks, the response header cf-aig-step indicates which model successfully processed the request by returning the step number. This header provides visibility into whether a fallback was triggered and which model ultimately processed the response.
- cf-aig-step:0 – The first (primary) model was used successfully.
- cf-aig-step:1 – The request fell back to the second model.
- cf-aig-step:2 – The request fell back to the third model.
- Subsequent steps – Each fallback increments the step number by 1.
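To check the step from a script, one option is to save the response headers to a file and read the value back. The following is a minimal sketch rather than an official client: the helper names aig_step and describe_step and the file name headers.txt are illustrative assumptions, and it assumes the headers were written with curl's --dump-header option.

```shell
#!/bin/sh
# Hypothetical helper: print the cf-aig-step value found in a file of raw
# HTTP response headers (for example, one written by curl --dump-header).
aig_step() {
  # Match the header name case-insensitively, keep the last match in case
  # the file holds more than one header block, take the text after the
  # first colon, then strip spaces and the trailing carriage return.
  grep -i '^cf-aig-step:' "$1" | tail -n 1 | cut -d ':' -f 2 | tr -d ' \r'
}

# Hypothetical helper: describe what a given step number means.
describe_step() {
  if [ "$1" = "0" ]; then
    echo "primary model handled the request"
  else
    echo "fallback $1 handled the request"
  fi
}

# Example usage (assumes the curl request shown earlier was run with
# --silent --output body.txt --dump-header headers.txt added):
#   describe_step "$(aig_step headers.txt)"
```

Reading the header from a saved file keeps the response body and the step check separate, which can be convenient when the fallback provider streams its response.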