Fallbacks
Specify model or provider fallbacks with your Universal endpoint to handle request failures and ensure reliability.
Fallbacks are currently triggered only when a request encounters an error. We are working to expand fallback functionality to include time-based triggers, which will allow requests that exceed a predefined response time to time out and fall back.
In the following example, a request first goes to the Workers AI Inference API. If the request fails, it falls back to OpenAI. The response header cf-aig-step indicates which provider successfully processed the request.
- Sends a request to Workers AI Inference API.
- If that request fails, proceeds to OpenAI.
graph TD
A[AI Gateway] --> B[Request to Workers AI Inference API]
B -->|Success| C[Return Response]
B -->|Failure| D[Request to OpenAI API]
D --> E[Return Response]
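The flow in the diagram can be modelled as "try each step in order and report the index of the first one that succeeds". The following is a minimal conceptual sketch of that behaviour, not AI Gateway's actual implementation. The function names `run_with_fallbacks`, `failing_workers_ai`, and `working_openai` are hypothetical, and the provider calls are simulated locally rather than sent over the network.

```python
from typing import Callable, Sequence, Tuple


def run_with_fallbacks(steps: Sequence[Callable[[], str]]) -> Tuple[int, str]:
    """Try each step in order and return (step_index, response) for the first success.

    The returned index plays the same role as the cf-aig-step header:
    0 means the primary step succeeded, 1 means the first fallback did, and so on.
    """
    last_error = None
    for index, call_provider in enumerate(steps):
        try:
            return index, call_provider()
        except Exception as error:  # any error triggers the next fallback
            last_error = error
    raise RuntimeError("all providers failed") from last_error


# Simulated providers (hypothetical stand-ins for real API calls).
def failing_workers_ai() -> str:
    raise ConnectionError("simulated Workers AI failure")


def working_openai() -> str:
    return "response from OpenAI"


step, body = run_with_fallbacks([failing_workers_ai, working_openai])
print(step, body)  # 1 response from OpenAI
```

Because the primary step raises an error, the second step handles the request and the reported index is 1.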
You can add as many fallbacks as you need by appending another object to the array.
curl https://gateway.ai.cloudflare.com/v1/{account_id}/{gateway_id} \
  --header 'Content-Type: application/json' \
  --data '[
  {
    "provider": "workers-ai",
    "endpoint": "@cf/meta/llama-3.1-8b-instruct",
    "headers": {
      "Authorization": "Bearer {cloudflare_token}",
      "Content-Type": "application/json"
    },
    "query": {
      "messages": [
        {
          "role": "system",
          "content": "You are a friendly assistant"
        },
        {
          "role": "user",
          "content": "What is Cloudflare?"
        }
      ]
    }
  },
  {
    "provider": "openai",
    "endpoint": "chat/completions",
    "headers": {
      "Authorization": "Bearer {open_ai_token}",
      "Content-Type": "application/json"
    },
    "query": {
      "model": "gpt-4o-mini",
      "stream": true,
      "messages": [
        {
          "role": "user",
          "content": "What is Cloudflare?"
        }
      ]
    }
  }
]'

When using the Universal endpoint with fallbacks, the response header cf-aig-step indicates which model successfully processed the request by returning the step number. This header provides visibility into whether a fallback was triggered and which model ultimately processed the response.
- cf-aig-step:0 – The first (primary) model was used successfully.
- cf-aig-step:1 – The request fell back to the second model.
- cf-aig-step:2 – The request fell back to the third model.
- Subsequent steps – Each fallback increments the step number by 1.
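As a rough illustration, the same request can be built and the cf-aig-step header interpreted from Python using only the standard library. This is a sketch under stated assumptions: the helper names `build_steps`, `describe_step`, and `send` are hypothetical, the placeholders in `GATEWAY_URL` must be replaced with real values, and the `stream` flag from the curl example is omitted so the response body arrives as a single document.

```python
import json
import urllib.request

# Placeholder URL from the curl example; substitute your own account and gateway IDs.
GATEWAY_URL = "https://gateway.ai.cloudflare.com/v1/{account_id}/{gateway_id}"


def build_steps(cloudflare_token: str, openai_token: str, question: str) -> list:
    """Build the fallback array: index 0 is the primary step, index 1 the first fallback."""
    messages = [{"role": "user", "content": question}]
    return [
        {
            "provider": "workers-ai",
            "endpoint": "@cf/meta/llama-3.1-8b-instruct",
            "headers": {
                "Authorization": f"Bearer {cloudflare_token}",
                "Content-Type": "application/json",
            },
            "query": {"messages": messages},
        },
        {
            "provider": "openai",
            "endpoint": "chat/completions",
            "headers": {
                "Authorization": f"Bearer {openai_token}",
                "Content-Type": "application/json",
            },
            "query": {"model": "gpt-4o-mini", "messages": messages},
        },
    ]


def describe_step(header_value) -> str:
    """Translate a cf-aig-step header value into a human-readable message."""
    if header_value is None:
        return "cf-aig-step header missing"
    step = int(header_value)
    if step == 0:
        return "primary step 0 handled the request"
    return f"fallback step {step} handled the request"


def send(url: str, steps: list):
    """POST the fallback array and return (cf-aig-step value, raw response body)."""
    request = urllib.request.Request(
        url,
        data=json.dumps(steps).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return response.headers.get("cf-aig-step"), response.read()


steps = build_steps("{cloudflare_token}", "{open_ai_token}", "What is Cloudflare?")
print(describe_step("1"))
```

With real credentials, calling `send(GATEWAY_URL, steps)` and passing the first returned value to `describe_step` would show whether a fallback was triggered. To add a third step, append another object to the list returned by `build_steps`.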