Rate Limiting

The Rate Limiting API lets you define rate limits and write code around them in your Worker.

You can use it to enforce:

Rate limits that are applied after your Worker starts, only once a specific part of your code is reached
Different rate limits for different types of customers or users (ex: free vs. paid)
Resource-specific or path-specific limits (ex: limit per API route)
Any combination of the above

The Rate Limiting API is backed by the same infrastructure that serves the Rate limiting rules that are built into the Cloudflare Web Application Firewall (WAF).

Get started

First, add a binding to your Worker that gives it access to the Rate Limiting API:

wrangler.json
wrangler.toml

{
  "main": "src/index.js",
  "unsafe": {
    "bindings": [
      {
        "name": "MY_RATE_LIMITER",
        "type": "ratelimit",
        "namespace_id": "1001",
        "simple": {
          "limit": 100,
          "period": 60
        }
      }
    ]
  }
}

main = "src/index.js"

# The rate limiting API is in open beta.
[[unsafe.bindings]]
name = "MY_RATE_LIMITER"
type = "ratelimit"
# An identifier you define, that is unique to your Cloudflare account.
# Must be an integer.
namespace_id = "1001"

# Limit: the number of tokens allowed within a given period in a single
# Cloudflare location
# Period: the duration of the period, in seconds. Must be either 10 or 60
simple = { limit = 100, period = 60 }

This binding makes the MY_RATE_LIMITER binding available, which provides a limit() method:

JavaScript
TypeScript

export default {
  async fetch(request, env) {
    const { pathname } = new URL(request.url)

    const { success } = await env.MY_RATE_LIMITER.limit({ key: pathname }) // key can be any string of your choosing
    if (!success) {
      return new Response(`429 Failure – rate limit exceeded for ${pathname}`, { status: 429 })
    }

    return new Response(`Success!`)
  }
}

interface Env {
  MY_RATE_LIMITER: any;
}

export default {
  async fetch(request, env): Promise<Response> {
    const { pathname } = new URL(request.url)

    const { success } = await env.MY_RATE_LIMITER.limit({ key: pathname }) // key can be any string of your choosing
    if (!success) {
      return new Response(`429 Failure – rate limit exceeded for ${pathname}`, { status: 429 })
    }

    return new Response(`Success!`)
  }
} satisfies ExportedHandler<Env>;

The limit() API accepts a single argument — a configuration object with the key field.

The key you provide can be any string value.
A common pattern is to define your key by combining a string that uniquely identifies the actor initiating the request (ex: a user ID or customer ID) and a string that identifies a specific resource (ex: a particular API route).

You can define and configure multiple rate limiting configurations per Worker, which allows you to define different limits against incoming request and/or user parameters as needed to protect your application or upstream APIs.

For example, here is how you can define two rate limiting configurations for free and paid tier users:

wrangler.json
wrangler.toml

{
  "main": "src/index.js",
  "unsafe": {
    "bindings": [
      {
        "name": "FREE_USER_RATE_LIMITER",
        "type": "ratelimit",
        "namespace_id": "1001",
        "simple": {
          "limit": 100,
          "period": 60
        }
      },
      {
        "name": "PAID_USER_RATE_LIMITER",
        "type": "ratelimit",
        "namespace_id": "1002",
        "simple": {
          "limit": 1000,
          "period": 60
        }
      }
    ]
  }
}

main = "src/index.js"

# Free user rate limiting
[[unsafe.bindings]]
name = "FREE_USER_RATE_LIMITER"
type = "ratelimit"
namespace_id = "1001"
simple = { limit = 100, period = 60 }

# Paid user rate limiting
[[unsafe.bindings]]
name = "PAID_USER_RATE_LIMITER"
type = "ratelimit"
namespace_id = "1002"
simple = { limit = 1000, period = 60 }

Configuration

A rate limiting binding has three settings:

namespace_id (number) - a positive integer that uniquely defines this rate limiting configuration - e.g. namespace_id = "999".
limit (number) - the limit (number of requests, number of API calls) to be applied. This is incremented when you call the limit() function in your Worker.
period (seconds) - must be 10 or 60. The period to measure increments to the limit over, in seconds.

For example, to apply a rate limit of 1500 requests per minute, you would define a rate limiting configuration as follows:

wrangler.json
wrangler.toml

{
  "unsafe": {
    "bindings": [
      {
        "name": "MY_RATE_LIMITER",
        "type": "ratelimit",
        "namespace_id": "1001",
        "simple": {
          "limit": 1500,
          "period": 60
        }
      }
    ]
  }
}

[[unsafe.bindings]]
name = "MY_RATE_LIMITER"
type = "ratelimit"
namespace_id = "1001"

# 1500 requests - calls to limit() increment this
simple = { limit = 1500, period = 60 }

Best practices

The key passed to the limit function, that determines what to rate limit on, should represent a unique characteristic of a user or class of user that you wish to rate limit.

Good choices include API keys in Authorization HTTP headers, URL paths or routes, specific query parameters used by your application, and/or user IDs and tenant IDs. These are all stable identifiers and are unlikely to change from request-to-request.
It is not recommended to use IP addresses or locations (regions or countries), since these can be shared by many users in many valid cases. You may find yourself unintentionally rate limiting a wider group of users than you intended by rate limiting on these keys.

// Recommended: use a key that represents a specific user or class of user
const url = new URL(req.url)
const userId = url.searchParams.get("userId") || ""
const { success } = await env.MY_RATE_LIMITER.limit({ key: userId })

// Not recommended:  many users may share a single IP, especially on mobile networks
// or when using privacy-enabling proxies
const ipAddress = req.headers.get("cf-connecting-ip") || ""
const { success } = await env.MY_RATE_LIMITER.limit({ key: ipAddress })

Locality

Rate limits that you define and enforce in your Worker are local to the Cloudflare location ↗ that your Worker runs in.

For example, if a request comes in from Sydney, Australia, to the Worker shown above, after 100 requests in a 60 second window, any further requests for a particular path would be rejected, and a 429 HTTP status code returned. But this would only apply to requests served in Sydney. For each unique key you pass to your rate limiting binding, there is a unique limit per Cloudflare location.

Performance

The Rate Limiting API in Workers is designed to be fast.

The underlying counters are cached on the same machine that your Worker runs in, and updated asynchronously in the background by communicating with a backing store that is within the same Cloudflare location.

This means that while in your code you await a call to the limit() method:

const { success } = await env.MY_RATE_LIMITER.limit({ key: customerId })

You are not waiting on a network request. You can use the Rate Limiting API without introducing any meaningful latency to your Worker.

Accuracy

The above also means that the Rate Limiting API is permissive, eventually consistent, and intentionally designed to not be used as an accurate accounting system.

For example, if many requests come in to your Worker in a single Cloudflare location, all rate limited on the same key, the isolate that serves each request will check against its locally cached value of the rate limit. Very quickly, but not immediately, these requests will count towards the rate limit within that Cloudflare location.

Examples

@elithrar/workers-hono-rate-limit ↗ — Middleware that lets you easily add rate limits to routes in your Hono ↗ application.
@hono-rate-limiter/cloudflare ↗ — Middleware that lets you easily add rate limits to routes in your Hono ↗ application, with multiple data stores to choose from.