Cloudflare Rate Limiting: Production Patterns for APIs

How to configure Cloudflare Rate Limiting for production: per-endpoint thresholds, advanced rules, login and OTP protection, webhook IP exemptions.

Engineering Team
9 min read

Production Cloudflare Rate Limiting needs three things to work without breaking real users: per-endpoint thresholds (not a single global rule), a Skip rule above everything for payment provider webhook IPs, and one to two weeks in log mode before flipping to block. The default global “200 requests per minute per IP” suggestion every guide repeats catches almost no real abuse and breaks legitimate burst traffic. Endpoint-specific rules tuned against your real traffic are the only configuration that holds up under attack.

This is the deep guide for tuning Cloudflare Rate Limiting on a production site with a login, an OTP, a payment flow, and an API. Every threshold below is a starting point grounded in real deployments; tune them against your own log data over the first month.

What is the difference between Cloudflare Rate Limiting tiers?

Cloudflare ships rate limiting in three tiers with different capabilities:

  • Free and Pro plans: previous-generation Rate Limiting rules, IP-based only, limited to the dashboard UI for configuration.
  • Business plan: advanced Rate Limiting rules with full expression language support, custom characteristics (cookies, headers, query parameters), and longer time windows.
  • Enterprise plan: same advanced rules plus higher rule limits, longer time windows up to 1 hour, and cf.bot_management.score integration.

The capability difference matters. On Pro you can rate-limit by IP. On Business you can rate-limit by (IP, username) or (IP, session cookie), which is the only way to stop credential stuffing properly. For any serious consumer or B2B platform, Business plan or higher is the minimum. Plan entitlements for counting characteristics and time windows have changed over time, so check the availability table in the Cloudflare advanced rate limiting docs, which also cover the expression syntax.

How does Cloudflare Rate Limiting actually work?

A rate limit rule has four parts: a match expression (which requests count), a characteristic (what to count by, e.g. IP or IP+cookie), a threshold (requests per time window), and an action (block, managed challenge, JS challenge, log).

Cloudflare evaluates the match expression on every incoming request. If it matches, the request increments a counter scoped to the configured characteristic. When the counter exceeds the threshold within the time window, the action fires for that characteristic value and stays in force for the rule's mitigation timeout, which is configured separately from the counting window.
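To make the moving parts concrete, here is a toy model of a single rule in Python. It is a sketch for reasoning about behaviour, not Cloudflare's implementation: the class name, the field names, the fixed-window counting, and the simplified path-prefix match are all assumptions made for illustration. The `timeout_s` field stands in for the mitigation timeout, the period the action stays active once the rule trips.

```python
from dataclasses import dataclass, field


@dataclass
class RateLimitRule:
    """Toy model of one rule: match, characteristic, threshold, action."""
    path_prefix: str      # match expression, simplified to a path prefix
    characteristic: str   # request field to count by, e.g. "ip"
    threshold: int        # requests allowed per window
    window_s: int         # counting window in seconds
    timeout_s: int        # how long the action stays active once tripped
    action: str = "block"
    _counts: dict = field(default_factory=dict)           # key -> (window_start, count)
    _mitigated_until: dict = field(default_factory=dict)  # key -> timestamp

    def evaluate(self, request: dict, now: float) -> str:
        # Requests that do not match the expression are never counted.
        if not request["path"].startswith(self.path_prefix):
            return "allow"
        key = request[self.characteristic]
        # While the timeout is running, the action applies without counting.
        if now < self._mitigated_until.get(key, 0.0):
            return self.action
        start, count = self._counts.get(key, (now, 0))
        if now - start >= self.window_s:
            start, count = now, 0  # window expired: start a fresh one
        count += 1
        self._counts[key] = (start, count)
        if count > self.threshold:
            self._mitigated_until[key] = now + self.timeout_s
            return self.action
        return "allow"


# Hypothetical login rule: 5 requests per IP per 10 minutes.
rule = RateLimitRule("/api/login", "ip", threshold=5, window_s=600, timeout_s=600)
req = {"path": "/api/login", "ip": "198.51.100.7"}
results = [rule.evaluate(req, now=float(t)) for t in range(7)]
```

The first five requests are allowed, and the sixth and seventh receive the action. A different IP, or a request to a path the rule does not match, is unaffected because it never touches this counter.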

The key fact: each rule keeps its own counters, one per value of its characteristic. Two rules on /api/login, one counting by IP and one by username, update two independent counters on every request. A single IP testing many usernames will trip the IP counter; a single username attacked from many IPs will trip the username counter. You usually want both.
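A small simulation shows why both counters earn their place. This is an illustrative sketch only: the addresses come from reserved documentation ranges, the usernames are made up, and window handling is left out so the counting stays visible.

```python
from collections import Counter

ip_counter = Counter()
username_counter = Counter()


def record_failed_login(ip: str, username: str) -> None:
    """Each failed login increments both counters, one per characteristic."""
    ip_counter[ip] += 1
    username_counter[username] += 1


# Pattern A: one IP walks through many usernames (credential stuffing).
for n in range(20):
    record_failed_login("203.0.113.9", f"user{n}")

# Pattern B: many IPs hammer a single username (distributed guessing).
for n in range(20):
    record_failed_login(f"198.51.100.{n}", "alice")

IP_LIMIT = 5         # failed attempts per IP
USERNAME_LIMIT = 10  # failed attempts per username

tripped_ips = {ip for ip, c in ip_counter.items() if c > IP_LIMIT}
tripped_users = {u for u, c in username_counter.items() if c > USERNAME_LIMIT}
```

Pattern A trips only the IP counter (`tripped_ips` contains just `203.0.113.9`), and pattern B trips only the username counter (`tripped_users` contains just `alice`). Drop either counter and one of the two attacks goes unnoticed.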

What rate limits should I set on login and authentication endpoints?

The standard production budget for consumer auth flows:

  • Login by IP: 5 failed attempts per IP per 10 minutes. Action: managed challenge.
  • Login by username: 10 failed attempts per username per hour. Action: block, then unlock by support flow.
  • Password reset by account: 3 per account per hour. Action: block.
  • Password reset by IP: 10 per IP per hour. Action: managed challenge.
  • Signup by IP: 3 per IP per hour. Action: managed challenge. Tighter if fraud signups are a real problem.
  • OTP request by phone number: 3 per phone per 5 minutes, 10 per phone per hour. Action: block. This is the rule that protects your SMS bill.
  • OTP request by IP: 5 per IP per 5 minutes. Action: managed challenge.
  • OTP verify by phone: 5 attempts per phone per OTP cycle. Action: block, force resend.
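The per-phone OTP budget stacks two windows on the same key, and it helps to see how they interact. The sketch below is an application-side illustration in Python using a sliding log of timestamps; the function and variable names are hypothetical, and it is not how Cloudflare counts internally.

```python
from collections import defaultdict, deque

# (max requests, window in seconds), taken from the per-phone OTP budget above
OTP_PHONE_LIMITS = [(3, 5 * 60), (10, 60 * 60)]

_sent = defaultdict(deque)  # phone number -> timestamps of accepted OTP sends


def allow_otp_request(phone: str, now: float) -> bool:
    """Return True if an OTP may be sent to this phone at time `now`."""
    history = _sent[phone]
    longest = max(window for _, window in OTP_PHONE_LIMITS)
    # Drop timestamps older than the longest window; they no longer count.
    while history and now - history[0] >= longest:
        history.popleft()
    for limit, window in OTP_PHONE_LIMITS:
        recent = sum(1 for t in history if now - t < window)
        if recent >= limit:
            return False  # blocked: do not record, do not send
    history.append(now)
    return True


# Burst: four requests in 30 seconds, then one more after the 5-minute window.
burst = [allow_otp_request("+15550100", t) for t in (0, 10, 20, 30, 400)]

# Slow drip: one request every 100 seconds stays under 3 per 5 minutes,
# but the hourly cap stops the eleventh.
drip = [allow_otp_request("+15550101", 100.0 * i) for i in range(11)]
```

The short window stops the burst at the fourth request, while the hourly cap catches the patient attacker who paces requests to stay under the short window.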

These thresholds assume a B2C platform with regular consumer traffic patterns. B2B platforms with bursty corporate logins (Monday morning) need looser thresholds. Internal admin tools with five users can run much tighter limits.

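Add a rule with IP+username as the characteristic alongside the plain IP rule when you can. The combined counter limits repeated guesses against one account from one address without penalising other users behind the same NAT. The IP-only rule still catches credential stuffing (one IP, many usernames), which a combined counter alone would miss because every new username starts a fresh counter.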

How do I exempt payment webhook IPs from rate limits?

Always with a Skip rule placed above every rate limit, scoped to the webhook path and the provider’s published IP ranges.

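(http.request.uri.path eq "/api/webhooks/payment") and (ip.src in $payment_provider_cidrs)

Here $payment_provider_cidrs refers to a custom IP list holding the provider's ranges; an inline set such as {203.0.113.0/24 198.51.100.0/24} also works for short lists. Action: Skip remaining rate-limit rules. Position: top of the rate limit rule order.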
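If you want to unit-test the exemption logic before touching the dashboard, the Skip expression can be mirrored in a few lines of Python. This is a sketch under stated assumptions: the CIDRs below come from reserved documentation ranges and are placeholders for whatever your provider publishes, and the function name is hypothetical.

```python
import ipaddress

# Hypothetical ranges from the reserved documentation blocks; substitute the
# CIDRs your payment provider actually publishes.
PAYMENT_PROVIDER_CIDRS = [
    ipaddress.ip_network("203.0.113.0/24"),
    ipaddress.ip_network("198.51.100.0/24"),
]
WEBHOOK_PATH = "/api/webhooks/payment"


def skips_rate_limits(path: str, src_ip: str) -> bool:
    """Mirror of the Skip expression: exact path AND source IP in the list."""
    addr = ipaddress.ip_address(src_ip)
    return path == WEBHOOK_PATH and any(addr in net for net in PAYMENT_PROVIDER_CIDRS)
```

Both conditions must hold. A provider IP calling /api/login still counts against the login limits, and an unknown IP calling the webhook path gets no exemption.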

Payment providers (Stripe, Checkout.com, HyperPay, Tap, Mada, Adyen) call your webhook endpoints back at unpredictable rates, especially during settlement windows. A generic IP-based rate limit will eventually drop a settlement callback, and you will then have a reconciliation problem with real money. The Skip rule prevents this.

Audit the provider IP ranges quarterly. They do change. Most providers publish their CIDR list in documentation; some require subscribing to a webhook for IP updates.
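The quarterly audit comes down to a set difference between what the Skip rule exempts and what the provider currently publishes. A minimal sketch, assuming you have both lists as CIDR strings (how you fetch the published list depends on the provider and is not shown); the function name is hypothetical:

```python
import ipaddress


def diff_cidrs(configured: list[str], published: list[str]) -> dict[str, list[str]]:
    """Compare the CIDRs in the Skip rule with the provider's current list."""
    have = {ipaddress.ip_network(c) for c in configured}
    want = {ipaddress.ip_network(c) for c in published}
    return {
        "add": sorted(str(n) for n in want - have),     # published, not yet exempted
        "remove": sorted(str(n) for n in have - want),  # exempted, no longer published
    }


# Placeholder ranges from the reserved documentation blocks.
changes = diff_cidrs(
    configured=["203.0.113.0/24", "198.51.100.0/24"],
    published=["203.0.113.0/24", "192.0.2.0/24"],
)
```

Entries under "add" are ranges whose callbacks can currently be rate-limited. Entries under "remove" are stale exemptions that widen the Skip rule for no benefit.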

Production threshold library

Per-endpoint thresholds for a typical consumer-and-B2B platform with bookings, payments, and an API. All are starting points.

Authentication and account

  • Login: 5/IP/10min, 10/username/hour
  • Password reset: 3/account/hour, 10/IP/hour
  • Signup: 3/IP/hour
  • OTP request: 3/phone/5min, 10/phone/hour, 5/IP/5min
  • OTP verify: 5/phone/cycle

Booking and reservation

  • Booking create: 20/session/hour
  • Booking modify: 10/session/hour
  • Booking cancel: 5/session/hour
  • Booking search: 60/session/min (loose, this is browsing)

Payment

  • Payment create: 10/session/10min
  • Payment confirm: 5/session/10min
  • Card add: 5/account/hour
  • 3DS challenge: 10/session/hour
  • Webhook: SKIP (provider IPs only)

Coupon and promo

  • Coupon apply: 10/session/hour
  • Gift card lookup: 5/IP/min, 30/IP/hour
  • Promo code validate: 10/session/hour

API

  • Public read endpoints: 120/IP/min
  • Authenticated read endpoints: 600/token/min
  • Write endpoints: 30/token/min
  • Bulk endpoints: 10/token/min
  • Search API: 60/session/min

Admin and ops

  • Admin login: 3/IP/10min (very tight)
  • Admin actions: 100/token/min (looser, authenticated)

Tune all of these against your first month of log data. The right thresholds for your platform sit at the 99th percentile of your real users’ behaviour, rounded up.
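The library uses a compact count/characteristic/window shorthand. If you keep the thresholds in version control, a small parser turns that shorthand into structured values you can review and diff. This is a sketch with hypothetical names; it handles only the "min" and "hour" windows used above and rejects anything else, such as "cycle".

```python
import re
from typing import NamedTuple

_UNITS = {"min": 60, "hour": 3600}


class Limit(NamedTuple):
    requests: int
    characteristic: str
    window_s: int


def parse_limit(spec: str) -> Limit:
    """Parse shorthand such as '5/IP/10min' or '10/username/hour'."""
    count, characteristic, window = spec.split("/")
    m = re.fullmatch(r"(\d*)(min|hour)", window)
    if m is None:
        raise ValueError(f"unsupported window: {window!r}")
    multiplier = int(m.group(1) or 1)  # a bare 'hour' means one hour
    return Limit(int(count), characteristic.lower(), multiplier * _UNITS[m.group(2)])


login_by_ip = parse_limit("5/IP/10min")
login_by_username = parse_limit("10/username/hour")
```

Here `login_by_ip` is 5 requests per "ip" per 600 seconds and `login_by_username` is 10 requests per "username" per 3600 seconds.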

What action should I use: block, challenge, or log?

The same decision matrix as custom rules:

  • Block when there is no legitimate burst pattern (OTP request by phone, login by username). Hard limits.
  • Managed challenge when bursts can be legitimate but are usually not (login by IP, signup by IP). Real users solve a challenge.
  • JavaScript challenge rarely. Worse UX than managed challenge.
  • Log for every new rule, for at least one week, before promoting to block or challenge.

The decision matters most for endpoints where conversion drops on a challenge. Checkout pages, signup flows, and OTP verify pages are conversion-sensitive. Use managed challenge there, not block. Block on the underlying abuse signal (per-phone, per-account) where conversion is not at risk.

Rate Limiting plus WAF custom rules: how they combine

Rate Limiting catches volumetric abuse. WAF custom rules catch pattern-based abuse. They overlap deliberately.

Example: credential stuffing. A WAF custom rule blocks login attempts with no cookie (catches naive bots). Rate Limiting catches the more sophisticated bots that solved the cookie problem but still send 100 attempts from one IP. Together they cover both ends of the sophistication spectrum.

For the full WAF custom-rule library that pairs with these rate limits, see our Cloudflare WAF custom rules cookbook. For the deployment order across managed rules, custom rules, rate limits, and bot management, see our Cloudflare WAF setup guide for booking and payment platforms.

How long should I run in log mode?

Standard: one to two weeks. Less than that and you have not seen the full weekly cycle (Monday morning, payroll day, weekend, marketing email day, month-end statements). Each cycle surfaces different false-positive patterns.

Exception: active attack. If you are taking credential stuffing or payment abuse right now, flip the relevant rule to block immediately and triage false positives reactively from the block log. The two-week period is for calm rollout, not for incident response.

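After the first month, revisit thresholds. Look at the 95th and 99th percentile of legitimate user behaviour, measured per characteristic (per IP, per session, per token) over the rule's window. Set thresholds at the 99th percentile rounded up. This catches abuse while leaving almost every real user untouched; the few who exceed it surface in the logs for triage.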
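Deriving a threshold from log data is a short calculation once you have each user's peak request count for the rule's window. The sketch below uses the nearest-rank percentile. The function names and the choice to round up to a multiple of 5 are assumptions for illustration, and extracting the per-user peaks from your logs is not shown.

```python
import math


def percentile(values: list[int], pct: float) -> int:
    """Nearest-rank percentile: smallest value with at least pct% of the data at or below it."""
    if not values:
        raise ValueError("no data")
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def suggest_threshold(per_user_peaks: list[int], round_to: int = 5) -> int:
    """99th percentile of each user's peak request count, rounded up to a multiple of round_to."""
    p99 = percentile(per_user_peaks, 99)
    return math.ceil(p99 / round_to) * round_to


# Synthetic example: 100 users whose peak counts are 1, 2, ..., 100.
peaks = list(range(1, 101))
p95 = percentile(peaks, 95)
p99 = percentile(peaks, 99)
threshold = suggest_threshold(peaks)
```

On this synthetic data the 95th percentile is 95, the 99th is 99, and the suggested threshold is 100. On real traffic the distribution is usually far more skewed, which is why the gap between the 95th and 99th percentile is worth looking at before you commit to a number.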

How do I handle false positives from rate limiting?

Triage workflow:

  1. Open Security Events, filter by the user’s IP or session
  2. Confirm which rule blocked them and what their counter was at when blocked
  3. Decide: was this real abuse (their account was compromised, automation was running on their behalf) or a real user who legitimately needed that rate?
  4. If real user with legitimate need: raise the threshold for that endpoint, or change the characteristic to be more specific (IP+session instead of just IP)
  5. Do not turn the rule off. Tune it.

The mistake to avoid: do not loosen the global threshold to fix one user’s false positive. That dilutes protection for everyone. Either change the characteristic (more specific counter), raise the threshold by 20% and watch, or add a narrow Skip rule for the legitimate edge case (a partner integration with high volume).

Common mistakes to avoid

Using one global IP-based rate limit for the whole site. Different endpoints have different abuse profiles. A 200/IP/min limit lets credential stuffing through and breaks legitimate booking search bursts. Use per-endpoint rules instead.

Forgetting to scope the OTP rule by phone number. An IP-only OTP limit lets one phone number request thousands of OTPs from rotating IPs. Always include the phone number as part of the characteristic. The SMS bill depends on it.

Rate-limiting payment provider webhooks by accident. A Skip rule only protects the webhook if it is evaluated first; a Block rule positioned above it still fires. Order matters: Skip first, Block after.
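The ordering problem is easy to reproduce with a simplified first-match model. This is a sketch of the principle, not of Cloudflare's rule engine: the rule dictionaries, the helper names, and the precomputed `over_limit` flag are all hypothetical.

```python
def evaluate(rules: list[dict], request: dict) -> str:
    """Walk rules top to bottom; the first matching rule decides the outcome."""
    for rule in rules:
        if rule["matches"](request):
            return rule["action"]
    return "allow"


def is_provider_webhook(r: dict) -> bool:
    return r["path"] == "/api/webhooks/payment" and r["from_provider"]


def is_over_limit(r: dict) -> bool:
    return r["over_limit"]


skip_rule = {"action": "skip", "matches": is_provider_webhook}
block_rule = {"action": "block", "matches": is_over_limit}

# A settlement burst: a genuine provider callback that has exceeded the IP limit.
settlement_burst = {"path": "/api/webhooks/payment", "from_provider": True, "over_limit": True}

skip_first = evaluate([skip_rule, block_rule], settlement_burst)
block_first = evaluate([block_rule, skip_rule], settlement_burst)
```

With the Skip rule on top the callback gets "skip"; with the Block rule on top the same callback gets "block" and the settlement is dropped.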

Picking thresholds from someone else’s blog. Including this one. Use these as starting points, then tune against your own user behaviour data after a month.

Skipping log mode. Run one to two weeks of it for every new rule unless you are under active attack. The cost of a missed false positive at checkout is much higher than the cost of two weeks of looser protection.

Not retaining the logs. Cloudflare Security Events shows the last 24 hours by default. For pattern analysis and incident investigation, push the logs to S3, BigQuery, or your SIEM via Logpush.

For the broader WAF setup that rate limiting sits inside, see our Cloudflare WAF setup guide for booking and payment platforms. For the custom-rule library that pairs with these rate limits, see our Cloudflare WAF custom rules cookbook. For bot management as the third layer of the abuse-defence stack, see our Super Bot Fight Mode tuning guide. For an end-to-end deployment in production, the Saudi mobility platform case study shows the rate-limit thresholds running under real attack traffic.


Need Help Tuning Cloudflare Rate Limiting for Your Platform?

The thresholds above are a starting point. Tuning them against your actual user behaviour, ordering them correctly alongside WAF custom rules, exempting partner integrations and payment webhooks, and handing the configuration to your team in a maintainable state is the actual work.

Tasrie IT Services provides comprehensive Cloudflare managed services to help you:

  • Configure Rate Limiting per endpoint with thresholds derived from your real user behaviour, not vendor defaults
  • Stop credential stuffing, OTP cost abuse, coupon enumeration, and payment API abuse without breaking legitimate burst traffic
  • Coordinate rate limits with WAF custom rules and bot management so the three layers cover overlapping threats without conflicting

We have tuned rate limiting for booking, e-commerce, fintech, and B2B platforms across Saudi Arabia, the UAE, and the UK. For broader cybersecurity strategy, see our cybersecurity services.

Talk to our team about your Cloudflare Rate Limiting setup

Published on June 8, 2026
