May 17, 2026 · 11 min read · Cadence Editorial

Best API rate limiter solutions 2026

The best API rate limiter in 2026 is Upstash Ratelimit for serverless and edge apps (sub-10ms latency, free up to 10k commands/day), Cloudflare Rate Limiting for sites already on Cloudflare (zero-config DDoS shielding), and Kong or a custom Redis token-bucket for self-hosted APIs that need fine-grained control. AWS API Gateway throttling is the right default if you already live in AWS. Skip Nginx limit_req for anything multi-region.

Rate limiting is one of those problems that looks trivial until it isn't. A single endpoint with a setInterval counter works fine until you scale to two pods, then it's broken. By the time you need per-user limits, sliding windows, and burst tolerance across an edge network, you're picking between five categories of tool with very different trade-offs. This is the honest 2026 buyer's guide.

The 30-second answer

Pick by where your API runs:

  • Serverless / edge functions: Upstash Ratelimit (Redis-backed, edge-replicated).
  • Behind Cloudflare: Cloudflare Rate Limiting rules (no code change needed).
  • AWS-native: API Gateway usage plans plus per-key throttling.
  • Self-hosted API platform: Kong or Tyk for full gateway features; a custom Redis token-bucket if you only need rate limiting and nothing else.
  • Single-node Nginx in front of one origin: limit_req works and is free.
  • Vercel app needing per-route limits: Edge Config plus middleware plus Upstash.

Pick the algorithm first, the tool second

Most teams pick the tool before they pick the algorithm, then bend the algorithm to fit the tool. That's backwards. The four algorithms have meaningfully different behaviors under burst load, and the choice cascades into your tooling shortlist.

Token bucket

A bucket holds N tokens. Each request consumes one. Tokens refill at rate R per second. If the bucket is empty, reject.

  • Allows burst: yes (up to bucket size).
  • Memory per key: 2 numbers (count, last refill timestamp).
  • Good for: user-facing APIs where occasional bursts are fine, billing tiers ("100 requests/min, burst 200").
  • Used by: Stripe, AWS API Gateway, most modern API products.
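
To make the refill rule concrete, here is a minimal in-memory sketch of a token bucket. It is not any vendor's implementation: the class and method names are made up for illustration, state lives in a local Map rather than Redis, and the clock is passed in so the behavior is easy to test.

```typescript
// Illustrative in-memory token bucket; class and method names are hypothetical.
interface TokenBucketState {
  tokens: number;       // tokens currently available
  lastRefillMs: number; // when tokens were last topped up
}

class TokenBucketLimiter {
  private readonly capacity: number;        // N: maximum burst size
  private readonly refillPerSecond: number; // R: sustained rate
  private readonly buckets = new Map<string, TokenBucketState>();

  constructor(capacity: number, refillPerSecond: number) {
    this.capacity = capacity;
    this.refillPerSecond = refillPerSecond;
  }

  // Returns true if the request is allowed. The clock is injected so the
  // behavior can be exercised without waiting on real time.
  tryConsume(key: string, nowMs: number): boolean {
    const bucket =
      this.buckets.get(key) ?? { tokens: this.capacity, lastRefillMs: nowMs };
    const elapsedSeconds = Math.max(0, nowMs - bucket.lastRefillMs) / 1000;
    // Lazy refill: add what accrued since the last call, capped at capacity.
    bucket.tokens = Math.min(
      this.capacity,
      bucket.tokens + elapsedSeconds * this.refillPerSecond,
    );
    bucket.lastRefillMs = nowMs;
    const allowed = bucket.tokens >= 1;
    if (allowed) bucket.tokens -= 1;
    this.buckets.set(key, bucket);
    return allowed;
  }
}
```

Note the two numbers per key (tokens, last refill timestamp): that is the whole state, which is why this algorithm ports so easily to a single Redis hash.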

Leaky bucket

Requests enter a queue. The queue drains at fixed rate. If the queue is full, reject (or block).

  • Allows burst: no (smooths traffic to the drain rate).
  • Memory per key: queue or counter plus last leak timestamp.
  • Good for: downstream protection (e.g. you call a third-party API at strict 10 rps).
  • Less common in user-facing limits because the queueing latency annoys users.
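
A rough sketch of the queueing variant, for the downstream-protection case: instead of holding requests in a real queue, it hands each caller the delay it must wait before forwarding, and rejects once too many callers are already waiting. The class name and the `schedule` method are hypothetical, and it assumes a single shared limiter (one third-party API), not per-user keys.

```typescript
// Illustrative leaky bucket used as a scheduler; names are hypothetical.
class LeakyBucketScheduler {
  private readonly intervalMs: number; // time between drained requests
  private readonly maxQueued: number;  // how many requests may wait at once
  private nextFreeMs = 0;              // earliest slot for the next request

  constructor(drainPerSecond: number, maxQueued: number) {
    this.intervalMs = 1000 / drainPerSecond;
    this.maxQueued = maxQueued;
  }

  // Returns how long the caller should wait (in ms) before forwarding the
  // request, or null if the queue is full and the request is rejected.
  schedule(nowMs: number): number | null {
    const slotMs = Math.max(nowMs, this.nextFreeMs);
    const waitMs = slotMs - nowMs;
    // waitMs / intervalMs is the number of requests already ahead in line.
    if (waitMs > this.maxQueued * this.intervalMs) return null;
    this.nextFreeMs = slotMs + this.intervalMs;
    return waitMs;
  }
}
```

With a drain rate of 10 per second and room for 2 waiting requests, four simultaneous calls get delays of 0, 100, and 200 ms, and the fourth is rejected. That growing delay is exactly the queueing latency that makes this a poor fit for user-facing limits.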

Fixed window

Count requests in a wall-clock window (e.g. per minute). Reset to zero at window boundary.

  • Allows burst: yes, and a nasty one. A user can fire 2x the limit by hitting the last second of one window plus the first second of the next.
  • Memory per key: 1 counter plus a window key.
  • Good for: the simplest possible implementation. Use only when the boundary burst doesn't matter (cron jobs, batch ingestion).
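
The boundary burst is easier to believe once you see it. Below is a minimal fixed-window counter (hypothetical names, in-memory Map standing in for Redis) that you can poke at with timestamps on either side of a window edge.

```typescript
// Illustrative fixed-window counter; names are hypothetical.
class FixedWindowLimiter {
  private readonly limit: number;
  private readonly windowMs: number;
  // "key:windowId" -> request count. Old windows are never cleaned up here;
  // with Redis, an EXPIRE on the key would handle that.
  private readonly counts = new Map<string, number>();

  constructor(limit: number, windowMs: number) {
    this.limit = limit;
    this.windowMs = windowMs;
  }

  allow(key: string, nowMs: number): boolean {
    const windowId = Math.floor(nowMs / this.windowMs);
    const bucketKey = `${key}:${windowId}`;
    const count = this.counts.get(bucketKey) ?? 0;
    if (count >= this.limit) return false;
    this.counts.set(bucketKey, count + 1);
    return true;
  }
}
```

With a limit of 5 per minute, five calls at `nowMs = 59_000` and five more at `nowMs = 60_000` are all allowed: ten requests inside one second, because the counter resets at the minute boundary.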

Sliding window (log or counter)

Either store every request timestamp and count timestamps in the last T seconds (sliding log), or interpolate between two adjacent fixed windows (sliding window counter).

  • Allows burst: no, the boundary problem is gone.
  • Memory per key: higher (sliding log), or 2 counters (sliding window counter).
  • Good for: strict per-user quotas where fairness matters. Most billing-grade limits use this.
  • Used by: Upstash Ratelimit's default slidingWindow, GitHub's API.
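
Here is one way the sliding window counter's interpolation can look in code. This is a sketch of the general technique, not Upstash's or GitHub's actual implementation; names are hypothetical and the state is an in-memory Map.

```typescript
// Illustrative sliding window counter; names are hypothetical.
class SlidingWindowCounterLimiter {
  private readonly limit: number;
  private readonly windowMs: number;
  private readonly counts = new Map<string, number>(); // "key:windowId" -> count

  constructor(limit: number, windowMs: number) {
    this.limit = limit;
    this.windowMs = windowMs;
  }

  allow(key: string, nowMs: number): boolean {
    const windowId = Math.floor(nowMs / this.windowMs);
    // How far we are into the current fixed window, from 0 to 1.
    const elapsedFraction = (nowMs % this.windowMs) / this.windowMs;
    const current = this.counts.get(`${key}:${windowId}`) ?? 0;
    const previous = this.counts.get(`${key}:${windowId - 1}`) ?? 0;
    // Weight the previous window by the share of it that still overlaps
    // the trailing window ending now.
    const estimated = previous * (1 - elapsedFraction) + current;
    if (estimated + 1 > this.limit) return false;
    this.counts.set(`${key}:${windowId}`, current + 1);
    return true;
  }
}
```

Run the same boundary test as for fixed window (limit 5 per minute, five calls at 59 s, five at 60 s) and the second batch is rejected, because the previous window still counts at full weight. Halfway through the next window its weight has dropped to 50%, so a couple of requests get through again.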

Match the algorithm to the contract you're enforcing. Then shortlist tools that implement it well.

The 2026 tool comparison

| Tool | Best for | Algorithms | Distributed | Pricing | Setup time |
| --- | --- | --- | --- | --- | --- |
| Upstash Ratelimit | Vercel, Cloudflare Workers, Next.js edge | Fixed, sliding, token bucket | Yes (Redis multi-region) | Free 10k cmds/day; pay-as-you-go $0.20/100k | 5 min |
| Cloudflare Rate Limiting | Anyone behind Cloudflare | Fixed window, sliding window | Yes (every edge POP) | Free 10k rules/mo; $5/mo per 1M requests after | 2 min (dashboard) |
| AWS API Gateway | AWS-native APIs | Token bucket | Yes (regional) | $3.50 per M requests; throttling free | 10 min |
| Kong Gateway | Self-hosted, polyglot APIs | Fixed, sliding, token bucket via plugins | Yes (Redis or Postgres backend) | Free OSS; Konnect from $250/mo | 1-2 days |
| Tyk | Self-hosted with strong analytics | Token bucket, quota | Yes (Redis) | Free OSS; Cloud from $600/mo | 1-2 days |
| Nginx limit_req | Single-node reverse proxy | Leaky bucket | No (per-node) | Free | 30 min |
| Custom Redis token bucket | You need exactly one thing and nothing else | Anything you write | Yes (Redis cluster) | Redis cost only | 1-3 days to harden |
| Vercel Edge Config + middleware + Upstash | Per-route Next.js limits with hot-reload config | Sliding window (Upstash) | Yes | Edge Config free tier + Upstash | 1 hour |

Tool-by-tool, with honest trade-offs

Upstash Ratelimit

The default choice for anything serverless or edge in 2026. The @upstash/ratelimit SDK gives you fixed-window, sliding-window, and token-bucket limiters (fixedWindow, slidingWindow, tokenBucket) backed by Upstash Redis over REST, so it works in Vercel Edge, Cloudflare Workers, Deno Deploy, and Bun without a TCP connection.

Where it wins: the REST-over-HTTPS protocol means no connection pool drama in serverless. Multi-region Redis replication keeps the rate-limit state consistent across edge POPs (with a few hundred ms of lag, fine for limits). The free tier of 10,000 commands/day covers most side projects.

Where it loses: you're paying per Redis command. At 50M requests/month with 2 commands per check, that's $200/mo just for rate limiting. Self-hosting a Redis cluster gets cheaper above that line. Also, the cross-region eventual consistency means a user can briefly exceed limits during the replication window.
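
The arithmetic behind that $200 figure is worth having as a function so you can plug in your own traffic. This is a back-of-the-envelope sketch: the function name is made up, the $0.20 per 100k commands rate is the pay-as-you-go price quoted in this guide, and it ignores the free tier and any volume discounts.

```typescript
// Rough monthly cost of per-command pricing; ignores free tier and discounts.
function estimateMonthlyCommandCost(
  requestsPerMonth: number,
  commandsPerCheck: number,     // Redis commands issued per rate-limit check
  pricePer100kCommands: number, // e.g. 0.20 for $0.20 per 100k commands
): number {
  const commands = requestsPerMonth * commandsPerCheck;
  return (commands / 100_000) * pricePer100kCommands;
}

// 50M requests * 2 commands = 100M commands = 1,000 blocks of 100k at $0.20.
const costAt50M = estimateMonthlyCommandCost(50_000_000, 2, 0.2);
console.log(`~$${costAt50M.toFixed(0)} per month`);
```

Compare the result against what a small self-hosted Redis cluster costs you (including the on-call time) to find your own break-even point.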

Cloudflare Rate Limiting

If your traffic already hits Cloudflare (it probably does), this is the zero-effort answer. You write rules in the dashboard, pick a counter (IP, header, JA3 fingerprint, URL path), set a threshold, and Cloudflare blocks at the edge before requests ever touch your origin.

Where it wins: absorbs the cost of malicious bursts at Cloudflare's edge, not yours. No code to ship. The 2025 advanced rate limiting added sliding-window evaluation and per-account custom counters.

Where it loses: Cloudflare-only. The rules language is its own thing; complex per-user logic (e.g. plan-aware quotas read from your database) is awkward and usually still belongs at the app layer. Free plan caps at 10k rules/month evaluated, which is one busy endpoint.

AWS API Gateway throttling

API Gateway gives you account-level, stage-level, and route-level throttling, plus per-API-key throttling configured through usage plans. All of it is token-bucket. Built in. Free with the gateway.

Where it wins: if your API already runs through API Gateway (REST or HTTP API), you get rate limiting for $0 extra. Usage plans (a REST API feature) pair cleanly with API keys for tiered SaaS pricing. CloudWatch metrics are built in.

Where it loses: the burst and rate values are coarse. You don't get per-user limits unless you mint an API key per user (operationally painful). Cold-start Lambda plus API Gateway plus throttling adds latency budget you didn't plan for. And the throttle returns 429 with no Retry-After header by default, which annoys clients.

Kong Gateway

The full-featured open-source API gateway. The rate-limiting and rate-limiting-advanced plugins support all four algorithms, multiple storage backends (Redis, Postgres, local), and consumer-aware policies.

Where it wins: if you're already running Kong for routing, auth, and transformation, the rate-limiting plugin is two lines of config. Battle-tested at high scale (Kong serves trillions of requests). The OSS edition is genuinely capable.

Where it loses: running Kong is its own job. You need a control plane, a database, a Redis cluster, and someone who understands the plugin order. For a small startup that just needs rate limits, Kong is overkill. Konnect (managed) starts at $250/mo and goes up fast.

Tyk

Kong's biggest open-source competitor. Better analytics dashboard out of the box, slightly easier developer portal.

Where it wins: the developer portal and key-management UX are nicer than Kong's. Quota-based limiting (e.g. "10,000 requests per month per key") is first-class, not a plugin.

Where it loses: smaller community, fewer plugins. If you Google a problem, you'll find a Kong answer faster than a Tyk one in 2026.

Nginx limit_req

The classic. limit_req_zone plus limit_req in your nginx.conf: leaky-bucket, with counters held in a shared-memory zone on each node.

Where it wins: free, fast, no extra moving parts. Perfect for a single-node setup or a reverse proxy in front of one service.

Where it loses: each Nginx node counts independently. Two pods means a user can hit 2 * limit. There's no shared state. People hack around this with nginx-module-redis2 or moving to OpenResty plus Lua, at which point you've built a worse Kong. Skip for anything distributed.

Custom Redis token bucket

You write 30 lines of Lua (atomic via EVAL) for a real token bucket, or a TypeScript helper that does INCR plus EXPIRE, which is strictly a fixed-window counter. Either way, you own it.

Where it wins: zero vendor lock-in, free except for Redis. You can encode any business rule you want (per-tier limits, grace periods, soft warns, custom keys). The Stripe blog has a famous reference implementation that thousands of teams have copied.

Where it loses: edge cases eat weeks. Clock skew across instances, atomicity, fail-open vs fail-closed on Redis outage, observability, admin tooling to bump a user's quota. If rate limiting is not your core competency, buy.
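
Of those edge cases, fail-open vs fail-closed is the one you can settle in a dozen lines. The sketch below wraps any async limit check (for example, one that calls Redis) and decides what happens when the check itself throws. The type and function names are hypothetical.

```typescript
// Illustrative wrapper; RateLimitCheck and withFailureMode are hypothetical names.
type RateLimitCheck = (key: string) => Promise<boolean>; // true = allow
type FailureMode = "open" | "closed";

function withFailureMode(
  check: RateLimitCheck,
  mode: FailureMode,
  onError?: (err: unknown) => void, // hook for logging or metrics
): RateLimitCheck {
  return async (key: string): Promise<boolean> => {
    try {
      return await check(key);
    } catch (err) {
      // The backing store is unreachable or the check crashed.
      if (onError) onError(err);
      // "open" lets traffic through; "closed" rejects it.
      return mode === "open";
    }
  };
}
```

A reasonable split is fail-open on ordinary product routes and fail-closed on expensive or abuse-prone ones (login, password reset, anything that costs you money per call), but that is a product decision, not a technical one.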

Vercel Edge Config plus middleware plus Upstash

Vercel's idiomatic pattern: store the rate-limit config (limits per route, per plan) in Edge Config for instant global propagation, then enforce in middleware.ts using @upstash/ratelimit. Editing a limit takes effect globally in seconds without a redeploy.
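
A sketch of what that config might look like, and how middleware could resolve a limit from it. The shape, the plan names, the routes, and the "*" fallback key are all assumptions for illustration; in a real setup the object would be read from Edge Config rather than hard-coded, and the resolved numbers would be handed to the limiter.

```typescript
// Hypothetical config shape: route -> plan -> limit. Not a Vercel-defined schema.
type Plan = "free" | "pro";

interface RouteLimit {
  limit: number;         // requests allowed per window
  windowSeconds: number; // window length
}

type LimitConfig = Record<string, Partial<Record<Plan, RouteLimit>>>;

// Example values only; "*" is an assumed catch-all for unlisted routes.
const exampleLimitConfig: LimitConfig = {
  "/api/search": {
    free: { limit: 10, windowSeconds: 60 },
    pro: { limit: 100, windowSeconds: 60 },
  },
  "*": {
    free: { limit: 60, windowSeconds: 60 },
    pro: { limit: 600, windowSeconds: 60 },
  },
};

// Prefer a route-specific limit, then fall back to the catch-all entry.
function resolveRouteLimit(
  config: LimitConfig,
  route: string,
  plan: Plan,
): RouteLimit | undefined {
  return config[route]?.[plan] ?? config["*"]?.[plan];
}
```

Keeping the config as plain data like this is what makes the "change limits without shipping code" promise work: the middleware code never changes, only the object it reads.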

Where it wins: product and ops teams can change limits without shipping code. Limits are co-located with routing, so a single middleware handles auth plus rate limiting plus geo logic.

Where it loses: locks you into Vercel for the config side. Edge Config has its own pricing curve. The same Upstash cost concerns apply at scale. And the moment you outgrow Vercel, you face the platform choices covered in our guide to the best deployment platforms for startups.

What to do this week

  1. Pick the algorithm: sliding window for billing-grade fairness, token bucket if you want bursts, fixed window only if you don't care about boundary abuse.
  2. Match to your runtime: edge/serverless picks Upstash; AWS-native picks API Gateway; self-host picks Kong; behind Cloudflare picks Cloudflare Rate Limiting.
  3. Set fail-open vs fail-closed. If Redis is down, do you reject everyone (safe but breaks the product) or allow everyone (risky but available)? Decide before the outage.
  4. Add Retry-After and X-RateLimit-Remaining headers. Clients can back off; your support inbox stays quiet.
  5. Instrument. You want a Grafana panel of 429s per route per minute. If you can't see it, you can't tune it.
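
For step 4, a small helper that turns a limiter result into response headers might look like the sketch below. The `LimitOutcome` shape is an assumption (most limiters return something similar, but field names vary), and Retry-After is emitted in whole seconds.

```typescript
// Hypothetical result shape; adapt the field names to your limiter.
interface LimitOutcome {
  allowed: boolean;
  remaining: number; // requests left in the current window
  resetAtMs: number; // when the client may try again (epoch ms)
}

function rateLimitHeaders(
  outcome: LimitOutcome,
  nowMs: number,
): Record<string, string> {
  const headers: Record<string, string> = {
    "X-RateLimit-Remaining": String(Math.max(0, outcome.remaining)),
  };
  if (!outcome.allowed) {
    // Round up so clients never retry a fraction of a second too early.
    const seconds = Math.max(1, Math.ceil((outcome.resetAtMs - nowMs) / 1000));
    headers["Retry-After"] = String(seconds);
  }
  return headers;
}
```

Attach these to every response, not just the 429s: a client that can watch X-RateLimit-Remaining tick down can slow itself before it ever gets rejected.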

If your team is shipping fast and doesn't have someone who's done this before, the cheap move is to book a mid-tier Cadence engineer for one week ($1,000) to wire up Upstash plus middleware plus headers plus a basic Grafana panel, then hand back the runbook. Every engineer on Cadence is AI-native, vetted on Cursor / Claude Code / Copilot fluency before they unlock bookings, so the boilerplate (Lua scripts, middleware, tests) gets generated in hours instead of days. Cadence's pool of 12,800 engineers includes a long list of people who've shipped Kong, Tyk, and Upstash setups in production.

Not sure if your current stack is even worth keeping? Audit your tooling with Ship or Skip for an honest grade on what to keep, replace, or rip out. Five minutes, no signup.

Common mistakes we see weekly

  • Rate-limiting on IP only. One office NAT, one CGNAT carrier, and you've blocked a thousand users. Limit on user ID or API key first; fall back to IP only for unauthenticated routes.
  • Using fixed-window for billing-grade limits. The boundary burst will get flagged by an angry enterprise customer eventually.
  • Forgetting Retry-After. Clients retry instantly, you DDoS yourself.
  • Building before measuring. Most teams need limits on 3 routes, not 300. Look at your logs.
  • Setting limits without an exception path. When sales lands a whale who needs 10x, you want a database flag, not a config file deploy.
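
Two of these mistakes (IP-only keys, no exception path) come down to a few lines each. The sketch below shows one possible shape: pick the most specific identity available as the limiter key, and look up a per-key override before falling back to the default. All names are hypothetical, and the overrides Map stands in for whatever database table or flag store you use.

```typescript
// Hypothetical request identity; fill it in from your auth layer.
interface CallerIdentity {
  userId?: string;
  apiKey?: string;
  ip: string;
}

// Prefer user ID, then API key; use IP only for unauthenticated callers.
function limiterKeyFor(caller: CallerIdentity): string {
  if (caller.userId) return `user:${caller.userId}`;
  if (caller.apiKey) return `key:${caller.apiKey}`;
  return `ip:${caller.ip}`;
}

// Exception path: a per-key override (e.g. loaded from a database flag)
// wins over the default limit.
function effectiveLimit(
  key: string,
  defaultLimit: number,
  overrides: Map<string, number>,
): number {
  return overrides.get(key) ?? defaultLimit;
}
```

When sales lands the whale, raising their quota becomes a single row update instead of a deploy.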

A weekly cadence of reviewing 429 rates and adjusting limits beats getting the initial numbers right. The same logic applies to picking the rest of your stack: see our take on Resend for transactional email and Drizzle ORM for TypeScript for similarly opinionated 2026 picks.

Who should pick what

  • Solo founder on Vercel: Upstash + middleware. Two hours, done.
  • Series A SaaS with API tiers: Upstash sliding window keyed by user ID, with limits sourced from your billing database. Add Cloudflare Rate Limiting in front for DDoS shielding.
  • Enterprise platform team: Kong or Tyk, self-hosted, with Redis backend and your own observability stack.
  • Startup behind AWS API Gateway: stick with native throttling until you outgrow it (usually around the point you need per-user limits).
  • Internal API only: Nginx limit_req is fine; don't over-engineer.

If you need to ship one of these this sprint, book a senior engineer for a week ($1,500) and you'll have rate limiting, observability, and a runbook by Friday. We pay engineers Friday for the week's work, so they're motivated to ship.

FAQ

Is Upstash Ratelimit worth the money?

For serverless and edge workloads under 20M requests/month, yes. The free tier covers small projects, and the REST-over-HTTPS protocol saves you from connection-pool headaches that kill performance on Lambda and Vercel Functions. Above 50M requests/month, self-hosting Redis gets cheaper.

Cloudflare Rate Limiting vs Upstash: which should I pick?

Cloudflare wins for DDoS-grade protection at the network edge and zero code changes. Upstash wins for application-level logic (per-user, plan-aware, custom keys). Most production setups use both: Cloudflare for the crude bot blocking, Upstash inside the app for fair quotas.

Can I use Kong for free?

Yes, Kong Gateway OSS is genuinely free and production-grade. The paid Konnect tier adds a managed control plane, advanced analytics, and the rate-limiting-advanced plugin (sliding window, multi-region sync). Most startups can run Kong OSS for years before needing the upgrade.

How does Nginx limit_req compare to a custom Redis bucket?

Nginx is faster (in-process) but per-node, so it's broken the moment you horizontally scale. A Redis-backed bucket is distributed but adds a network hop (1-3ms typically). For one-node setups, use Nginx; for anything else, use Redis or a managed service.

What's the best rate-limiting algorithm for a public API?

Sliding window counter is the 2026 default for public APIs. It avoids the fixed-window boundary burst, uses constant memory (unlike sliding logs), and matches what GitHub and Discord use (Stripe, as noted earlier, uses token bucket). Upstash Ratelimit's slidingWindow is the easiest off-the-shelf implementation.
