
The best API rate limiters in 2026 are Upstash Ratelimit for serverless and edge apps (sub-10ms latency, free up to 10k commands/day), Cloudflare Rate Limiting for sites already on Cloudflare (zero-config DDoS shielding), and Kong or a custom Redis token-bucket for self-hosted APIs that need fine-grained control. AWS API Gateway throttling is the right default if you already live in AWS. Skip Nginx limit_req for anything multi-region.
Rate limiting is one of those problems that looks trivial until it isn't. A single endpoint with a setInterval counter works fine until you scale to two pods, then it's broken. By the time you need per-user limits, sliding windows, and burst tolerance across an edge network, you're picking between five categories of tool with very different trade-offs. This is the honest 2026 buyer's guide.
Pick by where your API runs:
limit_req works and is free.
Most teams pick the tool before they pick the algorithm, then bend the algorithm to fit the tool. That's backwards. The four algorithms have meaningfully different behaviors under burst load, and the choice cascades into your tooling shortlist.
Token bucket: a bucket holds N tokens. Each request consumes one. Tokens refill at rate R per second. If the bucket is empty, reject.
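The token-bucket rule above can be sketched in a few lines. This is a minimal in-memory illustration, not a production limiter: the class name `TokenBucket`, the method `tryConsume`, and the injectable `now` clock are all assumptions made for the example, and a real deployment would keep the state in a shared store.

```typescript
// Minimal in-memory token bucket (illustrative names, single process only).
class TokenBucket {
  private tokens: number;
  private lastRefillMs: number;
  private readonly capacity: number; // N: maximum tokens the bucket holds
  private readonly refillPerSec: number; // R: tokens added per second
  private readonly now: () => number; // injectable clock, handy for tests

  constructor(capacity: number, refillPerSec: number, now: () => number = () => Date.now()) {
    this.capacity = capacity;
    this.refillPerSec = refillPerSec;
    this.now = now;
    this.tokens = capacity; // start full so an initial burst of N is allowed
    this.lastRefillMs = now();
  }

  // Returns true if the request may proceed, false if it should be rejected.
  tryConsume(): boolean {
    const nowMs = this.now();
    const elapsedSec = (nowMs - this.lastRefillMs) / 1000;
    // Refill lazily based on elapsed time, never exceeding capacity.
    this.tokens = Math.min(this.capacity, this.tokens + elapsedSec * this.refillPerSec);
    this.lastRefillMs = nowMs;
    if (this.tokens >= 1) {
      this.tokens -= 1;
      return true;
    }
    return false;
  }
}
```

With `new TokenBucket(2, 1)`, two back-to-back requests pass, a third is rejected, and one more is admitted after roughly a second of refill.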
Leaky bucket: requests enter a queue. The queue drains at a fixed rate. If the queue is full, reject (or block).
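A rough way to model the queue described above is to track only its depth, which leaks away at the drain rate. The sketch below does that; it does not hold or dispatch requests itself, and the names `LeakyBucket` and `tryEnqueue` are hypothetical.

```typescript
// Leaky bucket modeled as a meter: `level` is the current queue depth.
class LeakyBucket {
  private level = 0;
  private lastLeakMs: number;
  private readonly capacity: number; // maximum queue depth
  private readonly drainPerSec: number; // fixed drain rate
  private readonly now: () => number; // injectable clock, handy for tests

  constructor(capacity: number, drainPerSec: number, now: () => number = () => Date.now()) {
    this.capacity = capacity;
    this.drainPerSec = drainPerSec;
    this.now = now;
    this.lastLeakMs = now();
  }

  // Returns true if the request fits in the queue, false if the queue is full.
  tryEnqueue(): boolean {
    const nowMs = this.now();
    const drained = ((nowMs - this.lastLeakMs) / 1000) * this.drainPerSec;
    // Drain whatever leaked out since the last call, never below empty.
    this.level = Math.max(0, this.level - drained);
    this.lastLeakMs = nowMs;
    if (this.level + 1 > this.capacity) {
      return false;
    }
    this.level += 1;
    return true;
  }
}
```

Unlike the token bucket, this shape smooths output to the drain rate rather than letting a saved-up burst through at once.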
Fixed window: count requests in a wall-clock window (e.g. per minute). Reset to zero at the window boundary.
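As a sketch of the fixed-window idea, the counter below keys each count by user and window number. The names are illustrative, and the in-memory `Map` never evicts old windows (a Redis version would rely on key expiry instead). It also makes the known weakness of this scheme easy to see: a client can spend a full quota just before the boundary and another full quota just after it.

```typescript
// Fixed-window counter keyed by (key, window number). Illustrative only.
class FixedWindowCounter {
  private readonly counts = new Map<string, number>();
  private readonly limit: number;
  private readonly windowMs: number;

  constructor(limit: number, windowMs: number) {
    this.limit = limit;
    this.windowMs = windowMs;
  }

  // nowMs is passed in so the behavior at window boundaries is easy to test.
  allow(key: string, nowMs: number): boolean {
    const windowId = Math.floor(nowMs / this.windowMs);
    const bucketKey = `${key}:${windowId}`;
    const count = (this.counts.get(bucketKey) ?? 0) + 1;
    this.counts.set(bucketKey, count);
    return count <= this.limit;
  }
}

// Boundary burst with a limit of 2 per minute:
const fixedDemo = new FixedWindowCounter(2, 60000);
const burst = [59000, 59500, 60000, 60100].map((t) => fixedDemo.allow("user-1", t));
// All four requests pass even though they arrive within about 1.1 seconds.
```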
Sliding window: either store every request timestamp and count timestamps in the last T seconds (sliding log), or interpolate between two adjacent fixed windows (sliding window counter).
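The interpolation variant can be sketched as follows. The estimate weights the previous window's count by how much of it still overlaps the trailing window, then adds the current window's count. This is one common formulation, written here as an assumption rather than a description of any specific vendor's implementation; the class and method names are hypothetical.

```typescript
// Sliding window counter: interpolates between two adjacent fixed windows.
class SlidingWindowCounter {
  private readonly counts = new Map<string, number>();
  private readonly limit: number;
  private readonly windowMs: number;

  constructor(limit: number, windowMs: number) {
    this.limit = limit;
    this.windowMs = windowMs;
  }

  allow(key: string, nowMs: number): boolean {
    const windowId = Math.floor(nowMs / this.windowMs);
    // Fraction of the current fixed window that has already elapsed (0..1).
    const elapsedFraction = (nowMs % this.windowMs) / this.windowMs;
    const current = this.counts.get(`${key}:${windowId}`) ?? 0;
    const previous = this.counts.get(`${key}:${windowId - 1}`) ?? 0;
    // Weighted estimate of requests in the trailing window of length windowMs.
    const estimated = previous * (1 - elapsedFraction) + current;
    if (estimated + 1 > this.limit) {
      return false;
    }
    this.counts.set(`${key}:${windowId}`, current + 1);
    return true;
  }
}
```

With a limit of 2 per minute, two requests at 59.0s and 59.5s pass, a request at 60.0s is rejected because the previous window still carries full weight, and a request at 90s passes once that weight has halved. Memory stays at two counters per key, which is the trade-off against the sliding log.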
slidingWindow, GitHub's API.
Match the algorithm to the contract you're enforcing. Then shortlist tools that implement it well.
| Tool | Best for | Algorithms | Distributed | Pricing | Setup time |
|---|---|---|---|---|---|
| Upstash Ratelimit | Vercel, Cloudflare Workers, Next.js edge | Fixed, sliding, token bucket | Yes (Redis multi-region) | Free 10k cmds/day; pay-as-you-go $0.20/100k | 5 min |
| Cloudflare Rate Limiting | Anyone behind Cloudflare | Fixed window, sliding window | Yes (every edge POP) | Free 10k rules/mo; $5/mo per 1M requests after | 2 min (dashboard) |
| AWS API Gateway | AWS-native APIs | Token bucket | Yes (regional) | $3.50 per M requests; throttling free | 10 min |
| Kong Gateway | Self-hosted, polyglot APIs | Fixed, sliding, token bucket via plugins | Yes (Redis or Postgres backend) | Free OSS; Konnect from $250/mo | 1-2 days |
| Tyk | Self-hosted with strong analytics | Token bucket, quota | Yes (Redis) | Free OSS; Cloud from $600/mo | 1-2 days |
| Nginx limit_req | Single-node reverse proxy | Leaky bucket | No (per-node) | Free | 30 min |
| Custom Redis token bucket | You need exactly one thing and nothing else | Anything you write | Yes (Redis cluster) | Redis cost only | 1-3 days to harden |
| Vercel Edge Config + middleware + Upstash | Per-route Next.js limits with hot-reload config | Sliding window (Upstash) | Yes | Edge Config free tier + Upstash | 1 hour |
Upstash Ratelimit is the default choice for anything serverless or edge in 2026. The @upstash/ratelimit SDK gives you three algorithms (fixedWindow, slidingWindow, tokenBucket) backed by Upstash Redis over REST, so it works in Vercel Edge, Cloudflare Workers, Deno Deploy, and Bun without a TCP connection.
Where it wins: the REST-over-HTTPS protocol means no connection pool drama in serverless. Multi-region Redis replication keeps the rate-limit state consistent across edge POPs (with a few hundred ms of lag, fine for limits). The free tier of 10,000 commands/day covers most side projects.
Where it loses: you're paying per Redis command. At 50M requests/month with 2 commands per check, that's $200/mo just for rate limiting. Self-hosting a Redis cluster gets cheaper above that line. Also, the cross-region eventual consistency means a user can briefly exceed limits during the replication window.
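The $200 figure follows directly from the per-command price quoted in the comparison table. Here is the arithmetic as a small sketch; the function name is made up for illustration, and it deliberately ignores the free tier and any volume discounts.

```typescript
// Estimated monthly spend when every rate-limit check costs a fixed number of commands.
function monthlyCommandCostUsd(
  requestsPerMonth: number,
  commandsPerCheck: number,
  usdPer100kCommands: number,
): number {
  const commands = requestsPerMonth * commandsPerCheck;
  return (commands / 100000) * usdPer100kCommands;
}

// 50M requests x 2 commands = 100M commands; 1,000 blocks of 100k at $0.20 each.
const estimateUsd = monthlyCommandCostUsd(50000000, 2, 0.2);
console.log(estimateUsd.toFixed(2)); // prints "200.00"
```

Plugging in your own traffic and commands-per-check is a quick way to find the point where a self-hosted Redis becomes the cheaper option.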
If your traffic already hits Cloudflare (it probably does), Cloudflare Rate Limiting is the zero-effort answer. You write rules in the dashboard, pick a counter (IP, header, JA3 fingerprint, URL path), set a threshold, and Cloudflare blocks at the edge before requests ever touch your origin.
Where it wins: absorbs the cost of malicious bursts at Cloudflare's edge, not yours. No code to ship. The 2025 advanced rate limiting added sliding-window evaluation and per-account custom counters.
Where it loses: Cloudflare-only. The rules language is its own thing; complex per-user logic (e.g. plan-aware quotas read from your database) is awkward and usually still belongs at the app layer. Free plan caps at 10k rules/month evaluated, which is one busy endpoint.
API Gateway gives you account-level, stage-level, route-level, and per-API-key throttling, all token-bucket, configured per usage plan. Built in. Free with the gateway.
Where it wins: if your API already runs through API Gateway (REST or HTTP API), you get rate limiting for $0 extra. Usage plans pair cleanly with API keys for tiered SaaS pricing. CloudWatch metrics are built in.
Where it loses: the burst and rate values are coarse. You don't get per-user limits unless you mint an API key per user (operationally painful). Cold-start Lambda plus API Gateway plus throttling adds latency you didn't budget for. And the throttle returns 429 with no Retry-After header by default, which annoys clients.
Kong Gateway is the full-featured open-source API gateway. The rate-limiting and rate-limiting-advanced plugins support all four algorithms, multiple storage backends (Redis, Postgres, local), and consumer-aware policies.
Where it wins: if you're already running Kong for routing, auth, and transformation, the rate-limiting plugin is two lines of config. Battle-tested at high scale (Kong serves trillions of requests). The OSS edition is genuinely capable.
Where it loses: running Kong is its own job. You need a control plane, a database, a Redis cluster, and someone who understands the plugin order. For a small startup that just needs rate limits, Kong is overkill. Konnect (managed) starts at $250/mo and goes up fast.
Tyk is Kong's biggest open-source competitor. It has a better analytics dashboard out of the box and a slightly easier developer portal.
Where it wins: the developer portal and key-management UX are nicer than Kong's. Quota-based limiting (e.g. "10,000 requests per month per key") is first-class, not a plugin.
Where it loses: smaller community, fewer plugins. If you Google a problem, you'll find a Kong answer faster than a Tyk one in 2026.
Nginx limit_req is the classic: limit_req_zone plus limit_req in your nginx.conf, leaky-bucket, in-process counters.
Where it wins: free, fast, no extra moving parts. Perfect for a single-node setup or a reverse proxy in front of one service.
Where it loses: each Nginx node counts independently. Two pods means a user can hit 2 * limit. There's no shared state. People hack around this with redis2-nginx-module or by moving to OpenResty plus Lua, at which point you've built a worse Kong. Skip for anything distributed.
A custom Redis token bucket means you write 30 lines of Lua (atomic via EVAL) or a TypeScript helper that does INCR plus EXPIRE. You own it.
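As a rough sketch of the INCR plus EXPIRE helper: note that this shape is a fixed-window counter, not a true token bucket. `CounterStore` is a hypothetical interface shaped like the Redis INCR and EXPIRE commands; check your client's actual method signatures before wiring it up. The function name and key format are also assumptions.

```typescript
// Hypothetical minimal store interface, shaped like Redis INCR / EXPIRE.
interface CounterStore {
  incr(key: string): Promise<number>;
  expire(key: string, seconds: number): Promise<unknown>;
}

interface CheckResult {
  allowed: boolean;
  remaining: number;
}

// Fixed-window check: one counter per user per window.
async function fixedWindowCheck(
  store: CounterStore,
  userId: string,
  limit: number,
  windowSec: number,
  nowMs: number = Date.now(),
): Promise<CheckResult> {
  const windowId = Math.floor(nowMs / (windowSec * 1000));
  const key = `ratelimit:${userId}:${windowId}`;
  const count = await store.incr(key);
  if (count === 1) {
    // First hit in this window: set a TTL so the key cleans itself up.
    // INCR followed by EXPIRE is two round trips and not atomic; a crash
    // between them leaves a key with no TTL.
    await store.expire(key, windowSec);
  }
  return { allowed: count <= limit, remaining: Math.max(0, limit - count) };
}
```

The non-atomic gap between the two commands is the usual reason teams move this logic into a Lua script run via EVAL, where the increment and the expiry happen in one step.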
Where it wins: zero vendor lock-in, free except for Redis. You can encode any business rule you want (per-tier limits, grace periods, soft warns, custom keys). The Stripe blog has a famous reference implementation that thousands of teams have copied.
Where it loses: edge cases eat weeks. Clock skew across instances, atomicity, fail-open vs fail-closed on Redis outage, observability, admin tooling to bump a user's quota. If rate limiting is not your core competency, buy.
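One of those edge cases, fail-open versus fail-closed, can be made explicit with a small wrapper. The sketch below is an assumption about how you might structure it, not a recommended policy: `checkWithFallback`, `withTimeout`, and the `degraded` flag are illustrative names, and `check` stands in for whatever call reaches your limiter.

```typescript
type FailureMode = "fail-open" | "fail-closed";

interface Decision {
  allowed: boolean;
  degraded: boolean; // true when the limiter itself failed or timed out
}

// Rejects if `work` does not settle within timeoutMs.
function withTimeout<T>(work: Promise<T>, timeoutMs: number): Promise<T> {
  return new Promise<T>((resolve, reject) => {
    const timer = setTimeout(() => reject(new Error("rate limiter timed out")), timeoutMs);
    work.then(
      (value) => {
        clearTimeout(timer);
        resolve(value);
      },
      (error) => {
        clearTimeout(timer);
        reject(error);
      },
    );
  });
}

// Wraps a limiter call so an outage yields an explicit, configured decision.
async function checkWithFallback(
  check: () => Promise<boolean>,
  mode: FailureMode,
  timeoutMs: number,
): Promise<Decision> {
  try {
    const allowed = await withTimeout(check(), timeoutMs);
    return { allowed, degraded: false };
  } catch {
    // Limiter unavailable: fail-open lets traffic through, fail-closed rejects it.
    return { allowed: mode === "fail-open", degraded: true };
  }
}
```

Emitting a metric whenever `degraded` is true makes it visible when you are running without effective limits.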
Vercel's idiomatic pattern: store the rate-limit config (limits per route, per plan) in Edge Config for instant global propagation, then enforce in middleware.ts using @upstash/ratelimit. Editing a limit takes effect globally in seconds without a redeploy.
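To make the "limits per route, per plan" idea concrete, here is one possible shape for that config and a lookup helper. Everything in it is hypothetical: the plan names, the routes, the numbers, and `resolveLimit`. In a real setup the object would be read from Edge Config and the resolved limit passed to the limiter in middleware; neither of those calls is shown here.

```typescript
type Plan = "free" | "pro";

interface RouteLimit {
  requests: number;
  windowSec: number;
}

// Route path -> plan -> limit. A missing entry falls back to a default.
type LimitConfig = Record<string, Partial<Record<Plan, RouteLimit>>>;

function resolveLimit(
  config: LimitConfig,
  route: string,
  plan: Plan,
  fallback: RouteLimit,
): RouteLimit {
  return config[route]?.[plan] ?? fallback;
}

// Example config with made-up routes and numbers.
const exampleConfig: LimitConfig = {
  "/api/search": { free: { requests: 10, windowSec: 60 }, pro: { requests: 100, windowSec: 60 } },
  "/api/export": { pro: { requests: 5, windowSec: 3600 } },
};
const defaultLimit: RouteLimit = { requests: 30, windowSec: 60 };

const searchFree = resolveLimit(exampleConfig, "/api/search", "free", defaultLimit);
const exportFree = resolveLimit(exampleConfig, "/api/export", "free", defaultLimit);
// searchFree is the configured 10 per 60s; exportFree falls back to the default.
```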
Where it wins: product and ops teams can change limits without shipping code. Limits are co-located with routing, so a single middleware handles auth plus rate limiting plus geo logic.
Where it loses: locks you into Vercel for the config side. Edge Config has its own pricing curve. Same Upstash cost concerns apply at scale. The same setup choices (similar to those covered in our guide to the best deployment platforms for startups) come up the moment you outgrow Vercel.
Retry-After and X-RateLimit-Remaining headers. Clients can back off; your support inbox stays quiet.
429s per route per minute. If you can't see it, you can't tune it.
If your team is shipping fast and doesn't have someone who's done this before, the cheap move is to book a mid-tier Cadence engineer for one week ($1,000) to wire up Upstash plus middleware plus headers plus a basic Grafana panel, then hand back the runbook. Every engineer on Cadence is AI-native, vetted on Cursor / Claude Code / Copilot fluency before they unlock bookings, so the boilerplate (Lua scripts, middleware, tests) gets generated in hours instead of days. Cadence's pool of 12,800 engineers includes a long list of people who've shipped Kong, Tyk, and Upstash setups in production.
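For the Retry-After and X-RateLimit-Remaining headers mentioned above, a small helper keeps the header logic in one place. This is a sketch under assumptions: the `LimitResult` shape and the function name are hypothetical, and your limiter may report its reset time differently. Retry-After is sent here as a whole number of seconds.

```typescript
// Hypothetical result shape returned by whatever limiter you use.
interface LimitResult {
  allowed: boolean;
  remaining: number;
  resetAtMs: number; // epoch milliseconds when the client may try again
}

// Builds the response headers for a rate-limit decision.
function rateLimitHeaders(result: LimitResult, nowMs: number): Record<string, string> {
  const headers: Record<string, string> = {
    "X-RateLimit-Remaining": String(Math.max(0, result.remaining)),
  };
  if (!result.allowed) {
    // Round up and never send 0, so clients always wait at least one second.
    const retryAfterSec = Math.max(1, Math.ceil((result.resetAtMs - nowMs) / 1000));
    headers["Retry-After"] = String(retryAfterSec);
  }
  return headers;
}
```

On a rejected request you would return status 429 with these headers; on an allowed request only the remaining count is attached.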
Not sure if your current stack is even worth keeping? Audit your tooling with Ship or Skip for an honest grade on what to keep, replace, or rip out. Five minutes, no signup.
Retry-After. Clients retry instantly, you DDoS yourself.
A weekly cadence of reviewing 429 rates and adjusting limits beats getting the initial numbers right. The same logic applies to picking the rest of your stack: see our take on Resend for transactional email and Drizzle ORM for TypeScript for similarly opinionated 2026 picks.
limit_req is fine; don't over-engineer.
If you need to ship one of these this sprint, book a senior engineer for a week ($1,500) and you'll have rate limiting, observability, and a runbook by Friday. We pay engineers Friday for the week's work, so they're motivated to ship.
For serverless and edge workloads under 20M requests/month, yes, Upstash Ratelimit is worth it. The free tier covers small projects, and the REST-over-HTTPS protocol saves you from connection-pool headaches that kill performance on Lambda and Vercel Functions. Above 50M requests/month, self-hosting Redis gets cheaper.
Cloudflare wins for DDoS-grade protection at the network edge and zero code changes. Upstash wins for application-level logic (per-user, plan-aware, custom keys). Most production setups use both: Cloudflare for the crude bot blocking, Upstash inside the app for fair quotas.
Yes, Kong Gateway OSS is genuinely free and production-grade. The paid Konnect tier adds a managed control plane, advanced analytics, and the rate-limiting-advanced plugin (sliding window, multi-region sync). Most startups can run Kong OSS for years before needing the upgrade.
How does Nginx limit_req compare to a custom Redis bucket?
Nginx is faster (in-process) but per-node, so it's broken the moment you horizontally scale. A Redis-backed bucket is distributed but adds a network hop (1-3ms typically). For one-node setups, use Nginx; for anything else, use Redis or a managed service.
Sliding window counter is the 2026 default for public APIs. It avoids the fixed-window boundary burst, uses constant memory (unlike sliding logs), and matches what GitHub, Stripe, and Discord use. Upstash Ratelimit's slidingWindow is the easiest off-the-shelf implementation.