
To secure a SaaS API endpoint, stack defenses in layers: TLS in transit, authenticated sessions or JWTs at the edge, per-object authorization in the handler, schema-validated input with Zod, rate limits per identity, strict CORS, and an immutable audit log of every state change. No single control protects you. The first six lines of your Next.js route handler decide whether the next breach is yours.
Customers ship raw HTTP clients now. Half your API traffic is no longer your own React frontend; it is a Zapier flow, a customer's Python script, an internal AI agent calling tools through MCP. Browser-only assumptions (same-origin cookies, CSRF tokens, a single login form) cover maybe 40 percent of real calls.
A leaked API token used by an agent runs at 50 requests per second instead of one click per minute. A BOLA bug that hid behind UI checks is now hit by every customer integration test. LLM-generated client code is excellent at finding mass-assignment bugs because it sends every field it can guess. Security has to live in the API, not the app shell.
Most teams reach for a middleware stack: a CORS plugin, a JWT verifier, maybe express-rate-limit, and "we'll validate the body later." It works fine when there are three endpoints. It collapses around endpoint 40, when the team realizes the authorization check is copy-pasted into every handler, half of them check the wrong field, and nobody knows which routes are actually exposed to unauthenticated traffic.
The failure mode is consistent. The OWASP API Security Top 10 has put Broken Object-Level Authorization (BOLA) at number one in both of its editions (2019 and 2023), because the bug is structural: route-level auth says "this user can hit /api/invoices/:id," but the handler forgets to check that the invoice belongs to that user. Both Optus (2022) and the more recent Authy enumeration leak came down to endpoints that handed out records without checking the caller's right to them.
Treat the list below as a checklist for every endpoint, not a one-time setup. Each layer assumes the one above it failed.
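Terminate TLS 1.3 at your edge (Vercel, Cloudflare, Fly). Set HSTS with a one-year max-age and preload. Reject plaintext HTTP at the load balancer instead of redirecting: by the time a redirect is issued, the first request's URL and any query-string tokens have already crossed the wire in cleartext, and the redirect hides that mistake from whoever wrote the client.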
Internal service-to-service traffic gets mTLS, typically through a service mesh such as Linkerd, or an encrypted overlay network such as Tailscale. The "internal network is trusted" assumption was killed by the 2024 Snowflake credential incident; act accordingly.
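To make the reject-instead-of-redirect rule concrete, here is a minimal sketch of the decision a middleware could make. It assumes your edge reports the original scheme in an `x-forwarded-proto` header (true of most proxies, but check yours); the function name and the 403 status are our own choices, not something a framework mandates.

```typescript
// Hypothetical helper: decide how to treat a request based on the scheme
// the edge reports. Pass in the value of the x-forwarded-proto header.
type TransportDecision =
  | { allow: true; headers: Record<string, string> }
  | { allow: false; status: number; body: string };

const ONE_YEAR_SECONDS = 60 * 60 * 24 * 365; // 31,536,000

function checkTransport(forwardedProto: string | null): TransportDecision {
  // Proxies may send a comma-separated chain; the first entry is the client hop.
  const first = (forwardedProto ?? "").split(",")[0] ?? "";
  const proto = first.trim().toLowerCase();

  if (proto !== "https") {
    // Fail closed rather than redirect, so a misconfigured client breaks loudly.
    return { allow: false, status: 403, body: "HTTPS required" };
  }

  return {
    allow: true,
    headers: {
      // The preload list requires includeSubDomains alongside a one-year max-age.
      "Strict-Transport-Security": `max-age=${ONE_YEAR_SECONDS}; includeSubDomains; preload`,
    },
  };
}
```

In a Next.js project this would be called from middleware.ts, with the returned headers copied onto the outgoing response.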
Browsers get HttpOnly, Secure, SameSite=Lax session cookies signed with a rotating key. Server-to-server clients get scoped API keys (prefix them, like Stripe's sk_live_, so leaks are greppable on GitHub). Mobile and third-party integrators get short-lived JWTs (15 minutes) with refresh tokens stored in a revocation table.
Never mix the models on the same route. A handler that accepts both a cookie and a bearer token is a CSRF vector waiting to happen. Use libraries that have done this work: Auth.js (formerly NextAuth) for sessions, Clerk or WorkOS for B2B SSO, Hanko for passkeys.
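One way to enforce "one credential model per route" is to have each route declare the model it accepts and refuse anything else. The sketch below is an illustration under assumptions: the cookie name `session`, the `sk_live_` prefix (borrowed from the Stripe example above), and the function names are all hypothetical, and real verification of the cookie, key, or JWT would happen after this step.

```typescript
type CredentialKind = "session" | "api_key" | "jwt";

type AuthResult =
  | { ok: true; kind: CredentialKind; value: string }
  | { ok: false; status: 400 | 401; error: string };

const SESSION_COOKIE = "session"; // hypothetical cookie name
const API_KEY_PREFIX = "sk_live_"; // hypothetical key prefix

// Pull one named cookie out of a raw Cookie header.
function readCookie(header: string, name: string): string | undefined {
  for (const part of header.split(";")) {
    const [key, ...rest] = part.trim().split("=");
    if (key === name) return rest.join("=");
  }
  return undefined;
}

function authenticate(
  headers: { cookie?: string; authorization?: string },
  accepts: CredentialKind
): AuthResult {
  const cookie = readCookie(headers.cookie ?? "", SESSION_COOKIE);
  const bearer = /^Bearer\s+(\S+)$/i.exec(headers.authorization ?? "")?.[1];

  // A request carrying both models is ambiguous; refuse it outright.
  if (cookie && bearer) {
    return { ok: false, status: 400, error: "mixed_credentials" };
  }

  let presented: { kind: CredentialKind; value: string } | undefined;
  if (cookie) {
    presented = { kind: "session", value: cookie };
  } else if (bearer) {
    const kind: CredentialKind = bearer.startsWith(API_KEY_PREFIX) ? "api_key" : "jwt";
    presented = { kind, value: bearer };
  }

  // The route only honors the single model it declared.
  if (!presented || presented.kind !== accepts) {
    return { ok: false, status: 401, error: "unauthenticated" };
  }
  return { ok: true, kind: presented.kind, value: presented.value };
}
```

A browser-facing route would call `authenticate(headers, "session")`, so a bearer token sent to it is rejected rather than silently accepted.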
Role-based access control answers "can this user call this route." That belongs in middleware. Attribute-based access control answers "can this user touch this row," and it must run inside the handler with the actual record loaded.
Postgres Row-Level Security is the cleanest pattern we've seen. Supabase made it mainstream; you can run the same CREATE POLICY statements on a vanilla Postgres database. RLS moves the check next to the data, which means a forgotten WHERE tenant_id = $1 cannot leak rows even if the handler is buggy. Pair it with a per-request session variable (SET LOCAL app.user_id = '...') and your worst-case BOLA is an empty result set (a 404) instead of a breach. The same logic underpins our multi-tenant Postgres schema guide, which goes deeper on tenant isolation patterns.
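As a rough sketch of the RLS pairing described above: the policy below scopes an assumed `invoices` table by `tenant_id`, and the wrapper sets the per-request variable inside a transaction. The table, policy name, and the `app.tenant_id` setting are illustrative assumptions (the same idea works with `app.user_id`). The `Queryable` interface is a stand-in modeled on the `query(text, values)` method of clients like node-postgres.

```typescript
// One-time setup, run as a migration. FORCE applies the policy to the table owner too.
// current_setting() raises an error when the variable is unset, so an unscoped
// connection fails closed instead of seeing every row.
const RLS_SETUP_SQL = `
ALTER TABLE invoices ENABLE ROW LEVEL SECURITY;
ALTER TABLE invoices FORCE ROW LEVEL SECURITY;
CREATE POLICY invoices_tenant_isolation ON invoices
  USING (tenant_id = current_setting('app.tenant_id')::uuid)
  WITH CHECK (tenant_id = current_setting('app.tenant_id')::uuid);
`;

// Minimal client shape; an assumption, not a specific driver's type.
interface Queryable {
  query(text: string, values?: unknown[]): Promise<unknown>;
}

async function withTenantScope<T>(
  client: Queryable,
  tenantId: string,
  work: (scoped: Queryable) => Promise<T>
): Promise<T> {
  await client.query("BEGIN");
  try {
    // set_config(name, value, is_local = true) is the parameterizable
    // equivalent of SET LOCAL: it lasts only until the transaction ends.
    await client.query("SELECT set_config('app.tenant_id', $1, true)", [tenantId]);
    const result = await work(client);
    await client.query("COMMIT");
    return result;
  } catch (err) {
    await client.query("ROLLBACK");
    throw err;
  }
}
```

Because the variable is transaction-local, a pooled connection cannot carry one tenant's scope into the next request.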
Every request body, query param, and path param gets parsed against a Zod schema before the handler runs. Reject on parse failure with a 400 and a structured error. Three things this prevents:
- Mass assignment: a client sends { "role": "admin" } and your ORM happily writes it. Whitelist fields explicitly.
- Type confusion: an id field arrives as ["1", "2"] and bypasses a string equality check.
- Prototype pollution: a payload carries __proto__ and shifts behavior elsewhere in the app.

Zod also gives you free OpenAPI docs through zod-to-openapi, which means your client SDKs and your validation share one source of truth.
Validation runs both ways. Define a separate Zod schema for what leaves the endpoint and parse the response object through it before serializing. This catches the moment a developer adds password_hash to the user model and forgets to filter it out of GET /api/users/me. The Trello "private board" leak in 2023 was exactly this pattern.
For HTML fields rendered by other clients, run DOMPurify on output, not just input. Stored XSS is the gift that keeps giving.
IP-based rate limiting is theater. Most automated abuse rotates through residential proxy pools at $2 per GB. Rate-limit on the authenticated identity (user ID, API key ID, tenant ID) with Upstash Redis or Cloudflare Rate Limiting, and apply a separate stricter limit on unauthenticated routes keyed on IP plus user agent hash.
Set three tiers. A burst limit (10 requests per second per user) catches scripts. A sustained limit (500 per minute) catches scrapers. A daily quota (50,000 calls on the free plan) protects your unit economics. Return 429 with a Retry-After header; do not return 403, because clients read a 403 as a permissions bug rather than a signal to back off and retry later.
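The three tiers can be sketched as fixed-window counters keyed on identity. This version keeps counters in process memory purely for illustration; in production they would live in a shared store such as Upstash Redis. The class and field names are our own, and the clock is injectable so the behavior can be tested.

```typescript
interface Tier {
  name: string;
  limit: number;
  windowMs: number;
}

// Numbers taken from the three tiers described above.
const TIERS: Tier[] = [
  { name: "burst", limit: 10, windowMs: 1_000 },
  { name: "sustained", limit: 500, windowMs: 60_000 },
  { name: "daily", limit: 50_000, windowMs: 86_400_000 },
];

type Verdict = { ok: true } | { ok: false; tier: string; retryAfterSeconds: number };

class TieredLimiter {
  private counters = new Map<string, { windowStart: number; count: number }>();
  private tiers: Tier[];
  private now: () => number;

  constructor(tiers: Tier[] = TIERS, now: () => number = Date.now) {
    this.tiers = tiers;
    this.now = now;
  }

  // identity is a user ID, API key ID, or tenant ID, never a bare IP.
  check(identity: string): Verdict {
    const t = this.now();
    // First pass: refuse if any tier is exhausted, without consuming quota.
    for (const tier of this.tiers) {
      const counter = this.current(identity, tier, t);
      if (counter.count >= tier.limit) {
        const retryMs = counter.windowStart + tier.windowMs - t;
        return {
          ok: false,
          tier: tier.name,
          retryAfterSeconds: Math.max(1, Math.ceil(retryMs / 1000)),
        };
      }
    }
    // Second pass: count the request against every tier.
    for (const tier of this.tiers) this.current(identity, tier, t).count += 1;
    return { ok: true };
  }

  // Fetch the counter for the current window, resetting it when the window rolls over.
  private current(identity: string, tier: Tier, t: number) {
    const key = `${tier.name}:${identity}`;
    const windowStart = Math.floor(t / tier.windowMs) * tier.windowMs;
    let counter = this.counters.get(key);
    if (!counter || counter.windowStart !== windowStart) {
      counter = { windowStart, count: 0 };
      this.counters.set(key, counter);
    }
    return counter;
  }
}
```

When `check` returns `ok: false`, the handler responds with 429 and sets Retry-After to `retryAfterSeconds`. Fixed windows allow a short double burst at window boundaries; a sliding window or token bucket tightens that at the cost of more state.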
Access-Control-Allow-Origin: * is acceptable on exactly one kind of endpoint: public, unauthenticated, read-only data (status pages, public price lists). Everywhere else, allowlist the exact origins. Never reflect the Origin header without validation, which is a common Next.js middleware bug.
Access-Control-Allow-Credentials: true combined with a reflected origin is the classic full account takeover. If you need credentials, hardcode the origin list.
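A hardcoded allowlist can be as small as the sketch below. The origins are placeholders, and the helper name is ours; the important property is the exact string match, with no regexes, suffix checks, or reflection of unknown origins.

```typescript
// Placeholder origins; replace with the exact origins of your own frontends.
const ALLOWED_ORIGINS = new Set<string>([
  "https://app.example.com",
  "https://admin.example.com",
]);

function corsHeaders(origin: string | null, allowCredentials: boolean): Record<string, string> {
  // Vary on Origin so a shared cache never serves one origin's headers to another.
  const headers: Record<string, string> = { Vary: "Origin" };

  // Unknown origin: emit no Access-Control-Allow-Origin, and the browser blocks the read.
  if (origin === null || !ALLOWED_ORIGINS.has(origin)) return headers;

  // Echo the origin only after it matched the allowlist exactly.
  headers["Access-Control-Allow-Origin"] = origin;
  if (allowCredentials) headers["Access-Control-Allow-Credentials"] = "true";
  return headers;
}
```

Lookalike origins such as `https://app.example.com.evil.io` fail the exact match, which is the case that loose prefix or substring checks tend to let through.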
Every POST, PATCH, PUT, and DELETE writes a row to an append-only audit_log table: actor ID, IP, route, request hash, response status, timestamp. Stream it to a separate system (BetterStack, Axiom, or just a separate Postgres database with limited write credentials) so an attacker who roots the app cannot rewrite the trail.
This is what makes writing a postmortem after an incident tractable instead of guesswork. Without audit logs, the timeline is "we noticed something weird Tuesday."
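Here is one possible shape for the audit row, using the fields listed above. Hashing the request instead of storing it keeps secrets and personal data out of the log while still letting you prove which payload was sent. The type names, the `app_writer` role, and the grant statements are assumptions for illustration.

```typescript
import { createHash } from "node:crypto";

interface AuditRow {
  readonly actorId: string;
  readonly ip: string;
  readonly route: string;
  readonly requestHash: string;
  readonly status: number;
  readonly at: string; // ISO 8601 timestamp
}

interface AuditInput {
  method: string;
  route: string;
  rawBody: string;
  actorId: string;
  ip: string;
  status: number;
  now?: Date; // injectable for tests
}

const STATE_CHANGING = new Set(["POST", "PATCH", "PUT", "DELETE"]);

// Returns null for reads, or a frozen row ready to insert for writes.
function buildAuditRow(input: AuditInput): AuditRow | null {
  const method = input.method.toUpperCase();
  if (!STATE_CHANGING.has(method)) return null;

  const requestHash = createHash("sha256")
    .update(`${method}\n${input.route}\n${input.rawBody}`)
    .digest("hex");

  return Object.freeze({
    actorId: input.actorId,
    ip: input.ip,
    route: `${method} ${input.route}`,
    requestHash,
    status: input.status,
    at: (input.now ?? new Date()).toISOString(),
  });
}

// Hypothetical grants: the application role may insert but never update or delete.
const AUDIT_LOG_GRANTS_SQL = `
REVOKE ALL ON audit_log FROM app_writer;
GRANT INSERT ON audit_log TO app_writer;
`;
```

The insert-only grant is what makes the table append-only from the application's point of view; shipping rows to a separate system covers the case where the database itself is compromised.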
Secrets go in a vault (Doppler, 1Password Secrets Automation, AWS Secrets Manager), never in .env files committed to git. Rotate signing keys quarterly with a versioned key ID in the JWT header so rotation is a config flip instead of a logout event.
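To show why a versioned key ID turns rotation into a config change, the sketch below signs and verifies HS256 tokens against a key ring indexed by `kid`. It is hand-rolled only to make each check visible; in production a maintained JWT library should do this work. The function names and key ring shape are assumptions, and `aud` is treated as a single string for brevity.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

type KeyRing = Record<string, string>; // kid -> HMAC secret

const b64url = (text: string): string => Buffer.from(text, "utf8").toString("base64url");

// Decode one token segment; anything that is not a JSON object yields null.
function decodeJsonObject(part: string): Record<string, unknown> | null {
  try {
    const value: unknown = JSON.parse(Buffer.from(part, "base64url").toString("utf8"));
    return typeof value === "object" && value !== null && !Array.isArray(value)
      ? (value as Record<string, unknown>)
      : null;
  } catch {
    return null;
  }
}

function signToken(claims: Record<string, unknown>, kid: string, keys: KeyRing): string {
  const secret = keys[kid];
  if (secret === undefined) throw new Error(`unknown kid: ${kid}`);
  const head = b64url(JSON.stringify({ alg: "HS256", typ: "JWT", kid }));
  const body = b64url(JSON.stringify(claims));
  const sig = createHmac("sha256", secret).update(`${head}.${body}`).digest("base64url");
  return `${head}.${body}.${sig}`;
}

function verifyToken(
  token: string,
  keys: KeyRing,
  expected: { iss: string; aud: string },
  nowSeconds: number = Math.floor(Date.now() / 1000)
): Record<string, unknown> | null {
  const [head, body, sig, extra] = token.split(".");
  if (!head || !body || !sig || extra !== undefined) return null;

  const header = decodeJsonObject(head);
  const claims = decodeJsonObject(body);
  if (!header || !claims) return null;

  // Pin the algorithm; never let the token choose it.
  if (header.alg !== "HS256" || typeof header.kid !== "string") return null;

  // Look the key up by kid; a retired kid simply stops verifying.
  const known = Object.prototype.hasOwnProperty.call(keys, header.kid);
  const secret = known ? keys[header.kid] : undefined;
  if (typeof secret !== "string") return null;

  const want = createHmac("sha256", secret).update(`${head}.${body}`).digest();
  const got = Buffer.from(sig, "base64url");
  if (got.length !== want.length || !timingSafeEqual(got, want)) return null;

  // A valid signature is not enough: check issuer, audience, and expiry too.
  if (claims.iss !== expected.iss || claims.aud !== expected.aud) return null;
  if (typeof claims.exp !== "number" || claims.exp <= nowSeconds) return null;
  return claims;
}
```

During a rotation the ring holds both the old and new `kid`: new tokens are signed with the new key while old ones keep verifying until they expire, after which the old entry is removed.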
Dependency scanning runs in CI on every push. Use npm audit --omit=dev (the older --production flag is deprecated), Snyk, or Socket.dev. Pin transitive dependencies with npm-shrinkwrap.json or pnpm's overrides block, because the 2024 polyfill.io supply-chain attack proved transitive trust is a fiction.
Here is the minimum acceptable shape for a write endpoint in a Next.js 15 app router project:
```typescript
// app/api/invoices/[id]/route.ts
import { NextRequest, NextResponse } from "next/server";
import { z } from "zod";
import { rateLimit } from "@/lib/rate-limit";
import { getSession } from "@/lib/auth";
import { db } from "@/lib/db";
import { auditLog } from "@/lib/audit";

// Input schema: only these fields may be written.
const InvoicePatch = z.object({
  amount: z.number().int().positive().max(1_000_000),
  notes: z.string().max(500).optional(),
}).strict(); // strict() rejects extra fields, killing mass assignment

// Output schema: only these fields may leave the endpoint.
const InvoiceOut = z.object({
  id: z.string().uuid(),
  amount: z.number(),
  notes: z.string().nullable(),
  // Assumes db returns ISO strings; use z.date() if your ORM returns Date objects.
  updatedAt: z.string(),
});

export async function PATCH(
  req: NextRequest,
  { params }: { params: Promise<{ id: string }> } // Next.js 15: params is a Promise
) {
  const { id } = await params;

  // 1. Authentication
  const session = await getSession(req);
  if (!session) return NextResponse.json({ error: "unauth" }, { status: 401 });

  // 2. Rate limit on the authenticated identity, not the IP
  const rl = await rateLimit(`patch:invoice:${session.userId}`, 10, 60);
  if (!rl.ok) {
    return NextResponse.json({ error: "rate_limited" }, {
      status: 429,
      headers: { "Retry-After": String(rl.retryAfter) },
    });
  }

  // 3. Input validation. Malformed JSON makes req.json() throw, so map it
  //    to null and let the schema reject it with a 400.
  const body = InvoicePatch.safeParse(await req.json().catch(() => null));
  if (!body.success) {
    return NextResponse.json({ error: body.error.flatten() }, { status: 400 });
  }

  // 4. ABAC: load the row inside the user's tenant scope
  const invoice = await db.invoice.findFirst({
    where: { id, tenantId: session.tenantId },
  });
  if (!invoice) return NextResponse.json({ error: "not_found" }, { status: 404 });

  const updated = await db.invoice.update({
    where: { id: invoice.id },
    data: body.data,
  });

  // 5. Audit log for the state change
  await auditLog({
    actorId: session.userId,
    route: "PATCH /api/invoices/:id",
    resource: invoice.id,
    status: 200,
  });

  // 6. Output validation: strips any field not declared in InvoiceOut
  return NextResponse.json(InvoiceOut.parse(updated));
}
```
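That handler covers six of the nine layers (authentication, rate limiting, input validation, authorization, the audit log, and output validation) in roughly 50 lines of logic. The remaining three (transport security, CORS, and secrets and dependency hygiene) live in middleware.ts, your hosting config, and CI.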
These are the bugs that show up in nearly every API pentest report we see. Each maps to one or more of the layers above. For the full taxonomy and one Node fix per category, see our OWASP Top 10 implementation guide.
| Vulnerability | Where it hides | The control that catches it |
|---|---|---|
| Broken object-level auth (BOLA) | Handlers that trust the URL | RLS or ABAC inside the handler |
| Broken function-level auth | Admin routes behind security-by-obscurity | RBAC enforced in middleware |
| Mass assignment | ORM update(req.body) | z.object().strict() whitelist |
| SSRF | Any feature that fetches a user-supplied URL | Allowlist of egress hosts, block private IP ranges |
| Excessive data exposure | List endpoints that return full models | Output schema parsing |
| Injection (SQL, NoSQL, OS command) | String-concatenated queries | Parameterized queries, ORM with prepared statements |
| Improper rate limiting | Login, signup, password reset | Per-identity Upstash limiter |
| Misconfigured CORS | Wildcard or reflected origin | Hardcoded allowlist |
| Vulnerable dependencies | Transitive packages | CI dependency scan, lockfile audit |
| Token leakage | Logs, error pages, query strings | Header-only tokens, log scrubbing |
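
The SSRF row is the one control in the table not shown elsewhere in this guide, so here is a rough sketch of it. It is IPv4-only, the blocked ranges are the common private, loopback, link-local, and carrier-grade NAT blocks, and the function names are ours. It assumes the caller has already resolved the hostname and will connect to that exact address, since re-resolving at connect time reopens the hole through DNS rebinding.

```typescript
// Private, loopback, link-local (includes 169.254.169.254 cloud metadata), and CGNAT ranges.
const BLOCKED_CIDRS: Array<[string, number]> = [
  ["0.0.0.0", 8],
  ["10.0.0.0", 8],
  ["100.64.0.0", 10],
  ["127.0.0.0", 8],
  ["169.254.0.0", 16],
  ["172.16.0.0", 12],
  ["192.168.0.0", 16],
];

// Convert dotted-quad IPv4 to an integer, or null if it is not valid IPv4.
function ipv4ToInt(ip: string): number | null {
  const match = /^(\d{1,3})\.(\d{1,3})\.(\d{1,3})\.(\d{1,3})$/.exec(ip);
  if (!match) return null;
  const octets = match.slice(1).map(Number);
  if (octets.some((octet) => octet > 255)) return null;
  return octets.reduce((acc, octet) => acc * 256 + octet, 0);
}

function isBlockedIPv4(ip: string): boolean {
  const value = ipv4ToInt(ip);
  if (value === null) return true; // fail closed on anything we cannot parse
  return BLOCKED_CIDRS.some(([base, bits]) => {
    const baseInt = ipv4ToInt(base);
    if (baseInt === null) return false;
    const blockSize = 2 ** (32 - bits);
    // Two addresses share a CIDR block when they fall in the same bucket.
    return Math.floor(value / blockSize) === Math.floor(baseInt / blockSize);
  });
}

// resolvedIp must be the address the HTTP client will actually connect to.
function isEgressAllowed(rawUrl: string, allowedHosts: Set<string>, resolvedIp: string): boolean {
  let url: URL;
  try {
    url = new URL(rawUrl);
  } catch {
    return false;
  }
  if (url.protocol !== "https:") return false;
  if (!allowedHosts.has(url.hostname)) return false;
  return !isBlockedIPv4(resolvedIp);
}
```

The host allowlist does most of the work; the IP check is the backstop for the day an allowlisted hostname is pointed at an internal address.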
When teams ask which combination of tools to standardize on, this is the matrix we'd hand them. None of these are wrong; the trade-offs are about team size and operational appetite.
| Stack | Auth | Authz | Rate limit | WAF | Best fit |
|---|---|---|---|---|---|
| Cloudflare + Auth.js + Zod | Auth.js sessions | App-layer + Postgres RLS | Cloudflare Rate Limiting | Cloudflare WAF | Solo founders, 0-50k MAU |
| Vercel + Clerk + Upstash | Clerk B2B | Clerk Organizations + RLS | Upstash Redis | Vercel Firewall | Seed-stage SaaS, 50k-500k MAU |
| AWS API Gateway + Cognito + WAF | Cognito JWT | IAM + Lambda authorizers | API Gateway throttling | AWS WAF | Series A+, multi-region, compliance-driven |
| Kong + Keycloak + OPA | Keycloak OIDC | Open Policy Agent | Kong rate-limit plugin | ModSecurity | Enterprise on-prem, regulated industries |
Pick the topmost stack in the table (the first row) that covers your audience. Upgrading later is mostly a routing config change; downgrading after you have wired Cognito groups into a hundred places is not.
- Validate iss, aud, and exp every time. Half the JWT bypasses in the wild come from servers that only check the signature.
- A raw database error such as "column email of relation users does not exist" tells an attacker your schema. Return opaque error codes, log the detail server-side.

If you are pre-revenue with a closed beta of 20 friends, you can defer the audit log and per-tier rate limits. You cannot defer TLS, auth, authorization, and input validation; those four are the floor. The day a real customer signs up is the day you turn on the rest.
If your API is read-only public data (a price feed, a status page), you can drop authentication entirely and lean on aggressive caching plus Cloudflare's bot management.
Wiring nine layers into an existing API is a 1-3 week project for someone who has done it before, and a 6-week project for someone who hasn't. Every engineer on Cadence is AI-native by default, vetted on Cursor and Claude Code fluency in a voice interview before they unlock bookings, which is why the median time to first commit on our platform is 27 hours. Most API hardening rollouts land in the Senior tier at $1,500 per week.
If you want a working baseline before booking anyone, audit your stack with Ship or Skip for an honest grade on your current API security posture. Then book a senior engineer for two weeks; the 48-hour free trial covers the audit, and you can cancel before week one ends if the diagnosis is wrong.
The same patterns appear in adjacent work like rolling out feature flags safely and designing a SaaS for HIPAA from day 1; if you are doing one, you usually need to do the others.
Booking a security-minded senior on Cadence takes about 2 minutes. You see four matched candidates, pick one, and you are paired by end of day. Weekly billing, no notice period, and the first 48 hours are free if the fit is wrong.
For a 30-endpoint API, plan on 2-3 weeks of senior engineering time to retrofit all nine layers, plus another week for audit log infrastructure and CI integration. Greenfield is faster: 3-5 days if you scaffold from a template that already includes the patterns.
Yes. A WAF catches traffic before it hits your runtime (cheaper, faster) and blocks known-bad signatures (Log4Shell, automated scanners) your app code shouldn't have to think about. Cloudflare's free plan covers most early-stage needs.
OAuth 2.0 is for authorization; OpenID Connect is the authentication layer on top. If you need to know who the user is (almost always), use OIDC. Auth.js, Clerk, WorkOS, and Keycloak all implement OIDC by default.
Key on IP plus a hash of User-Agent, set a strict ceiling (5 attempts per IP per 15 minutes for password reset), and require an hCaptcha challenge after the third failure.
Both. Internal scanning catches regressions; external pentests (annually for Series A and above) catch architectural issues your team is blind to. Budget $15-30k for a focused external test from a firm like Cure53, NCC Group, or Doyensec.
Senior technical support engineer at withRemote. Writes on incident response, runbook craft, and customer-empathy in engineering.