How to secure a SaaS API endpoint

To secure a SaaS API endpoint, stack defenses in layers: TLS in transit, authenticated sessions or JWTs at the edge, per-object authorization in the handler, schema-validated input with Zod, rate limits per identity, strict CORS, and an immutable audit log of every state change. No single control protects you. The first six lines of your Next.js route handler decide whether the next breach is yours.

Why API security looks different in 2026

Customers ship raw HTTP clients now. Half your API traffic is no longer your own React frontend; it is a Zapier flow, a customer's Python script, an internal AI agent calling tools through MCP. Browser-only assumptions (same-origin cookies, CSRF tokens, a single login form) cover maybe 40 percent of real calls.

A leaked API token used by an agent runs at 50 requests per second instead of one click per minute. A BOLA bug that hid behind UI checks is now hit by every customer integration test. LLM-generated client code is excellent at finding mass-assignment bugs because it sends every field it can guess. Security has to live in the API, not the app shell.

The default approach and why it breaks

Most teams reach for a middleware stack: a CORS plugin, a JWT verifier, maybe express-rate-limit, and "we'll validate the body later." It works fine when there are three endpoints. It collapses around endpoint 40, when the team realizes the authorization check is copy-pasted into every handler, half of them check the wrong field, and nobody knows which routes are actually exposed to unauthenticated traffic.

The failure mode is consistent. The OWASP API Security Top 10 puts Broken Object-Level Authorization (BOLA) at number one in both of its editions (2019 and 2023), because the bug is structural: route-level auth says "this user can hit /api/invoices/:id," but the handler forgets to check that the invoice belongs to that user. Both Optus (2022) and the more recent Authy enumeration leak came down to this.

The layered approach: nine controls, in order

Treat the list below as a checklist for every endpoint, not a one-time setup. Each layer assumes the one above it failed.

1. TLS everywhere, no exceptions

Terminate TLS 1.3 at your edge (Vercel, Cloudflare, Fly). Set HSTS with a one-year max-age and preload. Reject plaintext HTTP at the load balancer instead of redirecting; redirects leak the first request's URL and any query-string tokens.
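As a rough illustration of the HSTS and reject-don't-redirect advice, here is a minimal sketch of the decision an edge function or middleware could make. The function name checkTransport and the reliance on the x-forwarded-proto header are assumptions; most platforms expose this as configuration instead of code.

```typescript
// Sketch only: checkTransport and the header it reads are assumptions, not a platform API.
const HSTS_VALUE = "max-age=31536000; includeSubDomains; preload"; // one year, in seconds

type TlsDecision =
  | { allow: false; status: number; body: string }
  | { allow: true; headers: Record<string, string> };

function checkTransport(forwardedProto: string | null): TlsDecision {
  // Proxies may append to this header; the first entry describes the client-facing hop.
  const first = (forwardedProto ?? "").split(",")[0] ?? "";
  if (first.trim().toLowerCase() !== "https") {
    // Refuse instead of redirecting, so a plaintext client fails loudly and gets fixed.
    return { allow: false, status: 403, body: "HTTPS required" };
  }
  return { allow: true, headers: { "Strict-Transport-Security": HSTS_VALUE } };
}
```

This only makes sense behind a proxy you control, since a client talking to the app directly can set x-forwarded-proto to anything.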

Internal service-to-service traffic gets mTLS through a service mesh such as Linkerd, or at minimum an encrypted overlay network such as Tailscale. The "internal network is trusted" assumption was killed by the 2024 Snowflake credential incident; act accordingly.

2. Authentication: pick one model per audience

Browsers get HttpOnly, Secure, SameSite=Lax session cookies signed with a rotating key. Server-to-server clients get scoped API keys (prefix them, like Stripe's sk_live_, so leaks are greppable on GitHub). Mobile and third-party integrators get short-lived JWTs (15 minutes) with refresh tokens stored in a revocation table.

Never mix the models on the same route. A handler that accepts both a cookie and a bearer token is a CSRF vector waiting to happen. Use libraries that have done this work: Auth.js (formerly NextAuth) for sessions, Clerk or WorkOS for B2B SSO, Hanko for passkeys.
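For the server-to-server model, a prefixed key can be issued and checked roughly as follows. This is a sketch under assumptions: the acme_live_ prefix and both function names are hypothetical, and storing only a SHA-256 hash of the key is a common practice added here, not something the text above prescribes.

```typescript
import { createHash, randomBytes, timingSafeEqual } from "node:crypto";

// Hypothetical prefix; pick one unique to your product so secret scanners can match it.
const KEY_PREFIX = "acme_live_";

function sha256Hex(value: string): string {
  return createHash("sha256").update(value).digest("hex");
}

// Returns the plaintext key (shown to the customer once) and the hash to persist.
function issueApiKey(): { key: string; storedHash: string } {
  const key = KEY_PREFIX + randomBytes(24).toString("base64url");
  return { key, storedHash: sha256Hex(key) };
}

// Compares hashes in constant time so the check does not leak how much matched.
function verifyApiKey(presented: string, storedHash: string): boolean {
  if (!presented.startsWith(KEY_PREFIX)) return false;
  const a = Buffer.from(sha256Hex(presented), "hex");
  const b = Buffer.from(storedHash, "hex");
  return a.length === b.length && timingSafeEqual(a, b);
}
```

Because only the hash is stored, a database leak does not hand out working keys, and the prefix still lets you grep for plaintext keys in logs and repositories.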

3. Authorization: RBAC at the edge, ABAC or RLS at the row

Role-based access control answers "can this user call this route." That belongs in middleware. Attribute-based access control answers "can this user touch this row," and it must run inside the handler with the actual record loaded.

Postgres Row-Level Security is the cleanest pattern we've seen. Supabase made it mainstream; you can run the same CREATE POLICY statements on a vanilla Postgres database. RLS moves the check next to the data, which means a forgotten WHERE tenant_id = $1 cannot leak rows even if the handler is buggy. Pair it with a per-request session variable (SET LOCAL app.user_id = '...') and your worst-case BOLA is an empty result and a 404 instead of a breach. The same logic underpins our multi-tenant Postgres schema guide, which goes deeper on tenant isolation patterns.
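The per-request session variable can be wired up along these lines. Treat it as a sketch: the SqlClient interface, the withTenant helper, the invoices table, and the app.tenant_id setting (which plays the role of app.user_id above) are all assumptions. It uses Postgres's set_config function with is_local set to true, which behaves like SET LOCAL but accepts a bound parameter.

```typescript
// Assumed minimal client shape; a node-postgres PoolClient exposes a compatible query().
interface SqlClient {
  query(sql: string, params?: unknown[]): Promise<unknown>;
}

/* Hypothetical policy, applied once in a migration:
     ALTER TABLE invoices ENABLE ROW LEVEL SECURITY;
     CREATE POLICY tenant_isolation ON invoices
       USING (tenant_id = current_setting('app.tenant_id')::uuid);
*/

// Runs `work` inside a transaction in which the policy can read app.tenant_id.
async function withTenant<T>(
  client: SqlClient,
  tenantId: string,
  work: (c: SqlClient) => Promise<T>,
): Promise<T> {
  await client.query("BEGIN");
  try {
    // The third argument (true) scopes the setting to this transaction only.
    await client.query("SELECT set_config('app.tenant_id', $1, true)", [tenantId]);
    const result = await work(client);
    await client.query("COMMIT");
    return result;
  } catch (err) {
    await client.query("ROLLBACK");
    throw err;
  }
}
```

One caveat worth checking in your own setup: superusers and the table owner bypass RLS unless the table also has FORCE ROW LEVEL SECURITY, so the app should connect as a separate, unprivileged role.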

4. Input validation with Zod (or Valibot)

Every request body, query param, and path param gets parsed against a Zod schema before the handler runs. Reject on parse failure with a 400 and a structured error. Three things this prevents:

Mass assignment, where a client sends { "role": "admin" } and your ORM happily writes it. Whitelist fields explicitly.
Type confusion, where an id field arrives as ["1", "2"] and bypasses a string equality check.
Prototype pollution, where a JSON body sets __proto__ and shifts behavior elsewhere in the app.

Zod also gives you free OpenAPI docs through zod-to-openapi, which means your client SDKs and your validation are the same source of truth.
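To make the three failure classes concrete, here is a dependency-free sketch of the checks a strict schema performs on your behalf. In a real handler Zod does this work; the parseInvoicePatch function and its field list are hypothetical and exist only for illustration.

```typescript
type InvoicePatchInput = { amount: number; notes?: string };
type ParseResult =
  | { ok: true; data: InvoicePatchInput }
  | { ok: false; error: string };

const ALLOWED_KEYS = new Set(["amount", "notes"]);

function parseInvoicePatch(raw: unknown): ParseResult {
  if (typeof raw !== "object" || raw === null || Array.isArray(raw)) {
    return { ok: false, error: "body must be a JSON object" };
  }
  const input = raw as Record<string, unknown>;
  // Mass assignment and prototype pollution: reject every key outside the allowlist.
  // After JSON.parse, "__proto__" is an ordinary own key, so it is caught here too.
  for (const key of Object.keys(input)) {
    if (!ALLOWED_KEYS.has(key)) return { ok: false, error: `unexpected field: ${key}` };
  }
  // Type confusion: require the exact primitive type and never coerce.
  const { amount, notes } = input;
  if (typeof amount !== "number" || !Number.isInteger(amount) || amount <= 0) {
    return { ok: false, error: "amount must be a positive integer" };
  }
  if (notes !== undefined && typeof notes !== "string") {
    return { ok: false, error: "notes must be a string" };
  }
  // Build a fresh object so nothing from the raw input is passed on by reference.
  return { ok: true, data: notes === undefined ? { amount } : { amount, notes } };
}
```

Writing this by hand for every endpoint is exactly the copy-paste problem a schema library removes, which is the argument for declaring it once in Zod.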

5. Output sanitization and shape control

Validation runs both ways. Define a separate Zod schema for what leaves the endpoint and parse the response object through it before serializing. This catches the moment a developer adds password_hash to the user model and forgets to filter it out of GET /api/users/me. The Trello "private board" leak in 2023 was exactly this pattern.

For HTML fields rendered by other clients, run DOMPurify on output, not just input. Stored XSS is the gift that keeps giving.
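The shape-control idea reduces to copying named fields instead of serializing the model. A small sketch, with hypothetical UserRow and PublicUser types standing in for your own:

```typescript
// Hypothetical row shape; password_hash stands in for any column that must stay server-side.
type UserRow = { id: string; email: string; password_hash: string; created_at: Date };
type PublicUser = { id: string; email: string; createdAt: string };

// Copies fields one by one, so a column added to the table later stays private by default.
function toPublicUser(row: UserRow): PublicUser {
  return { id: row.id, email: row.email, createdAt: row.created_at.toISOString() };
}
```

An output schema parsed at the edge of the handler does the same job with less ceremony; the point is that exposure becomes opt-in per field.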

6. Rate limiting per identity, not per IP

IP-based rate limiting is theater. Most automated abuse rotates through residential proxy pools at $2 per GB. Rate-limit on the authenticated identity (user ID, API key ID, tenant ID) with Upstash Redis or Cloudflare Rate Limiting, and apply a separate stricter limit on unauthenticated routes keyed on IP plus user agent hash.

Set three tiers. A burst limit (10 requests per second per user) catches scripts. A sustained limit (500 per minute) catches scrapers. A daily quota (50,000 calls on the free plan) protects your unit economics. Return 429 with a Retry-After header, not 403: clients treat a 403 as a permissions failure, not as a signal to back off and retry later.
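The three tiers can be sketched with fixed-window counters. This version keeps counts in process memory purely for illustration; in production the counters would live in Redis or your edge provider, and the TieredLimiter class is an assumption, not an Upstash or Cloudflare API.

```typescript
type Tier = { name: string; limit: number; windowMs: number };

// The three tiers from the text: burst, sustained, daily quota.
const TIERS: Tier[] = [
  { name: "burst", limit: 10, windowMs: 1_000 },
  { name: "sustained", limit: 500, windowMs: 60_000 },
  { name: "daily", limit: 50_000, windowMs: 86_400_000 },
];

type Decision = { ok: true } | { ok: false; tier: string; retryAfterSec: number };

class TieredLimiter {
  // Old windows are never evicted here; a Redis TTL would take care of that.
  private readonly counts = new Map<string, number>();
  private readonly tiers: Tier[];

  constructor(tiers: Tier[] = TIERS) {
    this.tiers = tiers;
  }

  private windowStart(t: Tier, nowMs: number): number {
    return Math.floor(nowMs / t.windowMs) * t.windowMs;
  }

  check(identity: string, nowMs: number): Decision {
    // First pass: refuse if any tier's current window is already full.
    for (const t of this.tiers) {
      const start = this.windowStart(t, nowMs);
      const used = this.counts.get(`${t.name}:${identity}:${start}`) ?? 0;
      if (used >= t.limit) {
        const retryAfterSec = Math.ceil((start + t.windowMs - nowMs) / 1000);
        return { ok: false, tier: t.name, retryAfterSec };
      }
    }
    // Second pass: the request is allowed, so count it against every tier.
    for (const t of this.tiers) {
      const key = `${t.name}:${identity}:${this.windowStart(t, nowMs)}`;
      this.counts.set(key, (this.counts.get(key) ?? 0) + 1);
    }
    return { ok: true };
  }
}
```

A handler would call check(session.userId, Date.now()) and copy retryAfterSec into the Retry-After header on refusal. Fixed windows allow a short double burst at window boundaries; a sliding window or token bucket smooths that out at the cost of more state.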

7. CORS: strict allowlist, no wildcards

Access-Control-Allow-Origin: * is acceptable on exactly one kind of endpoint: public, unauthenticated, read-only data (status pages, public price lists). Everywhere else, allowlist the exact origins. Never reflect the Origin header without validation, which is a common Next.js middleware bug.

Access-Control-Allow-Credentials: true combined with a reflected origin is the classic full account takeover. If you need credentials, hardcode the origin list.
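A hardcoded allowlist is only a few lines. In this sketch the origins and the corsHeaders helper are hypothetical; the important property is an exact string match against a fixed set, with no suffix or regex matching.

```typescript
// Hypothetical origins; replace with the exact schemes and hosts of your own frontends.
const ALLOWED_ORIGINS = new Set(["https://app.example.com", "https://admin.example.com"]);

// Returns CORS headers for an allowlisted origin, or null when none should be sent.
function corsHeaders(origin: string | null): Record<string, string> | null {
  if (origin === null || !ALLOWED_ORIGINS.has(origin)) return null;
  return {
    // Echoed only after an exact allowlist match, never reflected from arbitrary input.
    "Access-Control-Allow-Origin": origin,
    "Access-Control-Allow-Credentials": "true",
    // Tells caches that the response varies by requesting origin.
    "Vary": "Origin",
  };
}
```

When the function returns null, send no CORS headers at all and let the browser block the cross-origin read.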

8. Audit logging for every state change

Every POST, PATCH, PUT, and DELETE writes a row to an append-only audit_log table: actor ID, IP, route, request hash, response status, timestamp. Stream it to a separate system (BetterStack, Axiom, or just a separate Postgres database with limited write credentials) so an attacker who roots the app cannot rewrite the trail.

This is what makes writing a postmortem after an incident tractable instead of guesswork. Without audit logs, the timeline is "we noticed something weird Tuesday."
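As a sketch of what one audit row might contain, the helper below assembles the fields listed above and hashes the request body instead of storing it. The AuditRow type and buildAuditRow function are assumptions; how the row is written and shipped is left to your storage layer.

```typescript
import { createHash } from "node:crypto";

type AuditRow = {
  actorId: string;
  ip: string;
  route: string;
  requestHash: string;
  status: number;
  at: string;
};

// Stores a SHA-256 of the raw body so the log can be matched to a request without keeping it.
function buildAuditRow(input: {
  actorId: string;
  ip: string;
  method: string;
  path: string;
  rawBody: string;
  status: number;
  now: Date;
}): AuditRow {
  return {
    actorId: input.actorId,
    ip: input.ip,
    route: `${input.method} ${input.path}`,
    requestHash: createHash("sha256").update(input.rawBody).digest("hex"),
    status: input.status,
    at: input.now.toISOString(),
  };
}
```

On the database side, append-only can be approximated by granting the application role INSERT on the table and nothing else, which is an assumption about your role setup, not a default.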

9. Secrets discipline and dependency scanning

Secrets go in a vault (Doppler, 1Password Secrets Automation, AWS Secrets Manager), never in .env files committed to git. Rotate signing keys quarterly with a versioned key ID in the JWT header so rotation is a config flip instead of a logout event.
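The versioned key ID works roughly like this: new tokens are signed with the current key, and verification looks up whichever key the token's kid header names. The sketch below hand-rolls HS256 with node:crypto only to show the lookup; the KeyRing type and both functions are hypothetical, and a maintained JWT library such as jose is the safer choice in production.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

// Hypothetical key ring: `current` signs new tokens, every entry in `keys` may still verify.
type KeyRing = { current: string; keys: Map<string, string> };

const encode = (value: object): string =>
  Buffer.from(JSON.stringify(value), "utf8").toString("base64url");
const decode = (part: string): any =>
  JSON.parse(Buffer.from(part, "base64url").toString("utf8"));
const hmac = (secret: string, data: string): string =>
  createHmac("sha256", secret).update(data).digest("base64url");

function signToken(payload: object, ring: KeyRing): string {
  const secret = ring.keys.get(ring.current);
  if (secret === undefined) throw new Error("current key is missing from the ring");
  const head = encode({ alg: "HS256", typ: "JWT", kid: ring.current });
  const body = encode(payload);
  return `${head}.${body}.${hmac(secret, `${head}.${body}`)}`;
}

// Checks the signature only; claim checks (iss, aud, exp) still have to follow.
function verifyToken(token: string, ring: KeyRing): Record<string, unknown> | null {
  const parts = token.split(".");
  if (parts.length !== 3) return null;
  const [head, body, sig] = parts as [string, string, string];
  try {
    const { alg, kid } = decode(head);
    const secret = typeof kid === "string" ? ring.keys.get(kid) : undefined;
    if (alg !== "HS256" || secret === undefined) return null; // unknown or retired key
    const expected = Buffer.from(hmac(secret, `${head}.${body}`));
    const given = Buffer.from(sig);
    if (given.length !== expected.length || !timingSafeEqual(given, expected)) return null;
    return decode(body);
  } catch {
    return null; // malformed header or payload
  }
}
```

Rotation then has three steps: add the new key to the ring, flip current to it, and drop the old key once every token it signed has expired. Nobody is logged out at any step.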

Dependency scanning runs in CI on every push. Use npm audit --omit=dev (the older --production flag is deprecated), Snyk, or Socket.dev. Pin transitive dependencies with a committed npm-shrinkwrap.json or pnpm's overrides block, because the 2024 polyfill.io supply-chain attack proved transitive trust is a fiction.

A Next.js API route that does all of this

Here is the minimum acceptable shape for a write endpoint in a Next.js 15 app router project:

// app/api/invoices/[id]/route.ts
import { NextRequest, NextResponse } from "next/server";
import { z } from "zod";
import { rateLimit } from "@/lib/rate-limit";
import { getSession } from "@/lib/auth";
import { db } from "@/lib/db";
import { auditLog } from "@/lib/audit";

const InvoicePatch = z.object({
  amount: z.number().int().positive().max(1_000_000),
  notes: z.string().max(500).optional(),
}).strict(); // strict() rejects extra fields, killing mass assignment

const InvoiceOut = z.object({
  id: z.string().uuid(),
  amount: z.number(),
  notes: z.string().nullable(),
  updatedAt: z.coerce.date(), // accepts the Date most ORMs return; serialized as an ISO string
});

export async function PATCH(
  req: NextRequest,
  // In Next.js 15, route params arrive as a Promise and must be awaited
  { params }: { params: Promise<{ id: string }> }
) {
  const { id } = await params;

  const session = await getSession(req);
  if (!session) return NextResponse.json({ error: "unauth" }, { status: 401 });

  const rl = await rateLimit(`patch:invoice:${session.userId}`, 10, 60);
  if (!rl.ok) {
    return NextResponse.json({ error: "rate_limited" }, {
      status: 429,
      headers: { "Retry-After": String(rl.retryAfter) },
    });
  }

  // Malformed JSON makes req.json() throw; map it to null so it fails validation with a 400
  const body = InvoicePatch.safeParse(await req.json().catch(() => null));
  if (!body.success) {
    return NextResponse.json({ error: body.error.flatten() }, { status: 400 });
  }

  // ABAC: load the row inside the user's tenant scope
  const invoice = await db.invoice.findFirst({
    where: { id, tenantId: session.tenantId },
  });
  if (!invoice) return NextResponse.json({ error: "not_found" }, { status: 404 });

  const updated = await db.invoice.update({
    where: { id: invoice.id },
    data: body.data,
  });

  await auditLog({
    actorId: session.userId,
    route: "PATCH /api/invoices/:id",
    resource: invoice.id,
    status: 200,
  });

  return NextResponse.json(InvoiceOut.parse(updated));
}

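That handler covers six of the nine layers in about 60 lines: authentication, authorization, input validation, output shaping, rate limiting, and audit logging. The other three (TLS with HSTS, CORS, and secrets and dependency hygiene) live in middleware.ts, your hosting config, and CI.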

Common API vulnerabilities to defend against

These are the bugs that show up in nearly every API pentest report we see. Each maps to one or more of the layers above. For the full taxonomy and one Node fix per category, see our OWASP Top 10 implementation guide.

Vulnerability	Where it hides	The control that catches it
Broken object-level auth (BOLA)	Handlers that trust the URL	RLS or ABAC inside the handler
Broken function-level auth	Admin routes behind security-by-obscurity	RBAC enforced in middleware
Mass assignment	ORM `update(req.body)`	`z.object().strict()` whitelist
SSRF	Any feature that fetches a user-supplied URL	Allowlist of egress hosts, block private IP ranges
Excessive data exposure	List endpoints that return full models	Output schema parsing
Injection (SQL, NoSQL, OS command)	String-concatenated queries	Parameterized queries, ORM with prepared statements
Improper rate limiting	Login, signup, password reset	Per-identity limiter, plus IP and User-Agent keyed limits on unauthenticated routes
Misconfigured CORS	Wildcard or reflected origin	Hardcoded allowlist
Vulnerable dependencies	Transitive packages	CI dependency scan, lockfile audit
Token leakage	Logs, error pages, query strings	Header-only tokens, log scrubbing
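The SSRF row is the one control in the table that the handler earlier does not touch, so here is a minimal sketch of the "block private IP ranges" half. The isBlockedIPv4 function is hypothetical and covers IPv4 only; it has to run on the address the hostname actually resolves to, not on the hostname string.

```typescript
// Returns true for IPv4 addresses a user-supplied URL should never be allowed to reach.
function isBlockedIPv4(ip: string): boolean {
  const parts = ip.split(".");
  // Anything that is not four plain decimal octets is refused (fail closed).
  if (parts.length !== 4 || !parts.every((p) => /^\d{1,3}$/.test(p))) return true;
  const nums = parts.map(Number);
  if (nums.some((n) => n > 255)) return true;
  const a = nums[0] ?? 0;
  const b = nums[1] ?? 0;
  return (
    a === 0 ||                            // "this network"
    a === 10 ||                           // private, RFC 1918
    a === 127 ||                          // loopback
    (a === 100 && b >= 64 && b <= 127) || // carrier-grade NAT, RFC 6598
    (a === 169 && b === 254) ||           // link-local, includes cloud metadata endpoints
    (a === 172 && b >= 16 && b <= 31) ||  // private, RFC 1918
    (a === 192 && b === 168) ||           // private, RFC 1918
    a >= 224                              // multicast and reserved
  );
}
```

A complete defense also needs an IPv6 equivalent, a re-check after every redirect, and a connection made to the exact address that was checked, so a second DNS lookup cannot swap in an internal one.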

A comparison of API security stacks

When teams ask which combination of tools to standardize on, this is the matrix we'd hand them. None of these are wrong; the trade-offs are about team size and operational appetite.

Stack	Auth	Authz	Rate limit	WAF	Best fit
Cloudflare + Auth.js + Zod	Auth.js sessions	App-layer + Postgres RLS	Cloudflare Rate Limiting	Cloudflare WAF	Solo founders, 0-50k MAU
Vercel + Clerk + Upstash	Clerk B2B	Clerk Organizations + RLS	Upstash Redis	Vercel Firewall	Seed-stage SaaS, 50k-500k MAU
AWS API Gateway + Cognito + WAF	Cognito JWT	IAM + Lambda authorizers	API Gateway throttling	AWS WAF	Series A+, multi-region, compliance-driven
Kong + Keycloak + OPA	Keycloak OIDC	Open Policy Agent	Kong rate-limit plugin	ModSecurity	Enterprise on-prem, regulated industries

Pick the first stack in the table that covers your audience. Upgrading later is mostly a routing config change; downgrading after you have wired Cognito groups into a hundred places is not.

Common pitfalls

Trusting JWT signature without validating claims. Verify iss, aud, and exp every time. Half the JWT bypasses in the wild come from servers that only check the signature.
Logging the full request body. PII, passwords, and tokens end up in Datadog forever. Scrub at the logger level with an allowlist of safe fields.
Returning detailed error messages in production. "Column email of relation users does not exist" tells an attacker your schema. Return opaque error codes, log the detail server-side.
Treating rate limiting as DDoS protection. It is abuse prevention; DDoS lives at the edge (Cloudflare, Fastly) and needs different math.
One-off security audits. A pentest in March does not cover the endpoint you shipped in April. The controls above need to be defaults in your code generator or scaffolding template.
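The first pitfall above, checking the signature but not the claims, is cheap to close. A sketch of the claim check that should follow signature verification, with the Expected type and checkClaims function as hypothetical names:

```typescript
type Claims = { iss?: unknown; aud?: unknown; exp?: unknown };
type Expected = { issuer: string; audience: string; clockSkewSec?: number };

// Call only after the signature has been verified. Returns the first failure, or null if valid.
function checkClaims(claims: Claims, expected: Expected, nowSec: number): string | null {
  if (claims.iss !== expected.issuer) return "wrong issuer";
  // aud may be a single string or an array of strings (RFC 7519, section 4.1.3).
  const audiences = Array.isArray(claims.aud) ? claims.aud : [claims.aud];
  if (!audiences.includes(expected.audience)) return "wrong audience";
  if (typeof claims.exp !== "number") return "missing exp";
  if (nowSec >= claims.exp + (expected.clockSkewSec ?? 0)) return "expired";
  return null;
}
```

Most JWT libraries accept the expected issuer and audience as verification options; the failure mode in the wild is leaving those options unset.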

When you can skip parts of this

If you are pre-revenue with a closed beta of 20 friends, you can defer the audit log and per-tier rate limits. You cannot defer TLS, auth, authorization, and input validation; those four are the floor. The day a real customer signs up is the day you turn on the rest.

If your API is read-only public data (a price feed, a status page), you can drop authentication entirely and lean on aggressive caching plus Cloudflare's bot management.

The Cadence connection

Wiring nine layers into an existing API is a 1-3 week project for someone who has done it before, and a 6-week project for someone who hasn't. Every engineer on Cadence is AI-native by default, vetted on Cursor and Claude Code fluency in a voice interview before they unlock bookings, which is why the median time to first commit on our platform is 27 hours. Most API hardening rollouts land in the Senior tier at $1,500 per week.

If you want a working baseline before booking anyone, audit your stack with Ship or Skip for an honest grade on your current API security posture. Then book a senior engineer for two weeks; the 48-hour free trial covers the audit, and you can cancel before week one ends if the diagnosis is wrong.

The same patterns appear in adjacent work like rolling out feature flags safely and designing a SaaS for HIPAA from day 1; if you are doing one, you usually need to do the others.

Booking a security-minded senior on Cadence takes about 2 minutes. You see four matched candidates, pick one, and you are paired by end of day. Weekly billing, no notice period, and the first 48 hours are free if the fit is wrong.

FAQ

How long does it take to harden an existing SaaS API?

For a 30-endpoint API, plan on 2-3 weeks of senior engineering time to retrofit all nine layers, plus another week for audit log infrastructure and CI integration. Greenfield is faster: 3-5 days if you scaffold from a template that already includes the patterns.

Do I need a WAF if I already have Zod validation and rate limiting?

Yes. A WAF catches traffic before it hits your runtime (cheaper, faster) and blocks known-bad signatures (Log4Shell, automated scanners) your app code shouldn't have to think about. Cloudflare's free plan covers most early-stage needs.

Is OAuth 2.0 enough, or do I also need OpenID Connect?

OAuth 2.0 is for authorization; OpenID Connect is the authentication layer on top. If you need to know who the user is (almost always), use OIDC. Auth.js, Clerk, WorkOS, and Keycloak all implement OIDC by default.

How do I rate-limit unauthenticated endpoints like signup and password reset?

Key on IP plus a hash of User-Agent, set a strict ceiling (5 attempts per IP per 15 minutes for password reset), and require an hCaptcha challenge after the third failure.
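In code, that answer might look like the sketch below. Both function names are hypothetical, the counters themselves are assumed to live in your rate limiter, and the hCaptcha verification call is left out.

```typescript
import { createHash } from "node:crypto";

// Builds the limiter key from the IP plus a short hash of the User-Agent header.
function unauthKey(route: string, ip: string, userAgent: string | null): string {
  const uaHash = createHash("sha256").update(userAgent ?? "").digest("hex").slice(0, 16);
  return `${route}:${ip}:${uaHash}`;
}

type ResetDecision = "allow" | "captcha" | "block";

// attempts: requests already counted in the current 15-minute window.
// failures: how many of those attempts failed.
function decideReset(attempts: number, failures: number): ResetDecision {
  if (attempts >= 5) return "block";   // ceiling of 5 attempts per window
  if (failures >= 3) return "captcha"; // challenge after the third failure
  return "allow";
}
```

The User-Agent is attacker-controlled, so the hash only separates honest clients that share an IP; it does not stop anyone who randomizes the header, which is why the captcha step matters.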

Should we run penetration tests internally or hire an external firm?

Both. Internal scanning catches regressions; external pentests (annually for Series A and above) catch architectural issues your team is blind to. Budget $15-30k for a focused external test from a firm like Cure53, NCC Group, or Doyensec.

Nimisha Mishra

Senior Technical Support Engineer

Senior technical support engineer at withRemote. Writes on incident response, runbook craft, and customer-empathy in engineering.
