Best practices for API design in 2026

API design in 2026 follows one rule: the OpenAPI spec is the contract, and that contract has to serve humans, codegen tools, and AI agents at the same time. Skip the spec-first workflow and you ship endpoints that drift from your docs, fail under agent retry loops, and rot into a maintenance bog within a year.

This is the playbook we'd give a senior engineer joining a team that's about to ship a public API. It's opinionated, it cites real tools, and it ends with the part most "best practices" posts skip: when you can ignore most of it because your team is two founders and a Postgres.

Why API design changed between 2023 and 2026

Two shifts make 2026 different from 2023.

First, AI agents now consume APIs autonomously. Agents built on Claude, Cursor, and ChatGPT fetch your OpenAPI spec, generate tool calls, and execute them inside a loop. If your spec is incomplete, the agent guesses. Loose limits meet automation badly: in late 2025, security researchers showed they could enumerate roughly 3.5 billion phone numbers through WhatsApp's contact-discovery endpoint because rate limits were too loose for an automated caller. Agents amplify every weak spot.

Second, OpenAPI 3.1 fully aligned with JSON Schema Draft 2020-12. That sounds boring. It isn't. It means the spec you write for documentation is the same schema your runtime validator uses, which is the same schema your client SDK generator uses. One file, three jobs. There's no longer a credible reason to maintain types in three places.

The practical effect: the OpenAPI document is the single source of truth, and contract-first design (write the spec before the handlers) becomes the default move. Everything below assumes that posture.

The default approach most teams still use

Most teams still go code-first. They scaffold a Fastify or FastAPI service, write decorators on each handler, and let the framework emit an OpenAPI document at /openapi.json. This works fine for an internal CRUD API with one frontend.

It breaks at three boundaries.

It breaks when an AI agent calls the API. The auto-generated spec usually omits the things humans infer (which fields are nullable, which combinations of query params are valid, what 422 actually means). Agents don't infer. They retry, hallucinate request bodies, or fail silently.

It breaks when codegen drifts. Your frontend team regenerates types weekly; your handler decorators change daily. Within a sprint, the types and the runtime disagree, and bugs show up as "undefined is not a function" in production.

It breaks when authorization gets ad-hoc. Coinbase and T-Mobile both shipped real authorization bugs in 2024-2025 that traced back to endpoints where the auth check lived in middleware, not in the handler, and a refactor moved one without the other. Spec-first design wouldn't have stopped the refactor, but it would have made the contract failure visible in code review.

The contract-first playbook in 6 steps

Here's the workflow we'd recommend for any new API in 2026. Six steps, each with the tool we'd actually reach for.

1. Write the OpenAPI 3.1 spec first

Author the spec in YAML, in the same repo as the code, reviewed in pull requests like any other source file. Use Stoplight Studio or Stoplight Elements for visual editing if YAML scares your team. The spec is not generated from handlers. The handlers are generated from the spec.

Why it matters: the spec is the prompt. Your AI assistant (Cursor, Claude Code, Copilot) reads it to write client code, server stubs, and tests. A clean spec produces clean generations. A vague spec produces vague code.

What can go wrong: teams treat the spec as a one-time scaffold and let it rot. Block this in CI by failing the build if the runtime response shape diverges from the spec (Schemathesis catches this).
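To make the CI check concrete, here is a minimal sketch of what "the response shape diverges from the spec" means. It is not how Schemathesis works internally; it is a hand-rolled illustration using only the standard library. The INVOICE_SCHEMA fragment, its field names, and the find_drift helper are all hypothetical.

```python
# Hypothetical fragment of an OpenAPI 3.1 response schema, written as a Python dict.
INVOICE_SCHEMA = {
    "type": "object",
    "required": ["id", "amount", "currency"],
    "properties": {
        "id": {"type": "string"},
        "amount": {"type": "integer"},
        "currency": {"type": "string"},
        "memo": {"type": ["string", "null"]},  # nullable, stated explicitly
    },
}

# Mapping from JSON Schema type names to Python types (subset, for illustration).
JSON_TYPES = {
    "string": str,
    "integer": int,
    "boolean": bool,
    "object": dict,
    "array": list,
    "null": type(None),
}


def _matches(value, type_name):
    """Check one value against one JSON Schema type name."""
    if type_name == "integer":
        # bool is a subclass of int in Python, so exclude it explicitly.
        return isinstance(value, int) and not isinstance(value, bool)
    return isinstance(value, JSON_TYPES[type_name])


def find_drift(schema, payload):
    """Return a list of mismatches between a response payload and the schema."""
    problems = []
    for name in schema.get("required", []):
        if name not in payload:
            problems.append(f"missing required field: {name}")
    for name, value in payload.items():
        prop = schema["properties"].get(name)
        if prop is None:
            problems.append(f"undocumented field: {name}")
            continue
        allowed = prop["type"] if isinstance(prop["type"], list) else [prop["type"]]
        if not any(_matches(value, t) for t in allowed):
            problems.append(f"wrong type for {name}: expected {allowed}")
    return problems
```

A CI job built on this idea would call the running service, feed each response through a check like find_drift, and fail the build on a non-empty result. In practice you would use a real JSON Schema validator rather than this subset.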

2. Generate types, server stubs, and SDKs from the spec

Run openapi-typescript for TS types, oapi-codegen for Go server stubs, Speakeasy or Fern for client SDKs in 5 to 9 languages. Fern in particular is purpose-built for the spec-first flow and ships idiomatic SDKs that don't read like generated code.

Why it matters: every consumer (frontend, mobile, third-party integrator, AI agent) is reading from the same generated artifact. As long as regeneration runs in CI, drift between consumers has to be introduced on purpose rather than by accident.

What can go wrong: hand-editing generated files. Don't. If the SDK needs a method the spec doesn't describe, fix the spec.

3. Wire RBAC at the object level inside every endpoint

Authentication (who you are) is not authorization (what you can touch). Check ownership inside the handler, not just in middleware. Pattern: load the resource, compare resource.ownerId to req.user.id, return 404 (not 403) if they don't match. 404 leaks less.

Why it matters: this is the bug that breached Coinbase. Middleware confirmed the user was authenticated; the handler trusted that as authorization.

What can go wrong: developers add a "just for admins" path that skips the check. Make the ownership check a shared helper that every handler must call, and lint for handlers that don't.
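The shared-helper idea can be sketched like this. Everything here is illustrative: the Invoice type, the in-memory INVOICES store, the NotFound exception, and the load_owned helper are hypothetical names, and we assume your framework maps NotFound to an HTTP 404.

```python
from dataclasses import dataclass


@dataclass
class Invoice:
    id: str
    owner_id: str


class NotFound(Exception):
    """Assumed to be translated into an HTTP 404 by the framework's error handler."""


# Stand-in for a database table.
INVOICES = {"inv_1": Invoice(id="inv_1", owner_id="user_a")}


def load_owned(store, resource_id, user_id):
    """Load a resource and confirm the caller owns it.

    Raises NotFound both when the resource is missing and when it belongs
    to someone else, so the response never confirms that the ID exists.
    """
    resource = store.get(resource_id)
    if resource is None or resource.owner_id != user_id:
        raise NotFound(resource_id)
    return resource


def get_invoice(user_id, invoice_id):
    """Handler body: the ownership check is the first thing it does."""
    invoice = load_owned(INVOICES, invoice_id, user_id)
    return {"id": invoice.id}
```

Because missing and not-yours take the same code path, a lint rule only has to confirm that each handler calls load_owned (or whatever you name it) before touching the resource.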

4. Use cursor pagination by default

Offset pagination is fine for datasets under 10,000 rows. Above that, cursor pagination (an opaque token pointing to the last seen record) keeps query time constant regardless of page depth.

Why it matters: offset queries get slower as users page deeper because the database still has to read and discard every skipped row. A cursor lets an indexed query seek straight to the start of the next page.

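What can go wrong: leaking sortable IDs in cursors, or accepting cursors a client has edited. Encode the cursor as base64 of an opaque payload and sign it with HMAC so tampering is detectable; encrypt the payload too if the IDs themselves must stay hidden. GitHub's GraphQL API is one example that hands out opaque cursor strings.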
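A signed cursor can be sketched with the standard library alone. The payload layout (a JSON object with a last_id field), the SECRET value, and the function names are assumptions for illustration. Note that signing only detects tampering: anyone can still base64-decode the token and read the payload, so encrypt it as well if the ID itself is sensitive.

```python
import base64
import hashlib
import hmac
import json

SECRET = b"replace-with-a-real-secret"  # hypothetical; load from configuration in practice
SIG_LEN = hashlib.sha256().digest_size  # 32 bytes


def encode_cursor(last_id, secret=SECRET):
    """Pack the last seen ID into a URL-safe token prefixed with an HMAC signature."""
    payload = json.dumps({"last_id": last_id}, separators=(",", ":")).encode()
    sig = hmac.new(secret, payload, hashlib.sha256).digest()
    return base64.urlsafe_b64encode(sig + payload).decode()


def decode_cursor(token, secret=SECRET):
    """Verify the signature and return the last seen ID.

    Raises ValueError if the token is malformed or was modified.
    """
    raw = base64.urlsafe_b64decode(token.encode())  # bad padding raises a ValueError subclass
    sig, payload = raw[:SIG_LEN], raw[SIG_LEN:]
    expected = hmac.new(secret, payload, hashlib.sha256).digest()
    if not hmac.compare_digest(sig, expected):
        raise ValueError("invalid cursor")
    return json.loads(payload)["last_id"]
```

The handler would call decode_cursor on the incoming token, query for rows after that ID, and return encode_cursor of the last row as the next-page token.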

5. Adopt date-based versioning headers

Don't put /v2/ in your URL. Use a header like API-Version: 2026-04-15 (Stripe's pattern). Every breaking change becomes a dated event with a documented diff.

Why it matters: URL versioning forces you to maintain parallel endpoint trees and tempts you to break v1 because everyone "should have migrated by now." Date headers let old clients keep their pinned version forever and let new clients adopt features incrementally.

What can go wrong: forgetting to set a default. If a client sends no version header, pin them to whatever was current when they first authenticated, recorded against their API key.
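The defaulting rule can be sketched as a small resolver. The version list, the PINNED_BY_KEY store, and the resolve_version name are hypothetical, and we assume headers arrive as a plain dict with the exact key API-Version (real frameworks normalize header case for you).

```python
from datetime import date

# Hypothetical release dates, oldest first.
SUPPORTED_VERSIONS = ["2025-09-01", "2026-01-20", "2026-04-15"]

# Hypothetical store: the version recorded when each API key first authenticated.
PINNED_BY_KEY = {"key_old_client": "2025-09-01"}


def resolve_version(headers, api_key):
    """Pick the API version for a request.

    An explicit header wins. Otherwise fall back to the version pinned to the
    API key, pinning new keys to the latest version on first use.
    """
    requested = headers.get("API-Version")
    if requested is not None:
        date.fromisoformat(requested)  # raises ValueError on a malformed date
        if requested not in SUPPORTED_VERSIONS:
            raise ValueError(f"unsupported API version: {requested}")
        return requested
    return PINNED_BY_KEY.setdefault(api_key, SUPPORTED_VERSIONS[-1])
```

The important property is that a client who never sends the header keeps getting the same behavior after you release a newer version.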

6. Return RFC 9457 problem+json on every error

The application/problem+json format is the 2026 standard for API errors. Shape: type, title, status, detail, instance, plus any custom fields you want.

Why it matters: AI agents and retry libraries can parse this format mechanically. They can decide whether to retry, surface the error to the user, or ask for new credentials, without a human in the loop.

What can go wrong: returning 200 OK with an error body. This is the most common API design mistake we still see in 2026, and it breaks every off-the-shelf retry library. Use real status codes.
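Here is a minimal sketch of a helper that builds such a response. The problem_response name, the example URI, and the (status, headers, body) return shape are assumptions; adapt them to whatever your framework expects. The five member names and the about:blank default for type come from RFC 9457.

```python
import json


def problem_response(status, title, detail, type_uri="about:blank", instance=None, **extensions):
    """Build an RFC 9457 problem details response.

    Returns (status, headers, body). Extension members are added first so
    they cannot overwrite the standard members.
    """
    body = dict(extensions)
    body.update({"type": type_uri, "title": title, "status": status, "detail": detail})
    if instance is not None:
        body["instance"] = instance
    headers = {"Content-Type": "application/problem+json"}
    return status, headers, json.dumps(body)


# Example: a validation failure with one custom field.
status, headers, body = problem_response(
    422,
    "Validation failed",
    "amount must be a positive integer",
    type_uri="https://api.example.com/problems/validation",  # hypothetical URI
    instance="/charges/abc",
    field="amount",
)
```

The HTTP status line and the status member carry the same code, which is what lets a retry library act on the response without parsing the body.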

Designing for AI agents (the new consumer)

If your API will ever be called by an AI agent (and most public APIs will, by 2027), four extras matter.

Serve /openapi.json at a stable URL with no auth required for the schema itself. Serve an llms.txt at root that gives a 1-page natural-language summary of what the API does, plus links to the spec. The llms.txt convention reduces token consumption by roughly 90% versus making an agent crawl your docs site.

Make every state-changing endpoint accept an Idempotency-Key header. Agents retry. If POST /charges doesn't dedupe on idempotency key, you'll see double-charges within the first week of agent traffic. Stripe pioneered this; copy them.

Derive your MCP tool definitions from the OpenAPI spec. Don't hand-author them. Official MCP SDKs exist for Python, TypeScript, and Go, among others, and tooling in that ecosystem can read an OpenAPI 3.1 document and emit MCP tools; check what the library for your language supports before you commit to it.

Issue least-privilege scoped tokens for agents. A token with read:invoices should not be able to call POST /payments. Most APIs in 2026 still ship a single "API key" that does everything; this is a footgun in the agent era.
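A scope check can be as small as a lookup table. The route table, the write:payments scope name, and the is_allowed function below are hypothetical; the one design choice worth copying is that routes with no declared scope are denied rather than allowed.

```python
# Hypothetical mapping from (method, path) to the scope a token must carry.
ROUTE_SCOPES = {
    ("GET", "/invoices"): "read:invoices",
    ("POST", "/payments"): "write:payments",
}


def is_allowed(token_scopes, method, path):
    """Return True only if the token carries the scope the route requires.

    Routes missing from the table are denied, so a new endpoint is closed
    until someone assigns it a scope.
    """
    required = ROUTE_SCOPES.get((method, path))
    if required is None:
        return False
    return required in token_scopes
```

With this in place, an agent token scoped to read:invoices can list invoices but gets rejected on POST /payments.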

For founders thinking about how this connects to broader stack decisions, our take on SQL versus NoSQL choices in 2026 covers the database side of contract-first design.

A quick comparison: which API style fits which job

Approach	When it fits	Cost of switching later
Code-first (handlers then spec)	Internal-only, single client, throwaway tooling	High: regenerate every consumer
Contract-first OpenAPI	Public API, multi-client, AI agents	Low: spec is already authoritative
GraphQL schema-first	Many client shapes, mobile bandwidth-bound	Medium: federation rewrites
gRPC + protobuf	Internal microservices, sub-10ms latency	High: client retooling

REST with contract-first OpenAPI is the right default for roughly 80% of new APIs in 2026. Reach for GraphQL when your clients ask for radically different field shapes (the original Facebook mobile case). Reach for gRPC inside your own infrastructure when latency budgets are tight. Use WebSocket for true bidirectional streams (chat, collaborative editing).

Common pitfalls that look correct in code review

Five patterns that look fine on a PR diff but break in production.

Returning 200 OK with an error body. The handler caught the exception and returned { error: "not found" } with status 200. Now every retry library treats it as success and never retries the transient failure that actually warranted one.

Auto-incrementing IDs in URLs. GET /users/47 lets anyone estimate your user count and probe /users/48 for IDOR bugs. Use UUIDs (or ULIDs if you want sortable IDs) in URLs. Keep numeric IDs for internal joins.

Mixing snake_case and camelCase across endpoints. One endpoint returns created_at, the next returns createdAt. Pick one (camelCase if your dominant client is JS; snake_case if Python or Ruby) and lint for it in CI.
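A casing lint can run against sample responses in CI. This sketch assumes you picked snake_case; the regular expression and the non_snake_case_keys helper are illustrative, not taken from any particular linter.

```python
import re

# Lowercase words separated by single underscores, e.g. created_at or id.
SNAKE_CASE = re.compile(r"^[a-z][a-z0-9]*(?:_[a-z0-9]+)*$")


def non_snake_case_keys(value, path=""):
    """Recursively collect dotted paths of JSON object keys that are not snake_case."""
    offenders = []
    if isinstance(value, dict):
        for key, child in value.items():
            child_path = f"{path}.{key}" if path else key
            if not SNAKE_CASE.match(key):
                offenders.append(child_path)
            offenders.extend(non_snake_case_keys(child, child_path))
    elif isinstance(value, list):
        for item in value:
            offenders.extend(non_snake_case_keys(item, path))
    return offenders
```

Failing the build when this returns a non-empty list catches the stray createdAt before an integrator does.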

Skipping rate limits on read endpoints. This is the WhatsApp lesson. A read endpoint that exposes "does this phone number have an account" needs aggressive rate limits even though no data is being mutated. Anything an attacker can iterate against is a candidate for harvesting.
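One common way to limit an iterable read endpoint is a token bucket per caller. The sketch below is a single-process illustration with hypothetical names; a production limiter would keep its state in a shared store and key it by API key or IP. The clock parameter exists only so the behavior can be tested without sleeping.

```python
import time


class TokenBucket:
    """Allow bursts up to `capacity`, refilling at `refill_per_second`."""

    def __init__(self, capacity, refill_per_second, clock=time.monotonic):
        self.capacity = capacity
        self.refill_per_second = refill_per_second
        self.clock = clock
        self.tokens = float(capacity)  # start full
        self.updated = clock()

    def allow(self):
        """Spend one token if available; return whether the request may proceed."""
        now = self.clock()
        elapsed = now - self.updated
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_per_second)
        self.updated = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False
```

When allow() returns False, the handler would respond with 429 and a problem+json body rather than doing the lookup.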

URL-based versioning when you needed per-resource versioning. /v2/ versions everything together. If only /payments had a breaking change, you've forced every other resource to bump too. Date headers handle this gracefully; URL prefixes don't.

When you can skip most of this

Best practices have ROI curves. Respect them.

If you're two founders, pre-revenue, with one internal frontend talking to one Postgres, you do not need MCP tools, RFC 9457 problem+json, or date-based versioning. You need consistent naming, real HTTP status codes (200/201/400/401/404/422/500 used correctly), and an auth check that actually verifies ownership inside the handler. That's it.

The honest minimum for a pre-launch API:

Pick one casing convention and stick to it
Use the right status codes
Check ownership in every handler that touches user data
Document the endpoints in a single Markdown file or a Postman collection

Defer everything else until your first external integrator (or your first AI agent) hits the API. The day someone other than you writes a client against it is the day contract-first becomes worth the cost.

Who actually owns API design on a small team

On a 3 to 5 engineer team, one senior owns the contract. Everyone else implements against it. The senior writes the OpenAPI YAML, runs codegen, and reviews PRs that touch the spec the same way they'd review a database migration: carefully, with a sense that this is a permanent record.

On a 1 to 2 engineer team, the contract lives inside the founder's spec doc. An AI-native engineer drafts the OpenAPI spec during week one of the engagement, generates the SDKs, and hands the founder a clickable Mintlify docs site by Friday.

If you don't have that engineer in-house, Cadence ships them on demand. Every engineer on the platform is AI-native by baseline (Cursor and Claude Code are part of the standard kit, vetted in a voice interview before they unlock bookings), and the matching algorithm runs across our pool of 12,800 engineers in under 80ms (we wrote about how the matching scores work if you're curious). For contract-first API design, the senior tier at $1,500/week is the right slot. Junior tier at $500/week is who you book to implement against an existing spec, not to author one.

If you want a second opinion on whether your current API design is worth the rewrite, our ship-or-skip audit tool gives an honest grade in about ten minutes. It's free and it doesn't pitch you anything at the end.

What to do this week

If you're starting a new API: write the OpenAPI 3.1 spec for your first three endpoints before you write the handlers. Generate types with openapi-typescript or Fern. Serve /openapi.json and a one-page llms.txt from day one. Add idempotency-key support to every POST.

If you're retrofitting an existing API: run Schemathesis against your current handlers to get a baseline of where the runtime diverges from the spec. Fix the spec first, then the handlers. Don't try to do both at once.

If you don't know which to do: book a 48-hour trial with a senior engineer on Cadence. Two days is enough to get a real assessment and a written plan, with no commitment if the fit isn't right.

Try it: Run your stack through ship-or-skip and get a no-pitch grade on whether your API design needs work this quarter or can wait six months.

FAQ

Is REST still the right default for a new API in 2026?

Yes for CRUD-shaped resources with public consumers. Reach for GraphQL when client field needs vary wildly, gRPC for internal microservices needing sub-10ms latency, and WebSocket for true bidirectional streams (chat, collaborative editing, live dashboards).

Should every API endpoint be MCP-compatible?

No. MCP-compatibility matters for endpoints you want AI agents to call autonomously. Internal admin endpoints don't need it. Public read endpoints and agent-friendly actions (create, search, summarize) do. Generate the MCP tool list from a tagged subset of your OpenAPI spec.

How do I version an API without breaking existing clients?

Use date-based version headers like Stripe's 2026-04-15 pattern. Never break a version that clients are already pinned to. Ship a deprecation timeline of at least 12 months for any field removal, and email the integrators on each pinned version when the deprecation clock starts.

What's the cheapest way to add API documentation that stays current?

Generate it from the OpenAPI 3.1 spec with Mintlify, ReadMe, or Fern. The spec is the source of truth; the docs site is a view. If you maintain docs separately from the spec, they will drift within a sprint, no matter how disciplined the team is.

How long does a contract-first API rollout take?

For a fresh API, two engineering weeks for spec, codegen, and the first three endpoints in production. For a retrofit on an existing 50-endpoint API, plan 4 to 8 weeks depending on how far the handlers have drifted. A senior engineer working full-time on it (the kind that staffs the senior tier on Cadence at $1,500/week) is the right shape; a junior won't have the architectural judgment to make the spec decisions stick.
