
Designing a RESTful API endpoint in 2026 means picking a noun, picking an HTTP verb, and picking a status code. The endpoints that age well also handle idempotency, cursor pagination, structured errors, and the action-shaped requests that pure REST refuses to model. This post is the playbook we hand to engineers shipping API v1, with code in Hono, Express, and FastAPI.
If you want the strategic version of this advice (contract-first, OpenAPI as source of truth, codegen pipelines), read our companion post on API design best practices. This one is tactical. We are designing endpoints.
Three shifts make 2026 endpoint design different from the textbook version you read in 2018.
AI agents call your API more than humans do. Claude, GPT, and the Vercel AI SDK hit your endpoints through tool calls. They retry on timeouts, guess parameters, and expect flat JSON. An API built for a React form gets hammered when an agent calls it 40 times in a loop.
OpenAPI 3.1 became the de facto contract. It aligns with JSON Schema draft 2020-12, so your spec is also your validation rule, your codegen source, and your AI tool definition. FastAPI emits 3.1 by default; Hono ships @hono/zod-openapi. No excuse for a hand-written Postman collection in 2026.
RFC 9457 (problem details for HTTP APIs), which obsoletes RFC 7807, replaced the wild west of error JSON with one standard shape: type, title, status, detail, and instance. Once your API adopts it, your client error parser is boilerplate.
Strict REST also lost ground. The most-used APIs in 2026 (Stripe, GitHub, Linear, Resend) are RESTful in shape but ship action endpoints like POST /v1/payment_intents/{id}/cancel instead of pretending a cancellation is a PATCH.
The five-endpoint CRUD pattern is still the right default for most resources:
| Verb | Path | Purpose | Status on success |
|---|---|---|---|
| GET | /orders | List | 200 |
| POST | /orders | Create | 201 |
| GET | /orders/{id} | Read | 200 (or 304) |
| PATCH | /orders/{id} | Update | 200 |
| DELETE | /orders/{id} | Delete | 204 |
This shape works for entities with a clear lifecycle. It collapses the moment you hit a workflow.
Consider a refund. Is it PATCH /orders/{id} with body {status: "refunded"}? That hides side effects (charging a payment processor, emailing the customer, writing an audit log) inside a property update. Reviewers miss it; agents trigger it by accident; the 200 response makes a destructive operation look idempotent.
The 2026 fix is to admit some operations are verbs and give them their own endpoint. Stripe ships roughly 30 of these alongside its CRUD endpoints, and Google's AIP-136 codified the pattern.
Before you write a single route, you make three modeling decisions.
Model the API around the consumer's mental model, not the database schema. Your Postgres table might be payment_attempts, but if the consumer thinks of these as "charges", expose /charges. The closer the URL maps to the noun in the docs, the fewer support tickets you ship.
Start by listing the verbs the consumer wants to perform, then group them under the nouns those verbs operate on. Charge a card and Refund a charge both point at /charges. Subscribe a customer points at /subscriptions. Two passes through that exercise will shake out 80% of your routes.
Stripe shows the strongest example here. Every ID is prefixed with the resource type: cus_xxx for customers, pi_xxx for payment intents, sub_xxx for subscriptions, in_xxx for invoices. Linear, Resend, and Vercel all copied it.
Two benefits. A wrong-type ID fails at the edge with a clear 400, not deep in a query. And when you grep logs at 2am, cus_3f9a tells you what it is without a join.

    // Hono, with zod
    import { z } from 'zod'

    const customerId = z.string().regex(/^cus_[a-zA-Z0-9]{14}$/)

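The same edge check can be sketched without a validation library. The helper below maps a prefix to a resource type and rejects wrong-type IDs before any query runs. The `PREFIXES` table, the `parseTypedId` name, and the loose suffix rule are illustrative assumptions, not any vendor's actual ID format.

```typescript
// Hypothetical prefix table, using the prefixes named in the text.
const PREFIXES = {
  cus: "customer",
  pi: "payment_intent",
  sub: "subscription",
  in: "invoice",
} as const;

type Prefix = keyof typeof PREFIXES;
type ResourceType = (typeof PREFIXES)[Prefix];

type ParsedId =
  | { ok: true; type: ResourceType; id: string }
  | { ok: false; reason: string };

// Assumed ID shape: <lowercase prefix>_<alphanumeric suffix>.
function parseTypedId(raw: string, expected: ResourceType): ParsedId {
  const match = /^([a-z]+)_([a-zA-Z0-9]+)$/.exec(raw);
  if (!match) return { ok: false, reason: "malformed id" };

  const prefix = match[1] ?? "";
  // Own-property check so inherited names like "constructor" are not accepted.
  if (!Object.prototype.hasOwnProperty.call(PREFIXES, prefix)) {
    return { ok: false, reason: `unknown prefix "${prefix}"` };
  }

  const type = PREFIXES[prefix as Prefix];
  if (type !== expected) {
    return { ok: false, reason: `expected ${expected}, got ${type}` };
  }
  return { ok: true, type, id: raw };
}
```

A handler would call `parseTypedId(req.params.id, "customer")` first and turn any `ok: false` result into a 400 with the `reason` in the error body.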
Two levels of nesting is the ceiling: /orders/{id}/refunds is fine; /customers/{id}/orders/{id}/refunds/{id} is a refactor waiting to happen. Past two levels, refunds get their own top-level resource and you reference the parent via query: GET /refunds?order=ord_3f9a.
This is the same call as picking SQL or NoSQL: the data shape determines the URL shape. If you are still deciding the underlying store, our take on SQL versus NoSQL in 2026 is the right pre-read.
Memorize this status-code matrix. It covers 95% of cases.
| Verb | Success | Validation fail | Auth fail | Not found | Conflict |
|---|---|---|---|---|---|
| GET | 200, 304 | 400 | 401, 403 | 404 | n/a |
| POST | 201 | 422 | 401, 403 | 404 (parent) | 409 |
| PATCH | 200 | 422 | 401, 403 | 404 | 409 |
| PUT | 200, 204 | 422 | 401, 403 | 404 | 409 |
| DELETE | 204 | n/a | 401, 403 | 404 | 409 |
Notes that bite people:
The colon-action pattern from Google's AIP-136 reads cleanest:
POST /orders/{id}:cancel
POST /invoices/{id}:send
POST /users/{id}:undelete
Some teams use /orders/{id}/actions/cancel instead. Both are fine. Pick one and document it. The point is that cancel is a verb with side effects, and the URL says so.
You ship action endpoints when:
cancel, archive, publish, approve).
You skip them when the operation is a clean property update with no side effects. PATCH /users/{id} with {name: "new"} does not need a :rename action.
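As a rough sketch of how the colon-action shape could be routed, the function below splits a path such as `/orders/ord_3f9a:cancel` into resource, ID, and action, and only accepts actions on an allow-list. The `ALLOWED_ACTIONS` map and the `parseActionPath` name are assumptions for illustration; a real router would normally handle this through its own pattern syntax.

```typescript
interface ActionRoute {
  resource: string;
  id: string;
  action: string;
}

// Hypothetical allow-list: which actions each resource exposes.
const ALLOWED_ACTIONS = new Map<string, readonly string[]>([
  ["orders", ["cancel"]],
  ["invoices", ["send"]],
  ["users", ["undelete"]],
]);

// Returns the parsed route, or null when the request is not a known action.
function parseActionPath(method: string, path: string): ActionRoute | null {
  // Actions carry side effects, so this sketch accepts them on POST only.
  if (method !== "POST") return null;

  const match = /^\/([a-z_]+)\/([^/:]+):([a-z_]+)$/.exec(path);
  if (!match) return null;

  const [, resource = "", id = "", action = ""] = match;
  const allowed = ALLOWED_ACTIONS.get(resource);
  if (!allowed || allowed.indexOf(action) === -1) return null;

  return { resource, id, action };
}
```

Keeping the allow-list explicit means an unknown verb such as `:refund` on an order returns a 404 instead of silently falling through to a generic handler.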
The default in 2026 is cursor pagination with opaque base64 tokens. Offset pagination still works for admin tables under 10,000 rows, but cursor scales to anything.

    GET /orders?limit=25&starting_after=ord_3f9a

    {
      "data": [ { "id": "ord_4b2e", ... }, ... ],
      "has_more": true,
      "next_cursor": "b3JkXzRiMmU="
    }

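A minimal sketch of the mechanics behind that response, assuming the cursor is simply the base64 of the last returned ID and using an in-memory array in place of a database. The `paginate`, `encodeCursor`, and `decodeCursor` names are hypothetical. In a real service the array scan would more likely be a keyset query on the sort column.

```typescript
interface Page<T> {
  data: T[];
  has_more: boolean;
  next_cursor: string | null;
}

// Opaque to clients: they pass the token back unchanged and never parse it.
const encodeCursor = (id: string): string => btoa(id);
const decodeCursor = (cursor: string): string => atob(cursor);

// `rows` must already be in the order the API exposes (for example by creation time).
function paginate<T extends { id: string }>(rows: T[], limit: number, cursor?: string): Page<T> {
  let start = 0;
  if (cursor !== undefined) {
    const lastId = decodeCursor(cursor);
    const index = rows.findIndex((row) => row.id === lastId);
    if (index === -1) throw new Error("invalid cursor"); // a handler would map this to a 400
    start = index + 1;
  }

  // Read one row past the limit: that tells us has_more without a count query.
  const slice = rows.slice(start, start + limit + 1);
  const data = slice.slice(0, limit);
  const has_more = slice.length > limit;
  const last = data[data.length - 1];

  return {
    data,
    has_more,
    next_cursor: has_more && last ? encodeCursor(last.id) : null,
  };
}
```

Reading `limit + 1` rows is the design choice that keeps `has_more` cheap, since no separate total is ever computed.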
A few defaults to bake in:
has_more is cheaper than total. Computing total on a large table requires a count query. Skip it unless the consumer specifically needs it.
For filtering, a flat querystring with documented operators handles most APIs:
GET /orders?status=paid&created[gte]=2026-01-01&customer=cus_3f9a
For sorting, comma-separated with a - prefix for descending: ?sort=-created,name. Don't invent a DSL on day one. If consumers ask for SQL-shaped queries, that's a sign you should expose GraphQL on a different surface, not bolt a query language onto REST.
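To show how little parsing this convention needs, here is a sketch that turns the querystring into filter and sort descriptors. The operator set (`eq`, `gt`, `gte`, `lt`, `lte`), the `parseListQuery` name, and the list of paging parameters it skips are assumptions. Validating field names against the resource schema is left out.

```typescript
const OPERATORS = ["eq", "gt", "gte", "lt", "lte"] as const;
type Operator = (typeof OPERATORS)[number];

interface Filter { field: string; op: Operator; value: string }
interface SortKey { field: string; direction: "asc" | "desc" }
interface ListQuery { filters: Filter[]; sort: SortKey[] }

// Parameters that belong to pagination, not filtering (assumed names).
const PAGING_PARAMS = new Set(["limit", "starting_after"]);

function isOperator(value: string): value is Operator {
  return (OPERATORS as readonly string[]).indexOf(value) !== -1;
}

function parseListQuery(query: string): ListQuery {
  const filters: Filter[] = [];
  const sort: SortKey[] = [];

  new URLSearchParams(query).forEach((value, key) => {
    if (key === "sort") {
      // "-created,name" means created descending, then name ascending.
      for (const part of value.split(",")) {
        if (part === "" || part === "-") continue;
        const descending = part.startsWith("-");
        sort.push({
          field: descending ? part.slice(1) : part,
          direction: descending ? "desc" : "asc",
        });
      }
      return;
    }
    if (PAGING_PARAMS.has(key)) return;

    // "status" is an equality filter; "created[gte]" carries an explicit operator.
    const match = /^([a-z_]+)(?:\[([a-z]+)\])?$/.exec(key);
    if (!match) throw new Error(`unsupported filter key: ${key}`);
    const op = match[2] ?? "eq";
    if (!isOperator(op)) throw new Error(`unsupported operator: ${op}`);
    filters.push({ field: match[1] ?? "", op, value });
  });

  return { filters, sort };
}
```

Rejecting unknown operators outright, instead of ignoring them, keeps a typo like `created[gtee]` from silently returning an unfiltered list.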

    HTTP/1.1 422 Unprocessable Entity
    Content-Type: application/problem+json

    {
      "type": "https://api.example.com/errors/validation",
      "title": "Validation failed",
      "status": 422,
      "detail": "Two fields failed validation.",
      "instance": "/orders",
      "errors": [
        { "field": "amount", "code": "min", "message": "Must be at least 100." },
        { "field": "currency", "code": "enum", "message": "Must be one of USD, EUR, GBP." }
      ]
    }

The errors[] array is the part developers actually fix bugs from. Field-level codes mean the client can render inline form errors without parsing English.
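One way to keep that body consistent across handlers is a small builder on the server and a lookup on the client. This sketch assumes the field names from the example above, with `errors` treated as an RFC 9457 extension member. `validationProblem` and `codesByField` are hypothetical helper names.

```typescript
interface FieldError {
  field: string;
  code: string;
  message: string;
}

interface ProblemDetails {
  type: string;
  title: string;
  status: number;
  detail?: string;
  instance?: string;
  errors?: FieldError[]; // extension member, not one of the five standard fields
}

// Server side: build the 422 body from a list of field errors.
function validationProblem(instance: string, errors: FieldError[]): ProblemDetails {
  const count = errors.length;
  return {
    type: "https://api.example.com/errors/validation",
    title: "Validation failed",
    status: 422,
    detail: `${count} field${count === 1 ? "" : "s"} failed validation.`,
    instance,
    errors,
  };
}

// Client side: group machine-readable codes by field for inline form errors.
function codesByField(problem: ProblemDetails): Map<string, string[]> {
  const byField = new Map<string, string[]>();
  for (const error of problem.errors ?? []) {
    const codes = byField.get(error.field) ?? [];
    codes.push(error.code);
    byField.set(error.field, codes);
  }
  return byField;
}
```

The client only reads `field` and `code`, so the English `message` can change without breaking form rendering.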
This is the single biggest change between a 2020 and 2026 endpoint. Any AI agent, any mobile client on a flaky network, any retry-on-timeout pattern in a queue worker can cause your POST /charges to fire twice.
The contract:
Idempotency-Key: <uuid> on POST.
(key, request_hash, response_body, status) on first call.
Skip idempotency on GET (already idempotent) and DELETE (already idempotent in spec). Bake it into POST and you sleep better.
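The stored tuple can be sketched with an in-memory map. This is a simplified illustration under several assumptions: the request fingerprint is sorted-key JSON in place of a real hash, a reused key with a different body gets a 422 (other APIs choose 400 or 409), and `handlePost` and its `create` callback are hypothetical names. A production store would also need an atomic check-and-set and a TTL.

```typescript
type Json = Record<string, unknown>;

interface Stored {
  requestHash: string;
  status: number;
  body: unknown;
}

interface Result {
  status: number;
  body: unknown;
}

// Stand-in fingerprint: JSON with top-level keys sorted, so key order does not matter.
// Nested objects are not normalized; a real service would hash a canonical form.
function requestHash(body: Json): string {
  return JSON.stringify(Object.keys(body).sort().map((key) => [key, body[key]]));
}

function handlePost(
  store: Map<string, Stored>,
  key: string | undefined,
  body: Json,
  create: (body: Json) => unknown,
): Result {
  // No key supplied: behave like a plain, non-idempotent POST.
  if (key === undefined) return { status: 201, body: create(body) };

  const hash = requestHash(body);
  const stored = store.get(key);
  if (stored) {
    if (stored.requestHash !== hash) {
      // Same key, different payload: refuse instead of replaying the wrong response.
      return { status: 422, body: { title: "Idempotency-Key was reused with a different request body" } };
    }
    return { status: stored.status, body: stored.body }; // replay, no side effects
  }

  const created = create(body);
  store.set(key, { requestHash: hash, status: 201, body: created });
  return { status: 201, body: created };
}
```

Comparing the stored hash before replaying is what separates a genuine retry from a client bug that reuses one key for two different requests.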
The three options are URL (/v1/orders), header (API-Version: 2026-05-01), and query parameter (?version=1). All three work. URL versioning is boring, every CDN caches it correctly, every log line is grep-able, and every client supports it without custom config.
Header versioning (the GitHub style, calendar-versioned) sounds elegant and creates support tickets. Save it for when you have a dedicated API team to babysit it.
To make this concrete, here is POST /orders (create order, with idempotency, problem+json errors, typed IDs) in three runtimes.

    import { Hono } from 'hono'
    import { z } from 'zod'

    const app = new Hono()

    const CreateOrder = z.object({
      customer: z.string().regex(/^cus_/),
      amount: z.number().int().min(100),
      currency: z.enum(['USD', 'EUR', 'GBP']),
    })

    app.post('/v1/orders', async (c) => {
      // Replay the stored response if this key was already processed.
      const idempotencyKey = c.req.header('Idempotency-Key')
      if (idempotencyKey) {
        const cached = await c.env.KV.get(`idem:${idempotencyKey}`, 'json')
        if (cached) return c.json(cached.body, cached.status)
      }

      // c.req.json() rejects on malformed JSON; map that to a 400 in your error handler.
      const parsed = CreateOrder.safeParse(await c.req.json())
      if (!parsed.success) {
        // The headers argument switches the media type to problem+json.
        return c.json({
          type: 'https://api.example.com/errors/validation',
          title: 'Validation failed',
          status: 422,
          errors: parsed.error.issues.map((i) => ({
            field: i.path.join('.'),
            code: i.code,
            message: i.message,
          })),
        }, 422, { 'Content-Type': 'application/problem+json' })
      }

      // createOrder is your own persistence helper (not shown).
      const order = await createOrder(c.env.DB, parsed.data)
      const response = { body: order, status: 201 }
      if (idempotencyKey) {
        // Keep the response for 24 hours so retries get the same answer.
        await c.env.KV.put(`idem:${idempotencyKey}`, JSON.stringify(response), {
          expirationTtl: 86400,
        })
      }
      return c.json(response.body, response.status)
    })

Hono ships ~1ms cold start on Workers and the @hono/zod-openapi companion gives you OpenAPI 3.1 for free.

    import express from 'express'
    import Redis from 'ioredis'
    import { z } from 'zod'

    const app = express()
    app.use(express.json())

    // Assumes ioredis and a default local Redis; swap in your own client.
    const redis = new Redis()

    const CreateOrder = z.object({
      customer: z.string().regex(/^cus_/),
      amount: z.number().int().min(100),
      currency: z.enum(['USD', 'EUR', 'GBP']),
    })

    app.post('/v1/orders', async (req, res) => {
      // Replay the stored response if this key was already processed.
      const key = req.header('Idempotency-Key')
      if (key) {
        const cached = await redis.get(`idem:${key}`)
        if (cached) {
          const { status, body } = JSON.parse(cached)
          return res.status(status).json(body)
        }
      }

      const parsed = CreateOrder.safeParse(req.body)
      if (!parsed.success) {
        return res.status(422).type('application/problem+json').json({
          type: 'https://api.example.com/errors/validation',
          title: 'Validation failed',
          status: 422,
          errors: parsed.error.issues.map((i) => ({
            field: i.path.join('.'),
            code: i.code,
            message: i.message,
          })),
        })
      }

      // createOrder is your own persistence helper (not shown).
      const order = await createOrder(parsed.data)
      if (key) {
        // Keep the response for 24 hours so retries get the same answer.
        await redis.setex(`idem:${key}`, 86400, JSON.stringify({ status: 201, body: order }))
      }
      res.status(201).json(order)
    })


    from fastapi import FastAPI, Header, Response
    from pydantic import BaseModel, Field
    from typing import Literal, Optional
    import json
    import redis.asyncio as redis  # async client, so cache calls do not block the event loop

    app = FastAPI()
    r = redis.Redis()


    class CreateOrder(BaseModel):
        customer: str = Field(pattern=r"^cus_")
        amount: int = Field(ge=100)
        currency: Literal["USD", "EUR", "GBP"]


    @app.post("/v1/orders", status_code=201)
    async def create_order(
        body: CreateOrder,
        response: Response,
        idempotency_key: Optional[str] = Header(default=None, alias="Idempotency-Key"),
    ):
        # Replay the stored response if this key was already processed.
        if idempotency_key:
            cached = await r.get(f"idem:{idempotency_key}")
            if cached:
                data = json.loads(cached)
                response.status_code = data["status"]
                return data["body"]

        # persist_order is your own coroutine (not shown); it must return a JSON-serializable dict.
        order = await persist_order(body)
        if idempotency_key:
            # Keep the response for 24 hours so retries get the same answer.
            await r.setex(f"idem:{idempotency_key}", 86400, json.dumps({"status": 201, "body": order}))
        return order

FastAPI ships OpenAPI 3.1 out of the box in 0.110+. The Pydantic model doubles as your validation, your docs, and your AI agent tool schema.
The contract is identical across all three. Pick the runtime that matches your team. Cadence engineers ship in all three; we have 12,800 vetted engineers in the pool and a 27-hour median time to first commit, so the framework choice is the founder's, not a constraint of who you can book.
A short list of patterns that look correct in code review and break in production.
{error: true} in the body. Status codes exist for a reason. Monitoring tools, retry libraries, and proxy caches all key off them. A 200 with an error body breaks every one of them.
order_3f9a to ord_3f9a six months later breaks every integration silently.
GET /orders/{id} returns the row anyway with deleted_at set, every downstream cache thinks the record exists. Soft-deleted rows return 404 to public consumers and 200 only to admin endpoints.
id=12345 tells consumers your IDs are sequential integers, lets them probe for gaps, and breaks when you shard. Base64 an opaque token.
REST is for public, multi-client, long-lived APIs. If your endpoint is none of those things, you have better options.
order { customer { subscriptions { invoices } } } in one round trip and you'd be inventing GraphQL badly to support it via REST.
REST wins when the consumer set is open-ended and the contract has to outlive any one client. That's when the discipline pays off.
If you are designing an API right now, the order of operations is:
cus_, ord_, pi_). Validate them at the edge.
:verb endpoints.
If your team doesn't have the bandwidth to ship this in the next sprint and your API contract is blocking a launch, audit your stack honestly with our Ship-or-Skip tool before you commit to a 6-week build. Sometimes the right answer is to use Hono's starter template and ship in a week.
If you do need help, every engineer on Cadence is AI-native by default (Cursor, Claude, Copilot fluency vetted in a voice interview before they unlock bookings), and a senior at $1,500/week is the right tier for an API v1 rollout. You get a 48-hour free trial to see the work before you pay.
Default to PATCH with a partial JSON body. PUT requires sending the entire resource and most clients in 2026 send diffs. Reserve PUT for full replace operations like uploading a config file or replacing an entire user profile.
Use a /v1/ URL prefix. It's the boring choice that every client supports, every CDN caches correctly, and every engineer reads correctly in a log line. Header versioning sounds clean and creates support tickets.
422 Unprocessable Entity when the request is well-formed but semantically invalid; 400 Bad Request when the JSON is malformed. Pair it with a problem+json body listing field-level errors so clients can render inline form messages without parsing English.
If the collection can grow past 10,000 rows or the data is sortable by time (most feeds, logs, events, orders), yes. Switching from offset to cursor later is a breaking change unless you wrap both behind the same response shape from the start. The cost on day one is low; the cost on day 400 is a v2.
When the only consumer is your own frontend, reach for tRPC or Server Actions. When the consumer needs flexible joins across resources, GraphQL. When the use case is real-time, WebSockets or SSE. REST is for public, multi-client, long-lived APIs where the contract has to survive any one client.