
The day-1 stack for a startup is Sentry for errors, pino for structured logs shipped to Better Stack or Axiom, and a single Slack channel for alerts. Total cost: $0 to $50 a month until you're past 5,000 errors per month or 30 GB of log ingest. Skip PagerDuty until after product-market fit. The rule that matters more than the tools: one alert per failure mode, severity-tagged, no exceptions.
That's the whole answer. The rest of this post is how to ship it without creating noise that your team will mute in three weeks.
Three things have changed since 2023 that make observability a far more important early investment than it used to be.
First, AI-assisted shipping speed went up. A team of two using Cursor and Claude Code ships features that would have taken eight engineers in 2022. The bug count per shipped feature didn't go down by the same factor. More shipping, same human review bandwidth, means production is where you find issues now. Observability moved from "nice to have post-Series-A" to "you need it on day 1 or you ship blind."
Second, log ingest pricing got sane. Datadog still charges enterprise rates, but Better Stack, Axiom, Highlight, and Baselime exist now. Free tiers cover the first 6 months of a real startup.
Third, AI Overviews and ChatGPT-driven traffic mean a 500 error on a product page now costs you both the conversion and the citation. You need to know within 90 seconds, not the next morning when a user emails support.
Most founders set up console.log and check Vercel's function logs when something goes wrong. This works until exactly the point where it doesn't, usually around 100 daily active users.
The failure mode looks like this. A user reports the checkout button is broken. You check Vercel logs. You see 400 log lines per minute, none of them structured, none of them correlated to the user's session. You spend 40 minutes grepping. The bug was a third-party webhook timeout that fired once at 2:14am, and the user retried at 9:30am with cached state. You will never find this with console.log.
The fix is structured logs plus an exception tracker. Together they take about 90 minutes to set up and they pay for themselves the first time you debug a production issue without scrolling.
Here's the minimum viable observability setup we'd ship into a fresh Next.js or Node service today.
| Layer | Tool | Cost at startup scale | Why this one |
|---|---|---|---|
| Error tracking | Sentry | Free up to 5,000 errors/mo | Best-in-category source maps, session replay, release tracking |
| Structured logs | pino (Node) or structlog (Python) | Free, open source | Fastest JSON logger, low overhead |
| Log aggregation | Better Stack or Axiom | $0 to $50/mo | Generous free tier, fast search, S3-backed cheap retention |
| Alert routing | Slack incoming webhook | Free | Where the team already lives |
| Paging | None on day 1 | $0 | Use PagerDuty only after PMF |
| Uptime checks | Better Stack Uptime or BetterUptime | Free for 10 monitors | Pings + status page in one tool |
Total day-1 cost: $0 to $50 per month. Total setup time: 90 to 120 minutes if you've done it before.
We'll cover each layer in order.
Install @sentry/nextjs or @sentry/node, paste in your DSN, deploy. Out of the box you get unhandled exceptions, source maps if you upload them in CI, release tagging if you set SENTRY_RELEASE, and a free tier of 5,000 errors per month and 10,000 performance units.
The configuration decisions that matter:
- tracesSampleRate: 0.1 in production. Sampling at 10% catches the patterns without burning your performance quota.
- beforeSend PII redaction. Strip emails, tokens, and credit card patterns from event payloads before they leave your server. Sentry has built-in data scrubbing (for example, EventScrubber in the Python SDK), but write your own regex pass too. It's 20 lines.
- release, environment, and userId (hashed). Without these, you can't tell which deploy introduced the regression or which customer hit it.
- Sentry.captureException(err, { tags: { feature: 'checkout' } }) at every try/catch boundary that owns a business operation. Don't rely on uncaught propagation alone.

What can go wrong: leaving tracesSampleRate: 1.0 from a tutorial and burning your free tier in 11 days. We've seen it three times this year.
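The "20 lines" regex pass might look roughly like the sketch below. The function name `scrub` and the specific patterns are assumptions made for illustration; they are not part of Sentry's SDK and are not an exhaustive PII list.

```typescript
// Hypothetical helper: a regex pass intended to run inside a beforeSend hook.
// The patterns are illustrative assumptions, not a complete PII catalogue.
const PII_PATTERNS: Array<[RegExp, string]> = [
  [/[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/g, '[email]'],
  // Exactly 16 digits, optionally separated by spaces or hyphens.
  [/\b(?:\d[ -]?){15}\d\b/g, '[card]'],
  [/\b(?:sk|pk)_(?:live|test)_[A-Za-z0-9]+/g, '[token]'],
  [/Bearer\s+[A-Za-z0-9._~+\/-]+=*/g, 'Bearer [token]'],
];

// Recursively walk a JSON-like value and scrub every string it contains.
export function scrub<T>(value: T): T {
  if (typeof value === 'string') {
    let out: string = value;
    for (const [pattern, replacement] of PII_PATTERNS) {
      out = out.replace(pattern, replacement);
    }
    return out as unknown as T;
  }
  if (Array.isArray(value)) {
    return value.map((item) => scrub(item)) as unknown as T;
  }
  if (value !== null && typeof value === 'object') {
    const source = value as Record<string, unknown>;
    const result: Record<string, unknown> = {};
    for (const key of Object.keys(source)) {
      result[key] = scrub(source[key]);
    }
    return result as unknown as T;
  }
  // Numbers, booleans, null, and undefined pass through unchanged.
  return value;
}
```

In a Sentry config this would presumably be wired in as something like `beforeSend: (event) => scrub(event)`; treat that wiring as an assumption to check against the SDK version you install.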
console.log outputs unstructured text. pino outputs newline-delimited JSON at roughly 5x the throughput. Switch the moment you have more than one service or more than one log stream worth searching.
The pattern that works:
```typescript
import pino from 'pino';

export const logger = pino({
  level: process.env.LOG_LEVEL || 'info',
  // Redaction happens inside the logger, so secrets never reach stdout.
  redact: {
    paths: ['req.headers.authorization', 'req.headers.cookie', '*.password', '*.token', '*.creditCard'],
    censor: '[REDACTED]',
  },
  // Fields attached to every log line.
  base: { service: 'api', env: process.env.NODE_ENV, release: process.env.SENTRY_RELEASE },
});

// hashedId is assumed to be a one-way hash of the user ID, computed elsewhere.
logger.info({ userId: hashedId, action: 'checkout.start', cartValue: 4900 }, 'checkout started');
```
Three things to notice. The redact config strips PII before serialization, not after. Every log line carries service, env, and release automatically. Each event has a stable action field you can query on (action:"checkout.start"), not a free-text message.
The free-text message is for humans skimming. The structured fields are for machines querying. Always include both.
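To make the "machines querying" point concrete, here is a minimal sketch of filtering newline-delimited JSON by the stable action field. The `LogLine` type and both function names are hypothetical; a log aggregator would run the equivalent query for you.

```typescript
// Shape of one log line. Field names mirror the pino example; the type itself is an assumption.
interface LogLine {
  level: number;
  action?: string;
  userId?: string;
  msg?: string;
  [key: string]: unknown;
}

// Parse newline-delimited JSON, skipping blank, malformed, or non-object lines.
export function parseNdjson(raw: string): LogLine[] {
  const lines: LogLine[] = [];
  for (const text of raw.split('\n')) {
    if (text.trim() === '') continue;
    try {
      const parsed: unknown = JSON.parse(text);
      if (typeof parsed === 'object' && parsed !== null && !Array.isArray(parsed)) {
        lines.push(parsed as LogLine);
      }
    } catch {
      // Unstructured output (for example a stray console.log) is ignored here.
    }
  }
  return lines;
}

// Exact-match filter on the action field, the equivalent of action:"checkout.start".
export function byAction(lines: LogLine[], action: string): LogLine[] {
  return lines.filter((line) => line.action === action);
}
```

The same filter against free-text messages would need a fuzzy string match that breaks whenever someone rewords the message.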
pino writes to stdout. Your hosting platform (Vercel, Fly, Render, Railway) captures stdout. From there you ship to a log aggregator.
We recommend Better Stack or Axiom for a sub-$50/month bill. Both have generous free tiers (Axiom gives 500 GB/month free as of writing, Better Stack gives 1 GB and 3 days of retention free with reasonable paid tiers above that). Both support fast SQL-like queries. Both have Slack integration.
Datadog is technically more powerful, but you'll pay $1,500 to $4,000 a month before you have enough traffic to justify it. The same is true for Splunk. Save the migration for Series A.
Setup is 10 minutes: install the platform's log drain integration (Vercel has a one-click for both Better Stack and Axiom), set LOG_LEVEL=info in production, deploy. Logs appear within a minute.
One Slack channel called #alerts. Two webhook routes from Sentry: one for level:error and above, one for level:fatal. Tag every Sentry alert with a severity emoji at the start (🟡 warning, 🟠 error, 🔴 fatal). That's it.
The single rule that prevents alert fatigue: one alert per failure mode, not per occurrence. Sentry deduplicates by stack trace fingerprint by default; do not disable this. If a Stripe webhook timeout fires 400 times in 10 minutes, that's one alert with a "400 events" counter, not 400 messages.
For your first 6 months, this is enough. You'll know within 90 seconds when production breaks, and the channel will stay quiet enough that people still look at it.
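If you ever route alerts through your own code instead of Sentry's Slack integration, the severity tag and event counter could be built along these lines. The `Alert` shape and `formatSlackAlert` name are hypothetical; the only external assumption is that a Slack incoming webhook accepts a JSON body with a `text` field.

```typescript
type Severity = 'warning' | 'error' | 'fatal';

// Emoji mapping taken from the severity tags described in the text.
const SEVERITY_EMOJI: Record<Severity, string> = {
  warning: '🟡',
  error: '🟠',
  fatal: '🔴',
};

interface Alert {
  severity: Severity;
  title: string;      // hypothetical: human-readable name of the failure mode
  eventCount: number; // occurrences grouped under this failure mode
}

// Build the JSON body for a Slack incoming webhook: one message per failure mode.
export function formatSlackAlert(alert: Alert): { text: string } {
  const emoji = SEVERITY_EMOJI[alert.severity];
  const noun = alert.eventCount === 1 ? 'event' : 'events';
  return {
    text: `${emoji} [${alert.severity}] ${alert.title} (${alert.eventCount} ${noun})`,
  };
}
```

Keeping the emoji first means the channel can be scanned by color before anyone reads a word.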
Most teams over-engineer this. Four levels, mapped to actions:
The trap is the temptation to add critical, urgent, severe, or service-specific levels. Don't. Engineers can't keep five severity scales in their head. The levels above map cleanly to "do I act on this now, today, or never."
You won't need this for the first year. After that, sampling stops being optional.
At 10 million log lines a month you'll pay for ingest in real money. Three sampling strategies that work:
- Drop info in production, keep warn and above. You lose forensic context for normal flows; you can re-enable for a single service when investigating.

The pattern is: sample noisy, keep rare, never sample errors. An error that gets sampled out is the one that turns into a customer-discovered bug at 4am.
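"Sample noisy, keep rare, never sample errors" can be sketched as a single decision function. The keep rates below are made-up placeholders, and `shouldKeep` is a hypothetical name; tune the numbers against your own ingest bill.

```typescript
type Level = 'debug' | 'info' | 'warn' | 'error' | 'fatal';

// Hypothetical per-level keep rates: 1 keeps everything, 0 drops everything.
const KEEP_RATE: Record<Level, number> = {
  debug: 0,
  info: 0.1,
  warn: 1,
  error: 1,
  fatal: 1,
};

// Decide whether to ship a log line. `random` is injectable so the decision is testable.
export function shouldKeep(level: Level, random: () => number = Math.random): boolean {
  // Errors and above bypass sampling entirely, whatever the table says.
  if (level === 'error' || level === 'fatal') return true;
  return random() < KEEP_RATE[level];
}
```

The early return for error and fatal is deliberate: a later edit to the rate table cannot accidentally start sampling errors.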
If you ship a single log line with a customer email, password, or full credit card number to a third-party log service, you've created a compliance problem. SOC 2 auditors will flag it. GDPR makes it a reportable incident if EU users are involved.
The fixes:
- Logger-level redaction (the redact block in pino) strips known PII fields before serialization. Don't trust downstream redaction.
- Audit a day of logs quarterly with a regex pass for @.*\.com, 16-digit numbers, and common token prefixes (sk_, pk_, Bearer ). Anything that shows up is a redaction miss.

This connects directly to broader compliance work, including the basics in our guide on how to design a SaaS for HIPAA from day 1 if you're handling health data, and the patterns in implementing OWASP Top 10 mitigations for the rest of the surface area.
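A regex audit over a sample of log lines could be as small as the sketch below. It uses the three pattern families named in the text; `auditLogSample` and the `AuditHit` shape are hypothetical, and a real audit would likely need more patterns (non-.com email domains, for one).

```typescript
// Audit patterns named in the text. Non-global regexes, so test() carries no state.
const AUDIT_PATTERNS: Array<{ name: string; regex: RegExp }> = [
  { name: 'email', regex: /@.*\.com/ },
  { name: 'sixteenDigits', regex: /\b\d{16}\b/ },
  { name: 'tokenPrefix', regex: /\b(?:sk_|pk_)|Bearer / },
];

interface AuditHit {
  lineNumber: number; // 1-based position in the sample
  pattern: string;    // which audit pattern matched
}

// Scan already-shipped log lines and report every line that looks like a redaction miss.
export function auditLogSample(lines: string[]): AuditHit[] {
  const hits: AuditHit[] = [];
  lines.forEach((line, index) => {
    for (const { name, regex } of AUDIT_PATTERNS) {
      if (regex.test(line)) {
        hits.push({ lineNumber: index + 1, pattern: name });
      }
    }
  });
  return hits;
}
```

An empty result is the goal; every hit points at a field that should be added to the logger's redact paths.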
This is the single rule that separates teams who trust their alerts from teams who mute them.
A failure mode is a category of breakage, not an instance of it. "Stripe webhook handler throws" is a failure mode. "Stripe webhook handler threw at 14:32:18 for user X" is an instance. You want one alert per failure mode, with a counter of instances attached.
Three concrete rules that enforce this:
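- Group alerts by stack-trace fingerprint, so every instance of a failure mode lands on the same alert.
- Suppress duplicates within a 5-minute window; repeats update the existing alert with a count such as (3x) appended.
- Auto-resolve fingerprints that haven't fired in 24 hours.

When alert fatigue sets in (and you'll know because someone says "I muted #alerts last week"), the cause is almost always a violation of one of these three rules.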
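Sentry handles grouping for you, but if alerts pass through your own router, fingerprint grouping with a 5-minute duplicate window and a 24-hour auto-resolve might be sketched like this. The `AlertDeduper` class and its method names are hypothetical, and state lives in memory only, which is an assumption that would not survive a restart.

```typescript
const SUPPRESS_WINDOW_MS = 5 * 60 * 1000;    // duplicate-suppression window
const AUTO_RESOLVE_MS = 24 * 60 * 60 * 1000; // quiet period before auto-resolve

interface FailureMode {
  count: number;          // instances seen since the alert opened
  lastNotifiedAt: number; // last time a message was actually sent
  lastSeenAt: number;     // last time any instance occurred
}

export class AlertDeduper {
  private open = new Map<string, FailureMode>();

  // Record one occurrence. Returns true when a message should be sent.
  record(fingerprint: string, now: number): boolean {
    const existing = this.open.get(fingerprint);
    if (existing === undefined) {
      this.open.set(fingerprint, { count: 1, lastNotifiedAt: now, lastSeenAt: now });
      return true; // first instance of this failure mode
    }
    existing.count += 1;
    existing.lastSeenAt = now;
    if (now - existing.lastNotifiedAt < SUPPRESS_WINDOW_MS) return false;
    existing.lastNotifiedAt = now;
    return true;
  }

  // Number of instances grouped under a fingerprint, for the "(3x)" style counter.
  countFor(fingerprint: string): number {
    return this.open.get(fingerprint)?.count ?? 0;
  }

  // Close fingerprints that have been quiet for 24 hours; returns what was resolved.
  autoResolve(now: number): string[] {
    const resolved: string[] = [];
    this.open.forEach((mode, fingerprint) => {
      if (now - mode.lastSeenAt >= AUTO_RESOLVE_MS) {
        this.open.delete(fingerprint);
        resolved.push(fingerprint);
      }
    });
    return resolved;
  }
}
```

Timestamps are passed in as plain millisecond values so the windowing logic can be exercised without waiting on a real clock.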
You do not need PagerDuty before product-market fit. The single biggest waste of money we see in pre-PMF startups is a $25-per-user-per-month PagerDuty bill protecting a service nobody is paying for yet.
The honest threshold is: page on-call only when (a) you have paying customers whose contract implies uptime, (b) you have at least two engineers who can actually respond, and (c) you've already shipped the "one alert per failure mode" discipline above. Without (c), you'll page someone every 40 minutes for the first week and they'll quit.
When you do hit that threshold, the stack is PagerDuty or Incident.io, integrated to Sentry's fatal-level webhook. Two on-call engineers, weekly rotation, one runbook per failure mode. Pair this with the practices in our post on writing a postmortem after an incident so you actually learn from every page.
Honestly: a lot, for the first 90 days.
If you are two founders pre-revenue, you do not need OpenTelemetry, you do not need a service mesh, you do not need distributed tracing across microservices (you do not have microservices). Ship Sentry, pino, Better Stack, and a Slack webhook. That covers 95% of the incidents you'll see in year one.
The things you can safely defer until you have actual scale or actual customers:
If you're choosing the underlying data layer at the same time, our guides on designing a multi-tenant Postgres schema and using Prisma in 2026 cover the patterns that pair best with the logging stack above.
For a 1-to-3 engineer team, observability setup is a half-day of work for a Mid or Senior engineer. The Mid tier ($1,000/week on Cadence) ships the day-1 stack: Sentry, pino, Better Stack, Slack webhooks, severity tags, the redaction layer. The Senior tier ($1,500/week) is the right call when you're adding tail-based sampling, OpenTelemetry, or migrating from a noisy legacy logger to structured logs without losing data.
If you're spending more than a day per week chasing production issues because you can't see what's happening, that's the signal to bring in help. Every engineer on Cadence is AI-native by default, vetted on Cursor, Claude Code, and Copilot fluency before they unlock bookings, so they ship observability setup at AI-assisted speed: the median time-to-first-commit on the platform is 27 hours from booking. You can audit your current stack if you're not sure what you actually need versus what a tutorial told you to install.
- tracesSampleRate: 1.0 left from a tutorial. Free tier gone in under two weeks.
- No release tag on Sentry events. You can't tell which deploy caused the regression.
- debug level in production. 50x the volume, 50x the cost.

Each is a 10-minute fix once you spot it. Run a quarterly audit of log volume by route, alert volume by fingerprint, and Sentry quota burn rate.
If you have nothing in place today:
- Install Sentry with tracesSampleRate: 0.1, PII redaction on.
- Switch console.log to pino with a redact block. 30 minutes.
- Create #alerts in Slack. Route Sentry error and fatal webhooks to it. 10 minutes.

Total: two hours. You will catch the next outage in 90 seconds instead of 9 hours.
If you'd rather have someone else ship this end-to-end (with the redaction, sampling, and Slack routing dialed in for your stack), book a Mid or Senior engineer on Cadence for a one-week sprint. Weekly billing, 48-hour free trial, replace any week if it isn't working.
Zero to $50 per month for the first 6 months on Sentry's free tier (5,000 errors/mo) plus Better Stack or Axiom's free log tier. You'll start paying real money around 10,000 active users or 50 GB of monthly log ingest, typically $50 to $200 per month at that scale.
No. Datadog is excellent at scale but starts at roughly $15 per host per month for infrastructure monitoring and stacks up fast with APM, logs, and synthetics. Pre-Series-A startups burn $1,500 to $4,000 per month before getting value. Sentry plus Better Stack covers 95% of the same surface for under $50.
When you have paying customers whose contracts imply uptime, at least two engineers who can respond, and you've already enforced one-alert-per-failure-mode discipline so on-call isn't waking up every 40 minutes. Usually post-PMF, around Series A.
Three rules: group alerts by stack-trace fingerprint, suppress duplicates within a 5-minute window, auto-resolve fingerprints that haven't fired in 24 hours. Together these cut alert volume by 80 to 95% without losing signal.
Strip Authorization headers, cookies, password, token, creditCard, and email fields at the logger level (not downstream). Log hashed user IDs and Stripe customer IDs instead of raw PII. Audit quarterly with regex over a day of logs to catch redaction misses. Pair this with the safety practices in our guide on rolling out feature flags safely so new code paths don't leak through.
Senior automation engineer at withRemote. Writes on CI/CD, test pyramids, and removing toil from engineering pipelines.