How to do code reviews when your team is fully remote

Do code reviews fully remote by making them written-first, not call-first: keep PRs under 400 lines so async review fits inside a 24-hour SLA, run an AI-first pass with Greptile, CodeRabbit, or Claude Code before a human ever opens the diff, and cap human reviewers at three per PR. Tricky logic gets a 3-minute Loom from the author, and reviewers ship "approve with non-blocking nits" by default so PRs stop stalling on style debates.

That formula sounds simple. Most remote teams do not run it. They run a slower version of the in-office code review, where one senior bottlenecks 12 engineers and PRs sit open for four days because nobody wants to be the one to say "looks fine, ship it." This post is the operational version: tools, rules, scripts, and the four-line policy doc you can ship to your team this week.

Why most remote code review is broken

In-office code review was always partly social. You wandered over, opened the diff on a teammate's monitor, asked questions out loud, and the review took 15 minutes total. None of that survives the move to remote. What replaces it, by default, is one of three failure modes.

The senior bottleneck. Every PR routes to the same one or two people because nobody else is trusted. They review during whatever window they have, usually morning. PRs filed after 11am wait until tomorrow. Median time-to-merge stretches to 36 hours and the bottleneck reviewer gets burned out.

The Zoom review. The team tries to preserve the social part. PRs get scheduled into 30-minute calls. This works for the first month, then collapses because the team is in five timezones and nobody can find a slot. PRs sit waiting for a meeting that never happens.

The rubber stamp. The team gives up. PRs get approved within minutes with a thumbs-up emoji. Bugs ship. Confidence in the review process drops to zero, so people stop tagging reviewers at all and start merging their own PRs after one approval from a junior.

A 2025 GitClear study of 50,000 PRs across distributed teams found median review latency at 28 hours and median PR size at 612 lines. Both are well above the targets this post argues for: a 400-line cap on PR size and a merge inside 24 hours, ideally under 8. The fix is structural, not motivational.

The written-first principle

The single biggest change is: every review starts written, in the PR description. No DMs, no "hop on a call," no "let's discuss in standup." If a reviewer has a question, they write it as a PR comment. If the author has a complex change to explain, they write it in the PR description before requesting review.

Written-first has three properties that call-first does not:

Async-completable. A reviewer in Bali at 9pm can read the description, leave comments, and resolve their part of the loop without waking the author up in Berlin.
Indexable. Six months later, when someone asks "why did we go with this caching strategy," the answer is in the PR thread, not lost in a Zoom transcript nobody saved.
AI-readable. Greptile, CodeRabbit, and Claude Code can all parse PR descriptions and comment threads. Slack DMs and meeting notes are invisible to them.

Written-first does not mean banning calls. It means calls happen only after written comments have surfaced the actual disagreement, and the call's job is to resolve that one specific thing in 10 minutes. We cover the specifics of how the AI PR review workflow layers cleanly on top of written-first in a separate post; the short version is that AI handles syntax, style, and obvious bugs, and humans only get involved when judgment is required.

The small-PR rule: under 400 LOC, always

The single best predictor of fast review is PR size. Google's internal data, published in their engineering practices documentation, shows that PRs under 200 lines get reviewed in under 2 hours on average. PRs between 200 and 1,000 lines take roughly 24 hours. Anything over 1,000 lines stretches to multiple days because the reviewer schedules a "block of time" that never comes.

For remote teams, we set the cap at 400 lines of changed code (excluding generated files, lockfiles, and tests). The math: 400 lines is roughly 20 minutes of reviewer attention. Twenty minutes fits into any reviewer's day without scheduling. Anything bigger triggers an automatic "please split this" comment from a bot, before a human ever looks at it.
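As a rough sketch of how that bot check could work, the snippet below counts changed lines from `git diff --numstat` output and produces the "please split this" comment when the cap is exceeded. The exclusion patterns, function names, and comment wording are illustrative assumptions, not a specific bot's configuration.

```python
from fnmatch import fnmatch
from typing import Optional

MAX_CHANGED_LINES = 400

# Hypothetical exclusion patterns for lockfiles, generated files, and tests;
# adjust these to match your own repository layout.
EXCLUDED_PATTERNS = [
    "*.lock", "package-lock.json", "*/package-lock.json",
    "*.generated.*", "*_pb2.py",
    "tests/*", "*/tests/*", "test_*.py", "*/test_*.py",
]


def counted_lines(numstat: str) -> int:
    """Sum added + deleted lines from `git diff --numstat`, skipping excluded paths."""
    total = 0
    for row in numstat.strip().splitlines():
        added, deleted, path = row.split("\t", 2)
        if added == "-":  # binary files are reported as "-<TAB>-<TAB>path"
            continue
        if any(fnmatch(path, pattern) for pattern in EXCLUDED_PATTERNS):
            continue
        total += int(added) + int(deleted)
    return total


def split_request(numstat: str) -> Optional[str]:
    """Return the comment a bot would post, or None when the PR is within the cap."""
    size = counted_lines(numstat)
    if size <= MAX_CHANGED_LINES:
        return None
    return (
        f"Please split this PR: {size} changed lines exceeds "
        f"the {MAX_CHANGED_LINES}-line cap."
    )
```

In CI, the input would come from something like `git diff --numstat main...HEAD`, and a non-None result would be posted as a PR comment by whatever bot integration you already run.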

A 400-line cap forces architectural discipline. Engineers learn to ship behind feature flags, land scaffolding before logic, and merge refactors separately from features. Three 350-line PRs that each ship in 24 hours are dramatically better than one 1,050-line PR that takes a week. The total reviewer time is the same, but the calendar time drops to one-seventh when the three are reviewed in parallel, and to under half (three days against seven) even when they land back to back.
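The arithmetic behind that comparison can be checked in a few lines. This is a back-of-the-envelope sketch that assumes review time scales linearly with PR size at the rate implied by "400 lines is roughly 20 minutes"; the variable names are illustrative.

```python
# Reading rate implied by "400 lines is roughly 20 minutes" (assumed linear).
LINES_PER_MINUTE = 400 / 20


def reviewer_minutes(pr_sizes):
    """Total reviewer attention, in minutes, for a list of PR sizes in lines."""
    return sum(size / LINES_PER_MINUTE for size in pr_sizes)


small = [350, 350, 350]
large = [1050]

# Reviewer effort is identical either way: 52.5 minutes in total.
same_effort = reviewer_minutes(small) == reviewer_minutes(large)

# Calendar time in days: 24 hours per small PR, one week for the large one.
parallel_days = 1        # all three small PRs are in review at the same time
sequential_days = 3 * 1  # each small PR waits for the previous one to merge
large_days = 7

parallel_ratio = parallel_days / large_days      # 1/7 of the calendar time
sequential_ratio = sequential_days / large_days  # 3/7 of the calendar time
```

The one-seventh figure is the parallel case. When the PRs depend on each other and must land in order, the saving is smaller but still better than half.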

Cadence engineers are trained on this rhythm before they unlock bookings: every engineer on Cadence is AI-native by default, vetted on Cursor and Claude Code fluency, and a chunk of that vetting is whether they can take a vague spec and decompose it into 3 to 5 small PRs instead of one heroic dump.

AI-first review pass: what the bots catch before humans see it

Every PR on a well-run remote team goes through an AI review pass automatically, the moment the PR is opened. The author does not request it. CI does. This is the single biggest unlock of the last 18 months for distributed review.

Three tools dominate the category. Each has a slightly different niche.

Greptile indexes your whole repo as a graph and reviews PRs in the context of the codebase. It catches cross-file regressions (you changed a shared util's behavior, here are the 14 call sites that now break) that single-file tools miss. Pricing is $30 per developer per month.

CodeRabbit focuses on PR-level review with line-by-line comments and a summary. It is the most popular choice on GitHub for open-source projects because the free tier is generous. Paid is $24 per developer per month.

Claude Code as a reviewer is a newer pattern. Engineers run the claude CLI locally against the diff before pushing, or wire it into CI with a GitHub Action. Claude's strength is the same one it has in authoring: it explains reasoning, not just defects. Best for teams that already use Claude Code for authoring.

The AI pass catches the boring 60% of review feedback: missing null checks, off-by-one errors, inconsistent naming, missing tests for new branches, unused imports, accidental console.logs. Humans never have to leave that feedback again. The human review can focus entirely on judgment calls: is this the right abstraction, is this the right place for this logic, will this scale.

The two-pizza review group: cap reviewers at three

A 2024 study from MIT's CSAIL of PR review on 200 open-source projects found that PRs with 4 or more reviewers had longer time-to-merge than PRs with 2 or 3 reviewers, despite having more eyes on them. Past 3 reviewers, diminishing returns flip negative. Reviewers diffuse responsibility, debate style choices, and the author has to reconcile contradictory feedback.

We call this the two-pizza review group, after Amazon's two-pizza team rule. Cap human reviewers at three per PR. One of those three is the designated code owner for the touched directory (set via CODEOWNERS in GitHub). The other one or two are picked by the author and should ideally be domain experts for the change.

The CODEOWNERS file is doing a lot of work here. It eliminates the question "who do I tag." It also means review load is distributed by directory ownership rather than by social capital. A well-designed CODEOWNERS file is the single most underrated artifact for remote engineering teams; we wrote about how it interacts with our broader playbook for reviewing code effectively, including how to seed it on day one for a new repo.
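To make the assignment rule concrete, here is a minimal sketch of "code owner first, then author picks, capped at three." GitHub applies CODEOWNERS itself, so the parser below only illustrates the logic. Its pattern matching is a deliberate simplification of GitHub's gitignore-style rules, and the handles and paths in the usage example are hypothetical.

```python
from fnmatch import fnmatch

MAX_REVIEWERS = 3


def parse_codeowners(text):
    """Return (pattern, owners) pairs, ignoring blank lines and comments."""
    rules = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        pattern, *owners = line.split()
        rules.append((pattern, owners))
    return rules


def code_owner(path, rules):
    """First owner of the last matching rule (later rules take precedence)."""
    owner = None
    for pattern, owners in rules:
        # Simplified matching (an assumption, not GitHub's full semantics):
        # strip a leading "/" and treat a trailing "/" as "everything below".
        glob = pattern.lstrip("/")
        if glob.endswith("/"):
            glob += "*"
        if fnmatch(path, glob):
            # A matching rule with no owners clears ownership.
            owner = owners[0] if owners else None
    return owner


def pick_reviewers(paths, author_picks, rules, author):
    """Code owners for the touched paths first, then author picks, capped at three."""
    reviewers = []
    candidates = [code_owner(p, rules) for p in paths] + list(author_picks)
    for candidate in candidates:
        if candidate and candidate != author and candidate not in reviewers:
            reviewers.append(candidate)
    return reviewers[:MAX_REVIEWERS]
```

A usage example with made-up handles:

```python
rules = parse_codeowners("*  @platform\n/services/billing/  @ana\n")
pick_reviewers(["services/billing/invoice.py"], ["@sam", "@joe"], rules, author="@kim")
```

One design consequence worth noting: a PR that touches several owned directories can fill all three slots with code owners, which is another nudge toward smaller, single-directory PRs.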

Loom for tricky logic: when written-first hits its limit

Written-first does not mean text-only. Some changes are easier to explain by walking through them with your cursor. For those, the author records a 2 to 5 minute Loom and links it at the top of the PR description.

A good Loom for code review looks like this:

30 seconds: what problem you are solving
60 to 90 seconds: walk through the architectural choice (why this approach, not the obvious alternative)
60 to 90 seconds: trace the happy path through the diff
30 seconds: call out the parts you are least confident about

Reviewers watch the Loom at 1.5x, leave written comments, and the whole loop closes in under an hour even across timezones. The Loom replaces the "let's hop on a call" that used to bottleneck reviews on tricky logic.

The discipline is: Loom is the exception, not the rule. If every PR has a Loom, your PRs are too big and your descriptions are too thin. Looms cap out at 1 in 5 PRs on a healthy team.

"Approve with non-blocking nits": the pattern that stops PRs from stalling

The most common remote-review failure is not "bug shipped." It is "PR sat for 3 days because the reviewer wanted to discuss a variable name." The fix is a cultural pattern: approve with non-blocking nits.

The rule: if the PR is correct and shippable, you approve it. You can still leave comments about variable names, missing JSDoc, or a refactor opportunity. Those comments are prefixed with "nit:" and explicitly marked non-blocking. The author decides whether to address them in this PR, a follow-up PR, or never.

This single pattern moves median review latency from 36 hours to under 8 hours on most teams that adopt it. It also makes reviewers braver: they leave more comments, not fewer, because they know the comments will not block shipping.

The companion rule: blocking comments must be specific and actionable. "This feels off" is not a blocking comment. "This will deadlock if two writes hit at once; here's the fix" is. If a reviewer wants to block a PR, they have to do the work of articulating exactly what is wrong and what would make it right.
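The two rules together can be expressed as a small decision function. This sketch rests on two assumptions the post does not state outright: any comment without the "nit:" prefix counts as blocking, and "specific and actionable" is approximated by requiring a non-empty proposed fix. The `Comment` type and verdict names are hypothetical.

```python
from dataclasses import dataclass

NIT_PREFIX = "nit:"


@dataclass
class Comment:
    body: str
    suggested_fix: str = ""  # hypothetical field: what would make the code right


def is_blocking(comment):
    """Assumption: anything not prefixed with "nit:" is a blocking comment."""
    return not comment.body.strip().lower().startswith(NIT_PREFIX)


def review_verdict(comments):
    """Return "approve", "request_changes", or "needs_detail"."""
    blocking = [c for c in comments if is_blocking(c)]
    if not blocking:
        return "approve"  # nits alone never hold up the PR
    if any(not c.suggested_fix.strip() for c in blocking):
        return "needs_detail"  # the reviewer must say what would fix it
    return "request_changes"
```

A usage example:

```python
review_verdict([Comment("nit: rename tmp to pending_writes")])  # "approve"
review_verdict([Comment("This feels off")])                     # "needs_detail"
```

Treating a vague blocking comment as its own state keeps the burden on the reviewer and stops the PR from silently stalling.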

Comparison: review styles for remote teams

There is no single "correct" remote review process. There are tradeoffs. Here is how the major patterns stack up for distributed teams.

Review style	Median time-to-merge	Async-compatible	AI-augmentable	Best for
Synchronous Zoom review	2 to 5 days	No	No	Co-located teams that went remote temporarily
Senior-bottleneck async	24 to 48 hours	Partial	Yes	Teams with 1 to 2 trusted seniors and ≤5 engineers
Written-first + AI pass + 2-pizza group	4 to 12 hours	Yes	Yes	Distributed teams from 4 engineers up to 100+
Rubber-stamp (1 thumbs-up)	30 minutes	Yes	No	Solo founders who need a sanity check, not review
Trunk-based with pair programming	Continuous	No	Partial	Co-located teams or 2-engineer pairs in same timezone

The pattern this post describes (written-first plus AI plus 2-pizza group plus approve-with-nits) is the only one that scales past 10 engineers in 3+ timezones without either burning out a senior or shipping bugs.

What to do this week

If you are running a remote engineering team right now and your review process is not working, here is the order of operations to fix it.

First, install CodeRabbit or Greptile on your main repo. Both have free tiers and take under an hour to wire up. Let it review every PR for 2 weeks before you change anything else. This alone will cut your reviewer load by roughly half.

Second, write a 4-line code review policy doc and pin it in your engineering channel: PRs under 400 lines, AI review runs first, max 3 human reviewers, approve with non-blocking nits. Do not negotiate these. If your team pushes back, they are telling you they prefer the old failure modes.

Third, set up CODEOWNERS in GitHub so review assignment is automatic, not social. Pair this with a remote engineering async standup template so the daily flow includes "PRs needing review" as a standing line item.

If you do not have the engineering capacity to run this rollout yourself, the fastest path is to bring on an engineer who has already done it. Every engineer on Cadence is AI-native by default and trained on the written-first review rhythm; you can find your remote engineer in 2 minutes and have the policy doc shipped by Friday.

Cadence ships you a vetted senior on a 48-hour free trial. Weekly billing, replace any week, no notice. If review hygiene is the bottleneck, an embedded senior can fix it in week one and document the policy your team can run forever.

FAQ

How long should a remote code review take?

For PRs under 400 lines, reviewer attention should be 15 to 25 minutes. Total time-to-merge from PR open to merge should be under 24 hours on a healthy distributed team, and under 8 hours if you have the AI-first pass and approve-with-nits patterns in place.
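To see where your own team sits against those thresholds, you can compute median time-to-merge from PR open and merge timestamps. The sketch below assumes you have already exported the timestamps as ISO 8601 strings with explicit UTC offsets; the function names and status labels are illustrative.

```python
from datetime import datetime
from statistics import median

SLA_HOURS = 24    # ceiling for a healthy distributed team
TARGET_HOURS = 8  # target with the AI-first pass and approve-with-nits in place


def hours_to_merge(opened_at, merged_at):
    """Hours between two ISO 8601 timestamps such as 2025-03-03T09:00:00+00:00."""
    opened = datetime.fromisoformat(opened_at)
    merged = datetime.fromisoformat(merged_at)
    return (merged - opened).total_seconds() / 3600


def latency_report(prs):
    """prs: iterable of (opened_at, merged_at) pairs. Returns (median_hours, status)."""
    hours = [hours_to_merge(opened, merged) for opened, merged in prs]
    med = median(hours)
    if med < TARGET_HOURS:
        status = "on target"
    elif med < SLA_HOURS:
        status = "within SLA"
    else:
        status = "over SLA"
    return med, status
```

The median is used rather than the mean so that one PR stuck over a weekend does not distort the picture.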

Should AI code review replace human reviewers?

No. AI catches roughly 60% of the routine feedback (missing tests, null checks, naming, style) but cannot make judgment calls about architecture, product fit, or whether a change belongs in this PR at all. Run AI first to clear the noise, then put a human on the judgment.

What's the right PR size for remote teams?

Cap at 400 lines of changed code, excluding generated files, lockfiles, and tests. Three 350-line PRs that ship in 24 hours each are dramatically better than one 1,050-line PR that takes a week, and the discipline of small PRs forces better architecture.

How do you onboard new engineers to your review process remotely?

Write the 4-line policy doc and have new engineers read it on day one. Then have them review 3 of your existing merged PRs as a training exercise before they author their own. This works async and pairs well with a 1-week remote engineer onboarding plan.

Is pair programming a substitute for code review on remote teams?

Only if your pairs are in the same timezone and pair on every change. For most distributed teams, pair programming is a useful tool for tricky work but not a substitute for async review. The audit trail and the async-completable nature of written review are what make it scale.

Neel Mehta

Co-Founder & COO

15+ years across startups, healthcare, marketing, sales, and IT. NIT Bhopal, Arizona State University. Built and exited companies. Writes on operations and founder-led growth.
