
To write Playwright E2E tests, install @playwright/test, run npx playwright codegen to record your first user flow, then rewrite the generated selectors to use role-based locators with web-first assertions. Add a page-object layer only once you have 10+ tests, reuse authentication via storageState, shard the suite in CI, and capture traces on failure for debugging.
The rest of this guide fills in the working code.
Playwright crossed the 1.50 line in early 2025 and is now the boring default for end-to-end browser testing. It runs Chromium, Firefox, and real WebKit, spins up parallel workers without plugins, ships a first-class trace viewer, and supports both UI and API testing in the same file.
If you are still weighing the choice against an incumbent, our side-by-side breakdown of Cypress vs Playwright in 2026 covers the trade-offs honestly. This post assumes you have already picked Playwright and want to ship a working suite.

A 2026 suite has six moving parts:

| Layer | What it does | When to add it |
|---|---|---|
| @playwright/test runner | Test discovery, parallel workers, reporters | Day 1 |
| Locators (getByRole, etc.) | Stable, accessible selectors | Day 1 |
| Fixtures | Per-test setup and teardown | Day 1 |
| storageState reuse | Skip the login flow on every test | When auth is on the critical path |
| Page objects | Encapsulate flows you call from many tests | After ~10 tests |
| CI sharding | Cut wall-clock time by N | Once the suite passes 5 minutes |

From an empty project (or an existing Next.js, Remix, or Vite app):
```bash
npm init playwright@latest
```
The wizard creates playwright.config.ts, an example tests/example.spec.ts, a GitHub Actions workflow, and downloads the three browsers. Accept TypeScript and the GitHub Actions starter, then run npx playwright test. You should see the example pass in three browsers. If anything fails here, fix it before going further; CI will not fix it for you.
Open playwright.config.ts and set the fields that matter:
```typescript
import { defineConfig, devices } from '@playwright/test';

export default defineConfig({
  testDir: './tests',
  fullyParallel: true,
  retries: process.env.CI ? 2 : 0,
  workers: process.env.CI ? 4 : undefined,
  reporter: [['html', { open: 'never' }], ['list']],
  use: {
    baseURL: process.env.BASE_URL ?? 'http://localhost:3000',
    trace: 'on-first-retry',
    video: 'retain-on-failure',
    screenshot: 'only-on-failure',
  },
  projects: [
    { name: 'chromium', use: devices['Desktop Chrome'] },
    { name: 'firefox', use: devices['Desktop Firefox'] },
    { name: 'webkit', use: devices['Desktop Safari'] },
  ],
  webServer: {
    command: 'npm run dev',
    url: 'http://localhost:3000',
    reuseExistingServer: !process.env.CI,
  },
});
```
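Three settings earn their weight. retries: process.env.CI ? 2 : 0 retries failed tests in CI while keeping local runs strict, so flakes surface on your machine instead of hiding behind a retry. trace: 'on-first-retry' records the full trace (DOM snapshots, network, console) only when a failed test is retried, so it depends on retries being above zero. webServer boots your dev server before tests and tears it down after, so npm test works on a fresh clone with no manual steps.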
The fastest way to your second test is codegen:
```bash
npx playwright codegen http://localhost:3000
```
A browser opens, a recorder window appears, and every click and type lands as TypeScript. Codegen is a great starting point and a terrible ending point. The generated selectors lean too heavily on text and CSS, both of which break the moment a designer rewords a label or a class name shifts.
Refactor every test to follow this priority order:
1. page.getByRole('button', { name: 'Save' })
2. page.getByLabel('Email')
3. page.getByPlaceholder('you@example.com')
4. page.getByText('Welcome back') (only for assertions, not actions)
5. page.getByTestId('user-menu') (only when nothing semantic exists)

Roles work because they map to ARIA semantics, which is what assistive tech reads anyway. A button without a stable role and accessible name is a product bug, not a Playwright problem. Here is a real first test for a sign-up flow with zero CSS selectors:
```typescript
import { test, expect } from '@playwright/test';

test('new user can sign up and reach dashboard', async ({ page }) => {
  await page.goto('/signup');
  await page.getByLabel('Work email').fill(`test+${Date.now()}@cadence.dev`);
  await page.getByLabel('Password').fill('Sup3rSecret!');
  await page.getByRole('button', { name: 'Create account' }).click();
  await expect(page.getByRole('heading', { name: 'Dashboard' })).toBeVisible();
  await expect(page).toHaveURL(/\/app\/dashboard/);
});
```
Notice expect(...).toBeVisible() not expect(await locator.isVisible()).toBe(true). Web-first assertions auto-retry up to the timeout, so you almost never need an explicit waitForSelector or waitForTimeout again.
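To see why the web-first form matters, here is a conceptual sketch of the difference between a one-shot check and a retrying one. It is not Playwright's implementation: `pollUntil`, its options, and `makeLateElement` are hypothetical names used only to illustrate the retry-until-timeout idea.

```typescript
// Conceptual sketch only: NOT Playwright's implementation.
// `pollUntil` is a hypothetical helper that re-runs a check until it
// passes or the timeout elapses, which is the idea behind web-first assertions.
type PollOptions = { timeoutMs: number; intervalMs: number };

async function pollUntil(
  check: () => boolean | Promise<boolean>,
  { timeoutMs, intervalMs }: PollOptions,
): Promise<number> {
  const deadline = Date.now() + timeoutMs;
  let attempts = 0;
  for (;;) {
    attempts += 1;
    if (await check()) return attempts; // condition met: report how many tries it took
    if (Date.now() + intervalMs > deadline) {
      throw new Error(`condition not met within ${timeoutMs}ms (${attempts} attempts)`);
    }
    await new Promise<void>((resolve) => setTimeout(resolve, intervalMs));
  }
}

// A fake "element" that only becomes visible on the Nth look,
// standing in for UI that renders after an async request.
function makeLateElement(visibleOnAttempt: number): () => boolean {
  let looks = 0;
  return () => ++looks >= visibleOnAttempt;
}
```

A one-shot check such as expect(await locator.isVisible()).toBe(true) behaves like calling the check once: with makeLateElement(3) it sees false and fails. pollUntil(makeLateElement(3), { timeoutMs: 1000, intervalMs: 10 }) keeps looking and resolves on the third attempt.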
Page objects are the single most over-engineered piece of every test suite I review. You do not need them on day one. By the time you have ten tests that all start with "log in as admin, navigate to billing", the duplication will tell you it is time. A focused page object looks like this:
```typescript
// tests/poms/billing.ts
import { expect, type Page } from '@playwright/test';

export class BillingPage {
  constructor(private readonly page: Page) {}

  async goto() {
    await this.page.goto('/app/billing');
    await expect(this.page.getByRole('heading', { name: 'Billing' })).toBeVisible();
  }

  async startUpgrade(plan: 'pro' | 'team') {
    await this.page.getByRole('button', { name: `Upgrade to ${plan}` }).click();
  }

  async expectPlan(plan: string) {
    await expect(this.page.getByTestId('current-plan')).toHaveText(plan);
  }
}
```
Two rules. Page objects expose verbs the user would describe ("upgrade to Pro"), not implementation details. Page objects own their own assertions.

Custom fixtures often replace half of what POMs used to do. If every test needs a logged-in BillingPage, make it a fixture:
```typescript
// tests/fixtures.ts
import { test as base } from '@playwright/test';
import { BillingPage } from './poms/billing';

export const test = base.extend<{ billing: BillingPage }>({
  billing: async ({ page }, use) => {
    const billing = new BillingPage(page);
    await billing.goto();
    await use(billing);
  },
});

export { expect } from '@playwright/test';
```
Now any test importing test from ./fixtures gets a ready-to-go billing page. No beforeEach boilerplate.
Logging in through the UI on every test costs three to eight seconds per test. On a 200-test suite, that adds up to roughly 10 to 27 minutes of cumulative login time you do not need. The fix is a setup project that logs in once and writes the cookies and localStorage to disk. Every other test starts already authenticated.
```typescript
// tests/auth.setup.ts
import { test as setup, expect } from '@playwright/test';

const authFile = 'playwright/.auth/user.json';

setup('authenticate', async ({ page }) => {
  await page.goto('/login');
  await page.getByLabel('Email').fill(process.env.TEST_USER_EMAIL!);
  await page.getByLabel('Password').fill(process.env.TEST_USER_PASSWORD!);
  await page.getByRole('button', { name: 'Sign in' }).click();
  await expect(page).toHaveURL(/\/app/);
  await page.context().storageState({ path: authFile });
});
```
Wire it into the config:
```typescript
projects: [
  { name: 'setup', testMatch: /.*\.setup\.ts/ },
  {
    name: 'chromium',
    use: { ...devices['Desktop Chrome'], storageState: 'playwright/.auth/user.json' },
    dependencies: ['setup'],
  },
],
```
For multi-tenant or RBAC apps, run a setup project per role and point each downstream project at its own auth file. If you are designing the permissions layer itself, our guide to role-based access control covers the table model that makes multi-role testing tractable.
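One way to express the per-role setup is to generate the project entries from a list of roles. This is a sketch under assumptions: the role names, the auth.<role>.setup.ts and *.<role>.spec.ts file naming, and the `projectsForRoles` helper are all hypothetical. Only `name`, `testMatch`, `use.storageState`, and `dependencies` are Playwright project options.

```typescript
// Sketch: build one setup project and one browser project per role.
// Role names, file naming, and this helper are illustrative assumptions.
type RoleProject = {
  name: string;
  testMatch: RegExp;
  use?: { storageState: string };
  dependencies?: string[];
};

function authFileFor(role: string): string {
  return `playwright/.auth/${role}.json`;
}

function projectsForRoles(roles: string[]): RoleProject[] {
  const projects: RoleProject[] = [];
  for (const role of roles) {
    // Runs tests/auth.<role>.setup.ts, which logs in and writes authFileFor(role).
    projects.push({
      name: `setup-${role}`,
      testMatch: new RegExp(`auth\\.${role}\\.setup\\.ts$`),
    });
    // Runs specs named *.<role>.spec.ts with that role's saved session.
    projects.push({
      name: `chromium-${role}`,
      testMatch: new RegExp(`\\.${role}\\.spec\\.ts$`),
      use: { storageState: authFileFor(role) },
      dependencies: [`setup-${role}`],
    });
  }
  return projects;
}

const roleProjects = projectsForRoles(['admin', 'member']);
console.log(roleProjects.map((p) => p.name).join(', '));
// setup-admin, chromium-admin, setup-member, chromium-member
```

In playwright.config.ts you would spread the result into projects and merge devices['Desktop Chrome'] into each use block. Generating the entries keeps the auth file path and the dependency name in one place, so adding a role is a one-word change.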
Four advanced patterns that earn their space in a real suite.
Network mocking with route + fulfill. Use it when you want to test the loading state of a slow API, or simulate a 500 from Stripe without touching your real Stripe account:
```typescript
test('shows error toast when checkout fails', async ({ page }) => {
  await page.route('**/api/checkout', async (route) => {
    // Declare the content type so the app parses the mocked body as JSON.
    await route.fulfill({
      status: 500,
      contentType: 'application/json',
      body: JSON.stringify({ error: 'card_declined' }),
    });
  });
  await page.goto('/checkout');
  await page.getByRole('button', { name: 'Pay $99' }).click();
  await expect(page.getByRole('alert')).toContainText('Something went wrong');
});
```
API testing alongside the browser. Playwright ships an HTTP client. You can call your backend directly to set up state, then assert against the UI:
```typescript
test('admin can see new user in users table', async ({ page, request }) => {
  const created = await request.post('/api/users', {
    data: { email: 'new@example.com', role: 'member' },
  });
  expect(created.ok()).toBeTruthy();
  const { id } = await created.json();
  await page.goto('/app/users');
  await expect(page.getByRole('row', { name: /new@example\.com/ })).toBeVisible();
  await request.delete(`/api/users/${id}`);
});
```
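The test above posts a fixed new@example.com, which can collide when several workers, browser projects, or retries run it against the same backend. One hedge is to derive a unique address per call. `uniqueEmail` below is a hypothetical helper; in a real test the index argument could come from testInfo.parallelIndex, and the example.com domain is a placeholder.

```typescript
// Hypothetical helper: build an email that is unique per worker and per call,
// so parallel tests and retries do not fight over the same user record.
let sequence = 0;

function uniqueEmail(prefix: string, parallelIndex: number, now: number = Date.now()): string {
  sequence += 1;
  // The +tag form keeps the address valid while making it unique per call.
  return `${prefix}+w${parallelIndex}-${now}-${sequence}@example.com`;
}

console.log(uniqueEmail('new', 0, 1700000000000)); // new+w0-1700000000000-1@example.com
console.log(uniqueEmail('new', 0, 1700000000000)); // new+w0-1700000000000-2@example.com
```

The row assertion would then look up the generated address, for example getByRole('row', { name: email }), instead of a hard-coded pattern.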
Accessibility tests. Wire @axe-core/playwright into a fixture and run it on every page you visit:
```typescript
import { test, expect } from '@playwright/test';
import AxeBuilder from '@axe-core/playwright';

test('dashboard has no critical a11y violations', async ({ page }) => {
  await page.goto('/app/dashboard');
  const results = await new AxeBuilder({ page }).analyze();
  expect(results.violations.filter(v => v.impact === 'critical')).toEqual([]);
});
```
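If you want the failure output to name the offending rules rather than dump the whole results object, a small formatter helps. This is a sketch: `AxeViolation` is a local type covering only the fields used here (axe-core results carry id, impact, help, and nodes on each violation), and `blockingViolations` and `formatViolations` are hypothetical helpers.

```typescript
type Impact = 'minor' | 'moderate' | 'serious' | 'critical';
type AxeViolation = {
  id: string;
  impact?: Impact | null;
  help: string;
  nodes: { target: unknown[] }[];
};

// Keep only violations whose impact is in the list you treat as blocking.
function blockingViolations(
  violations: AxeViolation[],
  blocking: Impact[] = ['critical'],
): AxeViolation[] {
  return violations.filter((v) => v.impact != null && blocking.indexOf(v.impact) !== -1);
}

// One line per violation: rule id, impact, affected node count, short help text.
function formatViolations(violations: AxeViolation[]): string {
  return violations
    .map((v) => `${v.id} [${v.impact}] x${v.nodes.length}: ${v.help}`)
    .join('\n');
}

const sample: AxeViolation[] = [
  { id: 'color-contrast', impact: 'serious', help: 'Elements must meet contrast ratio', nodes: [{ target: ['.btn'] }] },
  { id: 'label', impact: 'critical', help: 'Form elements must have labels', nodes: [{ target: ['#email'] }, { target: ['#pw'] }] },
];
console.log(formatViolations(blockingViolations(sample)));
// label [critical] x2: Form elements must have labels
```

Inside a test, the formatted string can be passed as the second argument to expect so it shows up as the assertion message. Widening the blocking list to ['serious', 'critical'] is a reasonable next step once the critical count reaches zero.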
Visual regression with toHaveScreenshot. Playwright stores baseline images per project and per OS. Update them with --update-snapshots:
```typescript
test('marketing hero matches baseline', async ({ page }) => {
  await page.goto('/');
  await expect(page.getByRole('region', { name: 'Hero' })).toHaveScreenshot('hero.png', {
    maxDiffPixels: 50,
  });
});
```
Run visual tests on a single browser/OS pair (usually chromium on Linux). Cross-OS snapshots will diff on font rendering forever.
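One way to enforce the single browser/OS rule is to give screenshot specs their own project and keep them out of the others. The sketch below assumes a hypothetical *.visual.spec.ts naming convention and project names; `testMatch` and `testIgnore` are the Playwright project options doing the work, and `projectsFor` is only an illustration of the intended routing, not Playwright's matching code.

```typescript
// Sketch of the `projects` entries only; the naming convention is an assumption.
type ProjectSketch = { name: string; testMatch?: RegExp; testIgnore?: RegExp };

const VISUAL_SPECS = /\.visual\.spec\.ts$/;

const visualAwareProjects: ProjectSketch[] = [
  // Functional tests run everywhere but skip the screenshot specs.
  { name: 'chromium', testIgnore: VISUAL_SPECS },
  { name: 'firefox', testIgnore: VISUAL_SPECS },
  { name: 'webkit', testIgnore: VISUAL_SPECS },
  // Screenshot specs run in exactly one project, so there is one set of baselines.
  { name: 'visual-chromium', testMatch: VISUAL_SPECS },
];

// Illustration of the intended routing: which projects pick up a given file.
function projectsFor(file: string): string[] {
  return visualAwareProjects
    .filter((p) => !(p.testIgnore && p.testIgnore.test(file)))
    .filter((p) => !p.testMatch || p.testMatch.test(file))
    .map((p) => p.name);
}

console.log(projectsFor('tests/hero.visual.spec.ts')); // [ 'visual-chromium' ]
console.log(projectsFor('tests/signup.spec.ts'));      // [ 'chromium', 'firefox', 'webkit' ]
```

Running npx playwright test --project=visual-chromium only on the Linux CI runner, and updating baselines from that same environment, keeps font rendering out of the diff.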
Playwright runs test files in parallel across workers by default. Within a file, tests run serially unless you opt in with test.describe.configure({ mode: 'parallel' }) or set fullyParallel: true, as the config above does. Workers are the knob to tune; four is a sensible default for a 4-core CI runner.
Sharding cuts wall-clock time across machines. A 12-minute suite running on 8 shards finishes in roughly 90 seconds plus per-shard overhead. The flag is --shard=N/M, where M is the total shard count and N is the 1-based index of the current shard. A working GitHub Actions workflow (drop in .github/workflows/e2e.yml):
```yaml
name: E2E
on: [push, pull_request]
jobs:
  test:
    timeout-minutes: 20
    runs-on: ubuntu-latest
    strategy:
      fail-fast: false
      matrix:
        shard: [1/4, 2/4, 3/4, 4/4]
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with: { node-version: 20, cache: npm }
      - run: npm ci
      - run: npx playwright install --with-deps chromium
      - run: npx playwright test --shard=${{ matrix.shard }}
        env:
          BASE_URL: http://localhost:3000
          TEST_USER_EMAIL: ${{ secrets.TEST_USER_EMAIL }}
          TEST_USER_PASSWORD: ${{ secrets.TEST_USER_PASSWORD }}
      - if: failure()
        uses: actions/upload-artifact@v4
        with:
          # matrix.shard contains "/", which artifact names reject, so use the job index.
          name: playwright-report-${{ strategy.job-index }}
          path: playwright-report
          retention-days: 7
```
Three details earn copying. --with-deps chromium installs only one browser (saves a minute per run); if your config still lists firefox and webkit projects, install those too or pass --project=chromium. The failure() artifact upload gives you the HTML report, traces included, for any failed run. The matrix is keyed by shard count, so bumping 4 to 8 is a one-line change. If you are setting up CI from scratch, our GitHub Actions guide for Next.js apps covers the surrounding pieces (typecheck, lint, build cache).
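The sharding estimate from this section is simple division, sketched here so you can plug in your own numbers. `estimateShardSeconds`, `shardMatrix`, and the overhead parameter are hypothetical; the model assumes tests split evenly across shards, which real suites only approximate.

```typescript
// Rough model: total suite time divided evenly across shards, plus a fixed
// per-shard cost (checkout, npm ci, browser install). Real splits are uneven.
function estimateShardSeconds(suiteMinutes: number, shards: number, overheadSeconds = 0): number {
  if (shards < 1 || Math.floor(shards) !== shards) {
    throw new RangeError('shards must be a positive integer');
  }
  return (suiteMinutes * 60) / shards + overheadSeconds;
}

// Expand a shard count into the matrix values passed to --shard=N/M.
function shardMatrix(total: number): string[] {
  const entries: string[] = [];
  for (let index = 1; index <= total; index += 1) {
    entries.push(`${index}/${total}`);
  }
  return entries;
}

console.log(estimateShardSeconds(12, 8));     // 90
console.log(estimateShardSeconds(12, 4, 60)); // 240
console.log(shardMatrix(4).join(', '));       // 1/4, 2/4, 3/4, 4/4
```

The overhead term is why doubling the shard count rarely halves the wall-clock time: once per-shard setup dominates, extra shards mostly add runner minutes.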
1. Run npm init playwright@latest, accept TypeScript and the GitHub Actions starter, then npx playwright test to confirm the example suite passes locally.
2. Run npx playwright codegen http://localhost:3000, record a smoke flow (sign up, create a thing, see it), then refactor every selector to getByRole, getByLabel, or getByTestId.
3. Once flows repeat across roughly ten tests, extract them into a tests/poms/ class with verbs that describe user intent, not DOM structure.
4. Move repeated beforeEach patterns (logged-in user, seeded data, mocked APIs) into custom fixtures so each test reads as a single declaration.
5. Reuse authentication with storageState. Add a setup project that logs in once, saves cookies and localStorage to playwright/.auth/user.json, and depend on it from every other project.
6. Shard the suite in CI with a shard: [1/4, 2/4, 3/4, 4/4] matrix and an artifact upload on failure so you can pull the trace viewer from any failed run.

A handful of patterns cause most flaky Playwright suites:

- waitForTimeout(2000). If you see this in a code review, reject it. Web-first assertions already wait. Use await expect(locator).toBeVisible() or await page.waitForResponse(/\/api\/users/) instead.
- Logging in through the UI in every test. Switch to storageState once you have more than five tests behind auth.
- Tests that share data. Seed each test's own state with request.post or a database fixture.
- Cross-platform visual snapshots. Pin them to one browser/OS pair (chromium on ubuntu-latest).
- Skipping the trace viewer. Download the trace from the failed run, open it with npx playwright show-trace trace.zip, and watch the actual DOM rewind frame by frame. This is the single most useful debugging tool the runner ships.

Two founders pre-revenue with a five-screen MVP do not need an E2E suite. They need ship velocity and a manual smoke checklist. Add Playwright when one of three things is true: real money flows through the app, a regression last week cost more than a day of engineering, or PRs are stepping on each other across more than two engineers.
If you are past that line and the retrofit work is bigger than your team can absorb in a sprint, a Cadence senior engineer at $1,500/week typically delivers a working CI-integrated suite (config, 30-50 tests, fixtures, sharded GitHub Actions) inside two weeks. Every engineer on Cadence is AI-native by default, vetted on Cursor, Claude Code, and Copilot fluency before they unlock bookings, which matters here because codegen-style scaffolding in an AI editor is now half the job.
If you would rather sanity-check the stack before adding tests, run your codebase through Cadence's Ship-or-Skip audit for an honest grade on what to fix first.
Try it. Cadence shortlists vetted engineers in 2 minutes, with a 48-hour free trial and weekly billing you can cancel anytime. If a Playwright rollout is on your roadmap and you do not have headcount, book a senior engineer and have a green CI badge by next Friday.
Is Playwright better than Cypress in 2026? For most teams, yes. Multi-browser including real WebKit, parallel workers without plugins, faster cold starts, and built-in API testing tip the balance. Cypress still wins on the time-travel debugger UI and on developer ergonomics inside the test runner. Pick Cypress only if your team already has it deployed and the migration cost outweighs the gains.
How long does it take to set up Playwright? About 90 seconds to your first passing test, an afternoon to write a usable smoke pack of 10 tests, and 1-2 weeks to wire CI with sharding, fixtures, storageState reuse, and visual regression for a real product. Allow extra time if your app has multi-tenant auth or heavy network setup.
Do I need page objects? Not on day one. Add them once you have 10+ tests and start copy-pasting selectors across files. Custom fixtures often replace half of what POMs traditionally did, especially for "logged in as role X" setup. Use POMs for verb-shaped flows, fixtures for state-shaped setup.
How do I run Playwright in GitHub Actions? Use a matrix of shard values ([1/4, 2/4, 3/4, 4/4]), install browsers with --with-deps chromium, run with --shard=${{ matrix.shard }}, and upload the playwright-report folder as an artifact on failure. The starter from npm init playwright@latest gives you a working baseline you can extend.
Should I record traces and video for every test? No. Set trace: 'on-first-retry' and video: 'retain-on-failure' in playwright.config.ts. You get full trace data for any test that flakes once, plus video on hard failures, without bloating CI artifacts on green runs.
Can Playwright test mobile apps? Not native iOS or Android apps. Playwright emulates mobile browsers (viewport, user agent, touch) through device descriptors on its desktop Chromium and WebKit builds, which approximates Mobile Chrome and Mobile Safari rather than running them on a real device. For native React Native or Flutter, reach for Detox or Maestro instead.