How to write Playwright E2E tests

To write Playwright E2E tests, install @playwright/test, run npx playwright codegen to record your first user flow, then rewrite the generated selectors to use role-based locators with web-first assertions. Add a page-object layer only once you have 10+ tests, reuse authentication via storageState, shard the suite in CI, and capture traces on failure for debugging.

The rest of this guide fills in the working code.

Why Playwright is the 2026 default

Playwright passed version 1.50 in early 2025 and is now the boring default for end-to-end browser testing. It runs Chromium, Firefox, and real WebKit, spins up parallel workers without plugins, ships a first-class trace viewer, and supports both UI and API testing in the same file.

If you are still weighing the choice against an incumbent, our side-by-side breakdown of Cypress vs Playwright in 2026 covers the trade-offs honestly. This post assumes you have already picked Playwright and want to ship a working suite. A 2026 suite has six moving parts:

| Layer | What it does | When to add it |
| --- | --- | --- |
| `@playwright/test` runner | Test discovery, parallel workers, reporters | Day 1 |
| Locators (`getByRole`, etc.) | Stable, accessible selectors | Day 1 |
| Fixtures | Per-test setup and teardown | Day 1 |
| `storageState` reuse | Skip the login flow on every test | When auth is on the critical path |
| Page objects | Encapsulate flows you call from many tests | After ~10 tests |
| CI sharding | Cut wall-clock time by N | Once the suite passes 5 minutes |

Install Playwright and write your first test in 90 seconds

From an empty project (or an existing Next.js, Remix, or Vite app):

npm init playwright@latest

The wizard creates playwright.config.ts, an example tests/example.spec.ts, a GitHub Actions workflow, and downloads the three browsers. Accept TypeScript and the GitHub Actions starter, then run npx playwright test. You should see the example pass in three browsers. If anything fails here, fix it before going further; CI will not fix it for you.

Open playwright.config.ts and set the fields that matter:

import { defineConfig, devices } from '@playwright/test';

export default defineConfig({
  testDir: './tests',
  fullyParallel: true,
  retries: process.env.CI ? 2 : 0,
  workers: process.env.CI ? 4 : undefined,
  reporter: [['html', { open: 'never' }], ['list']],
  use: {
    baseURL: process.env.BASE_URL ?? 'http://localhost:3000',
    trace: 'on-first-retry',
    video: 'retain-on-failure',
    screenshot: 'only-on-failure',
  },
  projects: [
    { name: 'chromium', use: devices['Desktop Chrome'] },
    { name: 'firefox',  use: devices['Desktop Firefox'] },
    { name: 'webkit',   use: devices['Desktop Safari'] },
  ],
  webServer: {
    command: 'npm run dev',
    url: 'http://localhost:3000',
    reuseExistingServer: !process.env.CI,
  },
});

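Three settings earn their weight. retries: process.env.CI ? 2 : 0 retries failed tests in CI only, so local failures surface immediately. trace: 'on-first-retry' records the full trace (DOM snapshots, network, console) the first time a failed test is retried, which is why it depends on retries being non-zero; green runs record nothing. webServer boots your dev server before the tests and tears it down after, so npx playwright test works on a fresh clone with no manual steps.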

Use codegen, then rewrite the locators

The fastest way to your second test is codegen:

npx playwright codegen http://localhost:3000

A browser opens, a recorder window appears, and every click and keystroke lands as TypeScript. Codegen is a great starting point and a terrible ending point. The generated locators often fall back on visible text and CSS, both of which break the moment a designer rewords a label or a class name shifts.

Refactor every test to follow this priority order:

page.getByRole('button', { name: 'Save' })
page.getByLabel('Email')
page.getByPlaceholder('you@example.com')
page.getByText('Welcome back') (only for assertions, not actions)
page.getByTestId('user-menu') (only when nothing semantic exists)

Roles work because they map to ARIA semantics, which is what assistive tech reads anyway. A button without a stable role and accessible name is a product bug, not a Playwright problem. Here is a real first test for a sign-up flow with zero CSS selectors:

import { test, expect } from '@playwright/test';

test('new user can sign up and reach dashboard', async ({ page }) => {
  await page.goto('/signup');

  await page.getByLabel('Work email').fill(`test+${Date.now()}@cadence.dev`);
  await page.getByLabel('Password').fill('Sup3rSecret!');
  await page.getByRole('button', { name: 'Create account' }).click();

  await expect(page.getByRole('heading', { name: 'Dashboard' })).toBeVisible();
  await expect(page).toHaveURL(/\/app\/dashboard/);
});

Note the use of expect(locator).toBeVisible() rather than expect(await locator.isVisible()).toBe(true). Web-first assertions auto-retry until they pass or the assertion timeout expires (5 seconds by default), so you almost never need an explicit waitForSelector or waitForTimeout again.
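To see why the retrying form is less flaky, here is a simplified model of the polling idea behind web-first assertions. It is an illustration only, not Playwright's implementation: expectEventually, fakeElement, and the timeout and interval values are hypothetical names and numbers chosen for the sketch.

```typescript
// Simplified model of a retrying assertion. Not Playwright's real code:
// expectEventually and its default values are hypothetical.
async function expectEventually(
  check: () => boolean | Promise<boolean>,
  timeoutMs = 5000,
  intervalMs = 50,
): Promise<void> {
  const deadline = Date.now() + timeoutMs;
  for (;;) {
    if (await check()) return; // condition met: stop polling
    if (Date.now() >= deadline) {
      throw new Error(`condition not met within ${timeoutMs}ms`);
    }
    // Wait a little, then look again.
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
}

// A fake "element" that becomes visible some time after creation,
// like a heading that renders once a request resolves.
function fakeElement(appearsAfterMs: number) {
  const shownAt = Date.now() + appearsAfterMs;
  return { isVisible: () => Date.now() >= shownAt };
}
```

A one-shot check such as fakeElement(200).isVisible() is false right after creation, which is the flaky case. Awaiting expectEventually(() => el.isVisible()) on the same element resolves once it appears, and only fails if the timeout runs out.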

Add page objects only when you need them

Page objects are the single most over-engineered piece of every test suite I review. You do not need them on day one. By the time you have ten tests that all start with "log in as admin, navigate to billing", the duplication will tell you it is time. A focused page object looks like this:

// tests/poms/billing.ts
import { expect, type Page } from '@playwright/test';

export class BillingPage {
  constructor(private readonly page: Page) {}

  async goto() {
    await this.page.goto('/app/billing');
    await expect(this.page.getByRole('heading', { name: 'Billing' })).toBeVisible();
  }

  async startUpgrade(plan: 'pro' | 'team') {
    await this.page.getByRole('button', { name: `Upgrade to ${plan}` }).click();
  }

  async expectPlan(plan: string) {
    await expect(this.page.getByTestId('current-plan')).toHaveText(plan);
  }
}

Two rules. Page objects expose verbs the user would describe ("upgrade to Pro"), not implementation details. Page objects own their own assertions. Custom fixtures often replace half of what POMs used to do. If every test needs a logged-in BillingPage, make it a fixture:

// tests/fixtures.ts
import { test as base } from '@playwright/test';
import { BillingPage } from './poms/billing';

export const test = base.extend<{ billing: BillingPage }>({
  billing: async ({ page }, use) => {
    const billing = new BillingPage(page);
    await billing.goto();
    await use(billing);
  },
});

export { expect } from '@playwright/test';

Now any test importing test from ./fixtures gets a ready-to-go billing page. No beforeEach boilerplate.

Reuse authentication with storageState

Logging in through the UI on every test costs three to eight seconds per test. On a 200-test suite, that adds up to between 10 and 27 minutes of worker time spent on a login form you do not need to exercise. The fix is a setup project that logs in once and writes the cookies and localStorage to disk. Every other test starts already authenticated.

// tests/auth.setup.ts
import { test as setup, expect } from '@playwright/test';

const authFile = 'playwright/.auth/user.json';

setup('authenticate', async ({ page }) => {
  await page.goto('/login');
  await page.getByLabel('Email').fill(process.env.TEST_USER_EMAIL!);
  await page.getByLabel('Password').fill(process.env.TEST_USER_PASSWORD!);
  await page.getByRole('button', { name: 'Sign in' }).click();
  await expect(page).toHaveURL(/\/app/);
  await page.context().storageState({ path: authFile });
});

Wire it into the config:

projects: [
  { name: 'setup', testMatch: /.*\.setup\.ts/ },
  {
    name: 'chromium',
    use: { ...devices['Desktop Chrome'], storageState: 'playwright/.auth/user.json' },
    dependencies: ['setup'],
  },
],

For multi-tenant or RBAC apps, run a setup project per role and point each downstream project at its own auth file. If you are designing the permissions layer itself, our guide to role-based access control covers the table model that makes multi-role testing tractable.
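A minimal sketch of the per-role layout, assuming one setup file and one auth file per role. The names roleProjects, authFileFor, and ProjectSketch, and the naming convention (tests/admin.setup.ts writing playwright/.auth/admin.json), are hypothetical; only the name, testMatch, use.storageState, and dependencies fields come from Playwright's project config.

```typescript
// Subset of Playwright's project config fields used in this sketch.
interface ProjectSketch {
  name: string;
  testMatch?: RegExp;
  use?: { storageState: string };
  dependencies?: string[];
}

// Hypothetical convention: each role gets its own auth file on disk.
const authFileFor = (role: string): string => `playwright/.auth/${role}.json`;

// For every role, emit a setup project plus a browser project that
// depends on it and starts from that role's saved storage state.
function roleProjects(roles: string[]): ProjectSketch[] {
  const projects: ProjectSketch[] = [];
  for (const role of roles) {
    projects.push({
      name: `setup-${role}`,
      testMatch: new RegExp(`${role}\\.setup\\.ts$`),
    });
    projects.push({
      name: `chromium-${role}`,
      use: { storageState: authFileFor(role) },
      dependencies: [`setup-${role}`],
    });
  }
  return projects;
}

console.log(roleProjects(['admin', 'member']).map((p) => p.name));
```

The returned objects can be spread into the projects array of playwright.config.ts, merging in devices['Desktop Chrome'] for the browser projects. You would still need a testMatch or grep rule per browser project to decide which specs run under which role.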

Network mocking, API testing, accessibility, visual

Four advanced patterns that earn their space in a real suite.

Network mocking with route + fulfill. When you want to test the loading state of a slow API, or simulate a 500 from Stripe without crashing your real Stripe account:

test('shows error toast when checkout fails', async ({ page }) => {
  await page.route('**/api/checkout', async (route) => {
    // json sets the body and the application/json content type together
    await route.fulfill({ status: 500, json: { error: 'card_declined' } });
  });

  await page.goto('/checkout');
  await page.getByRole('button', { name: 'Pay $99' }).click();

  await expect(page.getByRole('alert')).toContainText('Something went wrong');
});
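The slow-API case mentioned above works the same way, except the handler waits and then lets the request through. A minimal sketch, assuming a hypothetical helper named delayed; RouteLike is a local stand-in that declares only the one Route method the helper calls, so the snippet can be exercised without a browser.

```typescript
// Minimal structural type: just the Route method this helper needs.
// Playwright's Route object satisfies it.
interface RouteLike {
  continue(): Promise<void>;
}

// Hypothetical helper: returns a route handler that holds the request
// for delayMs, then forwards it to the real backend unchanged.
function delayed(delayMs: number) {
  return async (route: RouteLike): Promise<void> => {
    await new Promise((resolve) => setTimeout(resolve, delayMs));
    await route.continue();
  };
}
```

Inside a test this would be registered with await page.route('**/api/checkout', delayed(1500)) before clicking the pay button, followed by an assertion on whatever loading indicator the app renders (a progressbar role, for example, which is an assumption about the UI).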

API testing alongside the browser. Playwright ships an HTTP client. You can call your backend directly to set up state, then assert against the UI:

test('admin can see new user in users table', async ({ page, request }) => {
  const created = await request.post('/api/users', {
    data: { email: 'new@example.com', role: 'member' },
  });
  expect(created.ok()).toBeTruthy();
  const { id } = await created.json();

  await page.goto('/app/users');
  await expect(page.getByRole('row', { name: /new@example\.com/ })).toBeVisible();

  await request.delete(`/api/users/${id}`);
});

Accessibility tests. Run @axe-core/playwright against the pages that matter, and move the scan into a fixture once several tests repeat it:

import { test, expect } from '@playwright/test';
import AxeBuilder from '@axe-core/playwright';

test('dashboard has no critical a11y violations', async ({ page }) => {
  await page.goto('/app/dashboard');
  const results = await new AxeBuilder({ page }).analyze();
  expect(results.violations.filter(v => v.impact === 'critical')).toEqual([]);
});
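Comparing whole violation objects against an empty array produces a noisy failure message. One option, sketched here under assumptions, is a small pure helper that keeps only the impact levels you want to block on and reduces each violation to a short label. The helper name blockingViolations is hypothetical; the id and impact fields mirror the shape of axe results.

```typescript
// Subset of the axe result shape used by this sketch.
type Impact = 'minor' | 'moderate' | 'serious' | 'critical';
interface Violation {
  id: string;
  impact?: Impact | null;
}

// Hypothetical helper: keep violations at the chosen impact levels and
// render each one as "rule-id (impact)" for a readable failure message.
function blockingViolations(
  violations: Violation[],
  levels: Impact[] = ['critical'],
): string[] {
  return violations
    .filter((v) => v.impact != null && levels.includes(v.impact))
    .map((v) => `${v.id} (${v.impact})`);
}

const sample: Violation[] = [
  { id: 'color-contrast', impact: 'serious' },
  { id: 'image-alt', impact: 'critical' },
  { id: 'region', impact: 'moderate' },
];
console.log(blockingViolations(sample)); // [ 'image-alt (critical)' ]
```

In the test above, the final line would become expect(blockingViolations(results.violations)).toEqual([]). Passing ['serious', 'critical'] as the second argument tightens the gate when the team is ready.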

Visual regression with toHaveScreenshot. Playwright stores baseline images per project and per platform, so each browser and OS gets its own file. Update them with --update-snapshots:

test('marketing hero matches baseline', async ({ page }) => {
  await page.goto('/');
  await expect(page.getByRole('region', { name: 'Hero' })).toHaveScreenshot('hero.png', {
    maxDiffPixels: 50,
  });
});

Run visual tests on a single browser/OS pair (usually chromium on Linux). Cross-OS snapshots will diff on font rendering forever.

Parallelism, sharding, and CI

Playwright runs test files in parallel across workers by default. Within a file, tests run in order in a single worker unless you set fullyParallel: true (as the config above does) or opt in per file with test.describe.configure({ mode: 'parallel' }). Workers are the knob to tune; four is a sensible default for a 4-core CI runner.

Sharding cuts wall-clock time across machines. A 12-minute suite running on 8 shards finishes in roughly 90 seconds plus per-shard overhead. The flag is --shard=N/M where M is the shard count and N is the index. A working GitHub Actions workflow (drop in .github/workflows/e2e.yml):

name: E2E
on: [push, pull_request]

jobs:
  test:
    timeout-minutes: 20
    runs-on: ubuntu-latest
    strategy:
      fail-fast: false
      matrix:
        shard: [1/4, 2/4, 3/4, 4/4]
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with: { node-version: 20, cache: npm }
      - run: npm ci
      - run: npx playwright install --with-deps chromium
      # Only chromium is installed above, so restrict the run to that project.
      - run: npx playwright test --project=chromium --shard=${{ matrix.shard }}
        env:
          BASE_URL: http://localhost:3000
          TEST_USER_EMAIL: ${{ secrets.TEST_USER_EMAIL }}
          TEST_USER_PASSWORD: ${{ secrets.TEST_USER_PASSWORD }}
      - if: failure()
        uses: actions/upload-artifact@v4
        with:
          # Artifact names cannot contain "/", so use the zero-based job index, not 1/4.
          name: playwright-report-${{ strategy.job-index }}
          path: playwright-report
          retention-days: 7

Three details earn copying. --with-deps chromium installs only one browser (saves a minute per run). The failure() artifact upload gives you the trace viewer download for any failed run. The matrix is keyed by shard count, so bumping 4 to 8 is a one-line change. If you are setting up CI from scratch, our GitHub Actions guide for Next.js apps covers the surrounding pieces (typecheck, lint, build cache).
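The shard arithmetic from this section can be sketched as two small functions. Both names are hypothetical, and the estimate assumes tests split evenly across shards, which real suites only approximate; overheadMinutes (checkout, npm ci, browser install) is a number you would measure yourself.

```typescript
// Rough wall-clock estimate for a sharded run, assuming an even split.
function shardedWallClockMinutes(
  suiteMinutes: number,
  shardCount: number,
  overheadMinutes = 0,
): number {
  if (!Number.isInteger(shardCount) || shardCount < 1) {
    throw new RangeError('shardCount must be a positive integer');
  }
  return suiteMinutes / shardCount + overheadMinutes;
}

// Build the matrix entries in the N/M form that --shard expects
// (N is the 1-based shard index, M is the shard count).
function shardMatrix(shardCount: number): string[] {
  return Array.from({ length: shardCount }, (_, i) => `${i + 1}/${shardCount}`);
}

console.log(shardedWallClockMinutes(12, 8)); // 1.5, the 90 seconds quoted earlier
console.log(shardMatrix(4).join(', ')); // 1/4, 2/4, 3/4, 4/4
```

Adding a measured overhead makes the diminishing returns visible: with one minute of per-shard setup, eight shards take about 2.5 minutes rather than 1.5.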

Steps

Install Playwright. Run npm init playwright@latest, accept TypeScript and the GitHub Actions starter, then npx playwright test to confirm the example suite passes locally.
Write your first test with codegen. Run npx playwright codegen http://localhost:3000, record a smoke flow (sign up, create a thing, see it), then refactor every selector to getByRole, getByLabel, or getByTestId.
Add page objects after ten tests. When you start copy-pasting selectors across files, extract the duplicated flows into a tests/poms/ class with verbs that describe user intent, not DOM structure.
Convert repeated setup into fixtures. Move beforeEach patterns (logged-in user, seeded data, mocked APIs) into custom fixtures so each test reads as a single declaration.
Reuse auth with storageState. Add a setup project that logs in once, saves cookies and localStorage to playwright/.auth/user.json, and depend on it from every other project.
Wire CI with sharding and trace upload. Add the GitHub Actions YAML above with a shard: [1/4, 2/4, 3/4, 4/4] matrix and an artifact upload on failure so you can pull the trace viewer from any failed run.

Common pitfalls

A handful of patterns cause 90% of flaky Playwright suites:

waitForTimeout(2000). If you see this in a code review, reject it. Web-first assertions already wait. Use await expect(locator).toBeVisible() or await page.waitForResponse(/\/api\/users/) instead.
Logging in through the UI on every test. Use storageState once you have more than five tests behind auth.
Sharing data between tests. Tests must be independent. If test A creates a user that test B needs, B fails whenever it runs before A, in another worker, or in another shard. Each test should seed its own data via request.post or a database fixture.
Snapshotting cross-OS. Visual diffs from font rendering will haunt you. Lock snapshots to one project (chromium on ubuntu-latest).
Ignoring traces. When CI fails, download the artifact, run npx playwright show-trace trace.zip, and watch the actual DOM rewind frame by frame. This is the single most useful debugging tool the runner ships.
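For the shared-data pitfall above, a common approach is to generate unique values per test so parallel runs never collide on the same record. A minimal sketch, assuming a hypothetical helper named uniqueEmail; the default prefix and domain are placeholders.

```typescript
// Counter is per worker process; the timestamp and random suffix
// keep values distinct across workers and shards as well.
let counter = 0;

// Hypothetical helper: returns a different address on every call.
function uniqueEmail(prefix = 'test', domain = 'example.com'): string {
  counter += 1;
  const random = Math.random().toString(36).slice(2, 8);
  return `${prefix}+${Date.now()}-${counter}-${random}@${domain}`;
}

console.log(uniqueEmail('member'));
```

In a seeding call this would replace a hard-coded address, for example passing email: uniqueEmail('member') in the data of request.post('/api/users', ...), so two tests creating users at the same moment do not trip a uniqueness constraint.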

When you can skip Playwright entirely

Two founders pre-revenue with a five-screen MVP do not need an E2E suite. They need ship velocity and a manual smoke checklist. Add Playwright when one of three things is true: real money flows through the app, a regression last week cost more than a day of engineering, or PRs are stepping on each other across more than two engineers.

If you are past that line and the retrofit work is bigger than your team can absorb in a sprint, a Cadence senior engineer at $1,500/week typically delivers a working CI-integrated suite (config, 30-50 tests, fixtures, sharded GitHub Actions) inside two weeks. Every engineer on Cadence is AI-native by default, vetted on Cursor, Claude Code, and Copilot fluency before they unlock bookings, which matters here because codegen-style scaffolding in an AI editor is now half the job.

If you would rather sanity-check the stack before adding tests, run your codebase through Cadence's Ship-or-Skip audit for an honest grade on what to fix first.

Try it. Cadence shortlists vetted engineers in 2 minutes, with a 48-hour free trial and weekly billing you can cancel anytime. If a Playwright rollout is on your roadmap and you do not have headcount, book a senior engineer and have a green CI badge by next Friday.

FAQ

Is Playwright better than Cypress in 2026?

For most teams, yes. Multi-browser including real WebKit, parallel workers without plugins, faster cold starts, and built-in API testing tip the balance. Cypress still wins on the time-travel debugger UI and on developer ergonomics inside the test runner. Pick Cypress only if your team already has it deployed and the migration cost outweighs the gains.

How long does it take to set up a Playwright suite?

About 90 seconds to your first passing test, an afternoon to write a usable smoke pack of 10 tests, and 1-2 weeks to wire CI with sharding, fixtures, storageState reuse, and visual regression for a real product. Allow extra time if your app has multi-tenant auth or heavy network setup.

Should I use page objects?

Not on day one. Add them once you have 10+ tests and start copy-pasting selectors across files. Custom fixtures often replace half of what POMs traditionally did, especially for "logged in as role X" setup. Use POMs for verb-shaped flows, fixtures for state-shaped setup.

How do I run Playwright in GitHub Actions?

Use a matrix with shard indexes ([1/4, 2/4, 3/4, 4/4]), install browsers with --with-deps chromium, run with --shard=${{ matrix.shard }}, and upload the playwright-report folder as an artifact on failure. The starter from npm init playwright@latest gives you a working baseline you can extend.

Do I need to record videos for every test?

No. Set trace: 'on-first-retry' and video: 'retain-on-failure' in playwright.config.ts. You get full trace data for any test that flakes once, plus video on hard failures, without bloating CI artifacts on green runs.

Can Playwright test mobile apps?

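Not native iOS or Android apps. Playwright emulates mobile browsers through device descriptors such as devices['iPhone 13'], which set the viewport, user agent, and touch support on its desktop Chromium and WebKit builds; it does not drive Mobile Safari on a real device. For native React Native or Flutter apps, reach for Detox or Maestro instead.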
