The original Cadence engineer interview was text-based. Three tasks: build something with Cursor, review a PR, design a prompt ladder. Submit code plus writing. Claude grades. It worked, but it had three structural problems: gameability, weak signal for communication, and high drop-off.
We replaced it with voice. Three prompts, one recording, 1 to 3 minutes total. Claude listens to the audio (it now supports audio input directly) and grades on AI-native fluency, communication, technical depth, and culture fit. Correlation with founder ratings rose from 1.8x to 3.2x. The false-positive rate dropped from 14% to 6%. Drop-off fell 47 points.
This post is about why voice works better, what we got wrong with text, and what the next version of the interview looks like.
The text interview ran for 8 months. We graded thousands of submissions. Three problems showed up consistently:
Gameability. A candidate could spend 4 hours polishing what was supposed to be a 1-hour task. Strong candidates with limited time submitted reasonable work; less-strong candidates with more time submitted polished work. The interview rewarded time investment, not actual skill.
Weak signal for communication. Text doesn't capture how the engineer thinks under time pressure or whether they can talk to a founder. The whole point of an AI-native engineer is that they can have a real conversation with a non-technical founder; text submissions don't show that.
High drop-off. The text interview took ~60 minutes to complete. About 65% of candidates abandoned it midway. The funnel was leaky in a way that selected for candidates with patience and free time, not for the skills we cared about.
The 14% false-positive rate (candidates who passed the interview but got poor founder ratings) was the killer metric. One in seven engineers we approved turned out to be wrong picks. Founders absorbed the cost.
Three prompts, one recording. The candidate speaks for 1 to 3 minutes and submits.
Prompt 1. Walk me through a recent feature you built using AI tools. What did you delegate to AI vs do yourself?
Prompt 2. If a founder gave you a vague spec like "build a Stripe-like dashboard," how would you approach it? Use specific tools and prompts.
Prompt 3. What's a time AI gave you the wrong answer? How did you catch it?
These three test the five AI-native traits: tool fluency, prompt-as-spec discipline, verification habit, multi-step thinking, and human-in-the-loop instincts.
Claude transcribes the audio and grades on four dimensions: AI-native fluency, communication, technical depth, and culture fit. Scoring 50/100 or above unlocks the platform; 90+ unlocks senior and lead tiers.
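As a sketch of the gating logic, the thresholds and dimension names come from the rubric above; the equal weighting is a hypothetical illustration, not the production formula:

```python
from dataclasses import dataclass

@dataclass
class InterviewScore:
    """Hypothetical score record; dimension names follow the rubric above."""
    ai_native_fluency: int  # 0-100
    communication: int      # 0-100
    technical_depth: int    # 0-100
    culture_fit: int        # 0-100

    @property
    def overall(self) -> float:
        # Illustrative equal weighting; the real rubric may weight differently.
        return (self.ai_native_fluency + self.communication
                + self.technical_depth + self.culture_fit) / 4

def tier(score: InterviewScore) -> str:
    """50/100 unlocks the platform; 90+ unlocks senior and lead tiers."""
    if score.overall >= 90:
        return "senior/lead"
    if score.overall >= 50:
        return "platform"
    return "not passed"

print(tier(InterviewScore(92, 95, 90, 88)))
```

The tier function is a pure threshold check, so the same record can be re-scored against a revised rubric without touching the gating code.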
Three structural differences between voice and text interview signal:
Voice exposes thinking pace. The candidate can't pause for 30 minutes mid-sentence. The flow of the answer reveals how they actually think about a problem. Engineers who've shipped real AI-native work answer the prompts in ~30 seconds with specifics. Engineers who've memorized AI buzzwords pause and stutter.
Voice exposes communication style. A founder hiring an engineer cares whether they can explain technical work clearly. Voice surfaces this immediately. Some engineers who are technically strong are catastrophic communicators; we need to know that before they take a booking.
Voice is harder to fake. A candidate can ask ChatGPT to write a polished text answer. They can't (yet) ask ChatGPT to deliver a polished spoken answer. Voice creates a real-time signal that's much harder to game with AI assistance.
Six months of voice interview vs the previous text version:
| Metric | Text interview | Voice interview | Delta |
|---|---|---|---|
| Correlation with founder ratings | 1.8x | 3.2x | +78% |
| False-positive rate | 14% | 6% | -8 pts |
| Average completion time | ~60 min | ~5 min | -92% |
| Drop-off rate | 65% | 18% | -47 pts |
| Engineer-reported retake-after-7-days success rate | 26% | 38% | +12 pts |
The 3.2x correlation with founder ratings is the most important number. The interview is supposed to predict whether the engineer will get good founder ratings on real bookings. Text correlated weakly; voice correlates strongly.
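The correlation check behind that number can be sketched with a plain Pearson computation; the score/rating pairs below are made-up illustrations, not Cadence data:

```python
import math

def pearson(xs: list[float], ys: list[float]) -> float:
    """Pearson correlation coefficient, no external dependencies."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Illustrative (made-up) pairs: interview score vs average founder rating (1-5).
interview = [42, 55, 61, 70, 78, 85, 91]
founder = [2.1, 2.8, 3.0, 3.6, 3.9, 4.4, 4.7]
print(round(pearson(interview, founder), 3))
```

Run over real booking outcomes, the same function answers the only question the interview exists for: does a higher score predict a happier founder?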
The 47-point reduction in drop-off was unexpected. We thought voice would be intimidating ("I have to record myself!"). It was the opposite: 5 minutes of speaking is dramatically less daunting than 60 minutes of writing. Strong candidates finish the interview the day they sign up; the weakest 18% abandon, which is probably the right filter.
Voice catches a lot but it doesn't catch everything. Three gaps we're aware of:
Long-form judgment. Voice answers are 30-90 seconds each. They can't surface the kind of multi-day judgment that distinguishes a senior from a mid-tier engineer. We pick up some of this in the daily-rating loop after the engineer takes their first booking, but the voice interview alone doesn't fully filter.
Rare-skill specifics. A voice interview asks generic questions; it can't probe a candidate's specific Postgres expertise or their Rust async-runtime knowledge. The matching algorithm handles skill-specific filtering downstream, but the voice interview is generic.
Cultural alignment with specific founders. Some engineer-founder pairs work, some don't, in ways neither party predicts in advance. The 30-minute intro call after matching is what catches this. The voice interview is necessary but not sufficient.
We're filling these gaps with downstream signals (intro calls, daily ratings, re-booking rates) rather than trying to make the voice interview do too much.
We're piloting a public-repo deep analysis as a complement to the voice interview. The shape: authorship verification, complexity gates, LLM feature extraction, and percentile mapping.
Engineers who pass the repo gate go to a comms interview (English IELTS-7-equivalent threshold). Engineers who pass both unlock the platform.
The voice interview becomes a smaller part of the screen. The repo signal is harder to fake (the GitHub history is public and verifiable); the comms gate is more objective (pass/fail at a fluency threshold).
This is in design as of 2026-Q2. We'll update the rubric here when it ships.
Three principles we'd recommend to any team designing an AI-native interview:
Test for AI-native skills explicitly. The traditional interview filters (algorithms, system design at scale, leetcode) test for skills AI now does in seconds. They no longer correlate with shipping. Replace them with prompts that specifically probe tool fluency, prompt-as-spec discipline, and verification habits.
Use voice or video, not text alone. Text is gameable with AI assistance. Voice exposes pace and communication style. Video adds visual cues but doesn't add much over voice for technical assessment.
Keep it short. A 5-minute interview that selects accurately beats a 60-minute interview that selects noisily. The signal-to-time ratio matters more than the time investment.
*Will AI replace software developers* covers the broader market shift in interview design. The short version: companies are quietly redoing their hiring loops to filter for AI-native fluency. The platforms (and companies) that don't update their interviews will keep mis-hiring.
See the voice interview yourself. Engineers can sign up and complete the voice interview in 5 minutes. Founders can book on Cadence and the voice-interview screening is already done for every matched candidate.
Five AI-native traits: tool fluency (Cursor / Claude / Copilot daily use), prompt-as-spec discipline, verification habits, multi-step prompt ladders, and human-in-the-loop instincts. These feed the AI-native fluency dimension; communication, technical depth, and culture fit round out the four grading dimensions. 50/100 unlocks the platform; 90+ unlocks senior and lead tiers.
1 to 3 minutes for the recording itself. Total time including instructions and submission is ~5 minutes. Most candidates finish on their first sitting; we don't gate this behind appointments.
Yes, after 7 days. Retake-after-7-days success rate is 38%; engineers who study the rubric and try again with sharper preparation often pass.
Claude has direct audio input. It transcribes (which we use for the comms-quality signal) and listens to the actual delivery (pace, clarity, follow-through). The grading prompt provides the rubric; Claude returns a structured score across the four dimensions plus written feedback.
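A minimal sketch of the structured-grading step. It assumes the model is prompted to return a JSON object keyed by the four dimensions; the response string below is a hypothetical example, standing in for whatever API call actually carries the audio:

```python
import json

RUBRIC_PROMPT = """Grade this interview recording on four dimensions,
each 0-100: ai_native_fluency, communication, technical_depth, culture_fit.
Return only a JSON object with those keys plus a "feedback" string."""

def parse_grade(raw: str) -> dict:
    """Validate the model's structured response before storing it."""
    grade = json.loads(raw)
    dims = ["ai_native_fluency", "communication",
            "technical_depth", "culture_fit"]
    for d in dims:
        if not (0 <= grade[d] <= 100):
            raise ValueError(f"{d} out of range: {grade[d]}")
    return grade

# Hypothetical model output, for illustration only.
raw = ('{"ai_native_fluency": 72, "communication": 81, '
       '"technical_depth": 68, "culture_fit": 75, '
       '"feedback": "Specific tool examples; verification habit unclear."}')
print(parse_grade(raw)["communication"])
```

Validating the JSON at the boundary means a malformed or out-of-range grade fails loudly instead of silently passing a candidate.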
We tested it. Coding tests in interviews (live or take-home) correlate weakly with shipping ability in the AI-native era; AI does the coding in seconds. The skill that correlates is the engineer's judgment, communication, and verification habits, which voice surfaces faster.
A public-repo deep analysis layered on top of the voice interview. Authorship verification, complexity gates, LLM feature extraction, percentile mapping. In design as of 2026-Q2. The repo signal is harder to fake than voice and provides specific evidence rather than self-reported fluency.
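The four stages named above can be sketched as a pipeline skeleton; every check, field name, and threshold here is a hypothetical placeholder, since the real gates are still in design:

```python
from bisect import bisect_right

def percentile(value: float, pool: list[float]) -> float:
    """Map a raw feature score to its percentile within a reference pool."""
    ranked = sorted(pool)
    return 100.0 * bisect_right(ranked, value) / len(ranked)

def analyze_repo(repo: dict, pool: list[float]) -> dict:
    """Skeleton of the four stages; each check is a stand-in, not the real gate."""
    if not repo.get("commits_signed_by_candidate"):  # authorship verification
        return {"pass": False, "reason": "authorship unverified"}
    if repo.get("loc", 0) < 1000:                    # complexity gate
        return {"pass": False, "reason": "below complexity gate"}
    score = repo.get("llm_feature_score", 0.0)       # LLM feature extraction
    pct = percentile(score, pool)                    # percentile mapping
    return {"pass": pct >= 50, "percentile": pct}

pool = [0.2, 0.4, 0.5, 0.6, 0.8]  # illustrative reference scores
print(analyze_repo({"commits_signed_by_candidate": True, "loc": 5000,
                    "llm_feature_score": 0.7}, pool))
```

Ordering the cheap, verifiable gates (authorship, complexity) before the expensive LLM stage keeps the pipeline fast and the failure reasons specific.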