Apr 1, 2026 · 9 min read · Cadence Editorial

How our matching algorithm scores 12,800 engineers in 80ms

When a founder books on Cadence, the matching algorithm scores every available engineer in real time and returns the top 4. Today the pool is 12,800 engineers. The pipeline runs in ~80ms p99. The score is a deterministic function of skills, rate fit, timezone overlap, ratings, and utilization. No human recruiter parses your spec. The signal is the math.

This post is the technical breakdown: how the algorithm works, why each weight is what it is, and the trade-offs we made deliberately.

The score formula

The match score is a weighted sum, normalized 0 to 100:

| Dimension | Weight | What it measures |
| --- | --- | --- |
| Skill fit | 40% | Jaccard similarity over normalized skill sets |
| Rate fit | 20% | Piecewise function of engineer rate vs. founder budget |
| Timezone overlap | 15% | Real working-hour overlap, not coarse timezone offset |
| Engineer rating | 15% | Average founder rating from past bookings |
| Utilization | 10% | Brake against busy engineers, fairness bonus for new ones |
| Vertical bonus | up to +5 | Match between founder vertical and engineer's history |

The weights were calibrated against 800+ engagements over the last six months. The calibration process is straightforward: rerun historical bookings with various weight permutations, score against the actual founder-pick outcome, iterate. The current weights are the configuration that maximizes the agreement between the algorithm's top-4 and the founder's eventual choice.
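For intuition, here's a minimal sketch of that weighted sum in TypeScript. The field names and the cap at 100 after the vertical bonus are illustrative assumptions, not the actual lib/matching.ts API; each dimension scorer is assumed to return a value from 0 to 100.

```ts
// Sketch of the composite score. Names are illustrative, not the real API.
// Each dimension scorer is assumed to return 0-100.
const WEIGHTS = {
  skill: 0.4,
  rate: 0.2,
  timezone: 0.15,
  rating: 0.15,
  utilization: 0.1,
} as const;

function matchScore(dims: {
  skill: number;
  rate: number;
  timezone: number;
  rating: number;
  utilization: number;
  verticalBonus: number; // 0..5, applied after the weighted sum
}): number {
  const weighted =
    dims.skill * WEIGHTS.skill +
    dims.rate * WEIGHTS.rate +
    dims.timezone * WEIGHTS.timezone +
    dims.rating * WEIGHTS.rating +
    dims.utilization * WEIGHTS.utilization;
  // The weighted sum is already on a 0-100 scale; capping after the
  // vertical bonus is an assumption.
  return Math.min(100, weighted + dims.verticalBonus);
}
```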

Skill fit (40%)

Skill fit uses Jaccard similarity. Skills are normalized to a canonical set; "ReactJS", "react.js", and "React" all resolve to the same node. The score is |A ∩ B| / |A ∪ B| between the founder's required skills and the engineer's claimed skills.

A nuance: required skills are weighted higher than nice-to-haves. A founder who writes "React, Postgres, Stripe (nice to have)" is matched primarily on the first two. Stripe-only matches don't surface high in the ranking unless the founder's spec is heavy on Stripe.

The Jaccard choice is deliberate. We tried cosine similarity over an embedding (better at semantic adjacency: "React" implies likely "Next.js" knowledge) but it produced too many false positives at the long tail. Jaccard's strict set-equality is conservative and less noisy.
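A rough sketch of the skill-fit computation, assuming skills have already been normalized to canonical names. The 80/20 split between required and nice-to-have skills is an illustrative assumption; the production weighting may differ.

```ts
// Plain Jaccard similarity over two normalized skill sets.
function jaccard(a: Set<string>, b: Set<string>): number {
  if (a.size === 0 && b.size === 0) return 0;
  let intersection = 0;
  for (const skill of a) if (b.has(skill)) intersection++;
  const union = a.size + b.size - intersection;
  return intersection / union;
}

// Required skills dominate; nice-to-haves only nudge the score.
// The 0.8 / 0.2 weighting is an assumption for illustration.
function skillFit(
  required: Set<string>,
  niceToHave: Set<string>,
  engineer: Set<string>,
): number {
  const core = jaccard(required, engineer);
  const extra = niceToHave.size > 0 ? jaccard(niceToHave, engineer) : 0;
  return Math.round((0.8 * core + 0.2 * extra) * 100); // 0-100
}
```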

Rate fit (20%)

Rate compatibility is a piecewise function: 100 points if the engineer's rate equals the founder's budget; a gentle taper when the engineer is below budget (budget $1,500, engineer $1,000 = 90 points; the founder comes in under budget, but the engineer's tier is below what the budget assumed); a harsh penalty when the engineer is above budget (engineer $2,000, budget $1,500 = 30 points; the math just doesn't work).

The asymmetry (gentle below, harsh above) reflects the asymmetric cost of a mismatch. A founder with a $1,500 budget matched with a $1,000 engineer pays less than planned; that miss is tolerable. A founder with a $1,500 budget matched with an engineer who insists on $2,000 is going to negotiate or churn.

Cadence's locked pricing tiers (junior $500, mid $1,000, senior $1,500, lead $2,000 weekly) make this calculation cleaner; engineers self-select tier and the founder sees the rate before they book.
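A sketch of the piecewise shape, with slopes chosen only to reproduce the two examples above; the exact taper and penalty curves are illustrative, not the production constants.

```ts
// Sketch of the piecewise rate-fit score. Slopes are assumptions tuned to
// match the examples in the post: (1000 vs 1500 budget) ≈ 90,
// (2000 vs 1500 budget) ≈ 30.
function rateFit(engineerRate: number, founderBudget: number): number {
  if (engineerRate === founderBudget) return 100;
  if (engineerRate < founderBudget) {
    // Gentle taper below budget.
    const gap = (founderBudget - engineerRate) / founderBudget; // 0..1
    return Math.round(100 - 30 * gap);
  }
  // Harsh penalty above budget.
  const overshoot = (engineerRate - founderBudget) / founderBudget; // > 0
  return Math.max(0, Math.round(100 - 210 * overshoot));
}

// rateFit(1000, 1500) -> 90   rateFit(2000, 1500) -> 30
```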

Timezone overlap (15%)

The naive way: compute the time-zone offset between founder and engineer and score on the nominal overlap that offset implies, assuming everyone works 9-5 local. This is what most platforms do. It's wrong for asynchronous remote work.

The right way: compute the working-hour overlap. A US-Pacific founder working 10am-6pm has zero overlap with a Bengaluru engineer working 2pm-10pm IST: that window translates to 1:30am-9:30am Pacific, which doesn't intersect the founder's 10am-6pm day at all.

The same Bengaluru engineer working 7pm-3am IST (6:30am-2:30pm Pacific) overlaps with more than half of the founder's working day. Same engineer, same time zone; very different working-window match.

We compute the actual schedule overlap, not the timezone offset. Engineers self-report their working window in local time. The matcher converts both to UTC and computes intersection. The result is more accurate than coarse TZ math by a meaningful margin.
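A sketch of the window-intersection idea, with working windows expressed as hours since midnight UTC. The wrapping logic, field names, and normalization against the founder's day length are assumptions, not the production code.

```ts
// A working window in UTC hours; end > start and may exceed 24 for shifts
// that wrap past midnight (e.g. a 7pm-3am IST window).
type WindowUtc = { start: number; end: number };

function overlapHours(a: WindowUtc, b: WindowUtc): number {
  // Check the other window shifted by ±24h so wrapped shifts are handled.
  let best = 0;
  for (const shift of [-24, 0, 24]) {
    const start = Math.max(a.start, b.start + shift);
    const end = Math.min(a.end, b.end + shift);
    best = Math.max(best, end - start);
  }
  return best;
}

// Score as the fraction of the founder's day that the engineer covers.
function timezoneFit(founder: WindowUtc, engineer: WindowUtc): number {
  const founderHours = founder.end - founder.start;
  return Math.round((overlapHours(founder, engineer) / founderHours) * 100);
}

// Pacific founder 10am-6pm PDT  -> { start: 17,   end: 25 }   (UTC)
// Bengaluru 2pm-10pm IST        -> { start: 8.5,  end: 16.5 } -> 0h overlap
// Bengaluru 7pm-3am IST         -> { start: 13.5, end: 21.5 } -> 4.5h overlap
```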

Engineer rating (15%)

Average founder rating across past bookings. Ratings are daily binary (positive or negative); the score is the percentage of positive ratings out of total. New engineers (no rating history) get the platform median (currently 88%) until they have 30+ ratings.

The rating signal correlates strongly with continuation: engineers above 92% on the daily rating roll-forward get re-booked at 4x the rate of engineers below 80%. Founders who are using the rating loop get accurate signal within 1-2 weeks.

A subtlety: we drop the highest and lowest 10% of ratings before averaging. This dampens both the unfairly harsh and the unfairly generous founders. The result is closer to a trimmed mean than a straight average.
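A sketch of the rating score with the trimmed average and the platform-median fallback. The constants mirror the numbers in the post; the field names and trim mechanics are otherwise illustrative.

```ts
// Sketch of the rating score: trimmed share of positive daily ratings,
// with a platform-median fallback below the rating-history floor.
const PLATFORM_MEDIAN = 88; // current platform median, per the post
const MIN_RATINGS = 30;
const TRIM = 0.1; // drop the lowest and highest 10% before averaging

function ratingScore(dailyRatings: boolean[]): number {
  if (dailyRatings.length < MIN_RATINGS) return PLATFORM_MEDIAN;
  const sorted = [...dailyRatings].sort((a, b) => Number(a) - Number(b));
  const drop = Math.floor(sorted.length * TRIM);
  const trimmed = sorted.slice(drop, sorted.length - drop);
  const positive = trimmed.filter(Boolean).length;
  return Math.round((positive / trimmed.length) * 100);
}
```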

Utilization (10%)

A naive implementation prioritizes least-busy engineers. This creates a feedback loop where new engineers never get matched: the busy engineers get busier, the new engineers stay invisible. We don't want that.

Our utilization function is non-monotone:

  • Brand-new engineers (untested): 70 points
  • Available, 0 active bookings: 100 points
  • Available, 1 active booking: 80 points
  • Available, 2+ active bookings: 50 points

New engineers get a fairness bonus that's not maximum (they're untested) but not zero. Available engineers with one active booking are the sweet spot (proven, has capacity). Engineers with 2+ active bookings get a brake, both because of capacity concerns and because the platform incentive is to spread the work.

This bakes fairness in. New engineers see their first booking within 2-3 weeks of joining the platform; busy engineers don't accumulate so much work that quality suffers.
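As a sketch, the whole function fits in a few lines; the field names are illustrative, the point values come straight from the list above.

```ts
// Non-monotone utilization score, mirroring the values above.
function utilizationScore(e: { isNew: boolean; activeBookings: number }): number {
  if (e.isNew) return 70;                 // untested, but given a fairness bonus
  if (e.activeBookings === 0) return 100; // proven and fully available
  if (e.activeBookings === 1) return 80;  // proven, still has capacity
  return 50;                              // 2+ active bookings: brake
}
```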

Vertical bonus (up to +5)

A founder building a fintech app gets a small score bonus for engineers with prior fintech work. Same for healthtech, edtech, marketplaces, e-commerce. The bonus caps at +5 on a 100-point scale; it's a tiebreaker, not a primary signal.

The reason it's not weighted higher: most software engineers ship across verticals over their career. Forcing a fintech-only filter would shrink the pool too much without proportionally improving match quality. The +5 nudge is enough to distinguish two otherwise-equal engineers.

Eligibility filters before scoring

Before any of the above, we filter the pool down hard:

  • AI-native voice interview score ≥ 50/100. Hard gate. Engineers who haven't passed don't appear in any matching. See What we mean by AI-native engineer for the rubric.
  • Workspace verified. Quiet space, internet 200+ Mbps, working webcam, working microphone.
  • Not suspended. Suspension flags for repeated no-shows, ratings issues, or reported behavior.

These three filters drop the pool from 12,800 to ~6,200 instantly. The rest of the matching runs only over eligible engineers.

The 50/100 floor on the AI-native voice interview is the most consequential filter. It's the reason every engineer in the matched top 4 is AI-native by default; founders don't have to filter for it themselves.
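A sketch of the eligibility gate as a predicate; in production this is a Postgres index scan, and the field names here are illustrative.

```ts
// Illustrative engineer row; only the fields the gate needs.
interface EngineerRow {
  interviewScore: number; // AI-native voice interview, 0-100
  workspaceVerified: boolean;
  suspended: boolean;
}

function isEligible(e: EngineerRow): boolean {
  return e.interviewScore >= 50 && e.workspaceVerified && !e.suspended;
}

// Applied before any scoring: ~12,800 -> ~6,200 engineers.
function eligiblePool(pool: EngineerRow[]): EngineerRow[] {
  return pool.filter(isEligible);
}
```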

Tie-break and final selection

After scoring, we sort and take the top K (default K=4). Ties (rare but possible) are broken by:

  1. Higher intro-call conversion rate (engineers whose past intro calls more often convert to being the founder's pick)
  2. Newer engineers (lower weeks-active), fairness signal

Then the system schedules 4 back-to-back 30-minute intro calls in the founder's chosen time slots. Each candidate gets a notification within 60 seconds and has until the call time to accept or decline.
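A sketch of the sort-and-slice with the two tie-breakers, using illustrative field names:

```ts
// Candidates after scoring; field names are illustrative.
interface ScoredEngineer {
  id: string;
  score: number;
  introConversionRate: number; // share of past intro calls that convert
  weeksActive: number;
}

// Sort by score, then intro-call conversion, then fewer weeks active
// (fairness toward newer engineers), and take the top K.
function topK(candidates: ScoredEngineer[], k = 4): ScoredEngineer[] {
  return [...candidates]
    .sort(
      (a, b) =>
        b.score - a.score ||
        b.introConversionRate - a.introConversionRate ||
        a.weeksActive - b.weeksActive,
    )
    .slice(0, k);
}
```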

What we don't do

A few deliberate non-features:

  • No human recruiter in the loop. Recruiter discretion adds noise without adding signal at this scale.
  • No reputation-based bidding. Engineers don't bid for slots; the algorithm picks. This avoids the Upwork race-to-the-bottom dynamic.
  • No ML re-ranker (yet). After 200+ engagements per founder, we could learn personalized preferences ("this founder always picks engineers who talk less"). We're holding off until the data justifies the complexity. The deterministic algorithm already explains itself; an ML re-ranker would be a black box.
  • No keyword stuffing on engineer profiles. Skills must be backed by claimed projects; reviewers spot-check during the voice interview. Engineers who pad their skill list don't pass the AI-native screen.

Performance: how it runs in 80ms

The pipeline is pure functions in lib/matching.ts. 29 unit tests, no surprises.

The performance budget is broken down as:

  • Eligibility filter (Postgres index scan): ~5ms
  • Skill set Jaccard (in-memory, indexed engineers): ~30ms
  • Rate / timezone / rating / utilization scoring: ~10ms
  • Sort and slice top-K: ~5ms
  • DB write (booking_candidates insert × 4): ~25ms

The 80ms p99 number includes network and DB roundtrips. Under load, p50 is ~35ms.

We pre-compute most of the engineer-side fields (current relevance score, rating average, utilization) on a hot cache. The cache invalidates on engineer profile updates and on completed-booking events. This keeps the matching path read-only against materialized state.
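A sketch of what the materialized engineer-side state and its invalidation hooks might look like; the event names, cache shape, and lazy-recompute behavior are assumptions.

```ts
// Precomputed engineer-side state read by the matching path.
interface EngineerMatchState {
  ratingScore: number;      // precomputed trimmed rating, 0-100
  utilizationScore: number; // precomputed from active bookings
  skills: Set<string>;      // normalized, canonical skill names
}

const cache = new Map<string, EngineerMatchState>();

// Invalidate on the two events named above; state is rebuilt on next read.
function onEngineerProfileUpdated(engineerId: string) {
  cache.delete(engineerId);
}

function onBookingCompleted(engineerId: string) {
  cache.delete(engineerId); // rating and utilization both change here
}
```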

Future work

Two things we're considering for v2:

  1. Personalized re-ranker. After 200+ engagements per founder, learn that founder X always picks engineers who talk less and listen more. The signal becomes a personalized re-ranker on top of the global score.
  2. Public-repo signal. Replace some of the AI-native voice-interview signal with deep public-repo analysis (authorship-verified GitHub work). Engineers with rich public portfolios get richer scoring; the voice interview becomes a smaller fraction of the input.

Both are scoped, neither is shipped. The current algorithm has been stable for two quarters and is doing well by the metrics that matter (founder NPS, trial-to-active conversion, booking continuation rates).

See it in action. Submit a booking on Cadence and the algorithm runs against the live pool in real time. You'll have 4 candidates and 4 intro-call slots within 2 minutes.

FAQ

How fast does the matching algorithm run?

~80ms p99 for the full pipeline (eligibility filter, scoring, sort, DB write). p50 is ~35ms. The 2-minutes-to-shortlist number includes everything from spec submission to engineer notification, which is dominated by network and notification roundtrip.

Why use Jaccard instead of cosine similarity?

Jaccard is conservative. Cosine over a semantic embedding produces too many false positives at the long tail (an engineer who ships React work doesn't necessarily ship Next.js work, even if the embeddings are close). For a hiring decision, false positives are more costly than false negatives.

How is timezone overlap actually computed?

Engineers self-report their working window in local time. The matcher converts both founder and engineer windows to UTC and computes intersection. This catches the case where an engineer in Bengaluru works 7pm-3am IST and overlaps with more than half of a US-Pacific founder's working day, despite being on the "opposite" side of the world.

Do new engineers ever get matched?

Yes. The utilization function gives new engineers (untested) 70 points, which is competitive with available 1-active-booking engineers (80 points). Most new engineers get their first booking within 2-3 weeks of joining the platform.

Can a founder override the algorithm?

The algorithm returns the top 4. The founder picks 1 (or more) of those for the trial. If none of the 4 are right, the founder can resubmit with a refined spec; the algorithm re-runs. There's no manual override of the ranking; the algorithm is deterministic by design.

Will the algorithm change?

The current weights were calibrated against 800+ engagements. As the engagement count grows past 5,000, we'll re-calibrate. The structure (weighted-sum, eligibility filters, deterministic) is stable. The weights will tune.
