Math for Adults: Closing the Day 1 Gap

The Opportunity

Math is one of Duolingo's biggest growth bets, and adults are the intended audience. But most of them leave before the end of Day 3. The window to turn a download into a habit is narrow -- and right now, we're missing it.

"Can we get it so that your average 35-year-old decides to spend five to ten minutes a day getting better at math? I don't know. That is what the product team is tasked with."

-- Luis von Ahn, CEO, Duolingo. The Logan Bartlett Show, Ep. 87 (December 2023)

The question is framed exactly right. Adults who want to learn math are a real audience with real motivation. But motivation doesn't survive a first session that feels like it wasn't made for them. Day 1 through Day 3 is where we currently lose most of them -- and that's the gap this proposal is designed to close.

Meet Maya

Maya, 34

Marketing Manager - Two kids - Lives in Chicago

Maya downloaded Duolingo Math after making a budget error at work. She knows her math is rusty and wants to fix it. She opens the app at 11pm after putting her kids to bed -- ten minutes, maybe fifteen if she's lucky.

What she sees

Grade-level questions with no context. No sense of where she is, how she got there, or why any of it is relevant to her life.

What she needs

Questions that feel connected to her actual situation -- budgets, percentages, the kind of math that shows up at work and at home.

What We're Hearing

Across platforms, adult learners describe a consistent pattern:

"Felt like it was made for a 10-year-old."

App Store

"Did one lesson, had no idea what level I was or why that mattered."

App Store

"Too basic and boring for anyone past middle school."

App Store

"I wanted to brush up on real-world math, not drill multiplication tables."

"The placement quiz was 2 questions and put me in Grade 3. I have a finance degree."

"Duolingo Spanish asked me why I was learning. Duolingo Math just... started. Felt off."

Reddit r/duolingo

"Dropped off after day 2. Just didn't feel like it was built for me."

App Store

01 -- The Team

1 Group PM (Math)

Owns the initiative end to end. Sets priorities, tracks metrics, coordinates across functions.

1 Growth Designer

Owns the intent screen, placement result screen, and Lesson 1 question UX.

2 Mobile Engineers

Build the intent capture flow and wire up the adaptive placement logic. Content is config -- no new infrastructure.

1 Data Scientist

Sets up event tracking before launch. Runs the experiment analysis at Week 6 and Week 8 checkpoints.

Cross-functional support: Content team (question banks for 4 intent paths), Localization (adapting intent labels), QA. Small team, high leverage, 12-week scope.

02 -- Defining Success

North Star Metric

Adults who complete 3 or more lessons in their first 7 days

This is a proxy for early habit formation. A single-day retention number misses too much: on that one day, someone who completes a lesson and never returns looks the same as someone who was interrupted and comes back later. Three lessons in seven days suggests the behavior is starting to stick. The metric is reported for the adult cohort only, not rolled up with all Math users.
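As a rough illustration of how this metric could be computed from lesson-completion events, here is a minimal sketch. The function name, the input shapes, and the sample data are all hypothetical. The choice to measure the 7-day window from each user's first session is an assumption, not a confirmed definition.

```python
from datetime import datetime, timedelta


def north_star_rate(first_seen, lesson_events, min_lessons=3, window_days=7):
    """Share of cohort users with at least `min_lessons` completed lessons
    within `window_days` of their first session.

    first_seen: dict of user_id -> datetime of first session (the adult cohort)
    lesson_events: iterable of (user_id, completed_at) tuples
    """
    counts = {user_id: 0 for user_id in first_seen}
    for user_id, completed_at in lesson_events:
        start = first_seen.get(user_id)
        if start is None:
            continue  # event from a user outside the adult cohort
        if start <= completed_at < start + timedelta(days=window_days):
            counts[user_id] += 1
    if not counts:
        return 0.0
    qualified = sum(1 for n in counts.values() if n >= min_lessons)
    return qualified / len(counts)


# Made-up sample data: four users who all start at the same time.
day0 = datetime(2024, 1, 1, 23, 0)
first_seen = {"a": day0, "b": day0, "c": day0, "d": day0}
events = [
    ("a", day0), ("a", day0 + timedelta(days=1)), ("a", day0 + timedelta(days=3)),
    ("b", day0),
    ("c", day0), ("c", day0 + timedelta(days=2)), ("c", day0 + timedelta(days=8)),
]
rate = north_star_rate(first_seen, events)
print(rate)  # 0.25: only user "a" has 3 lessons inside the window
```

User "c" shows why the window matters: their third lesson lands on day 8, so they do not count.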

Metric	Baseline (est.)	Target	Kill Condition
Lesson 1 completion	~45%	60%	Below 50% at Week 6
Day 3 return rate	~20%	30%	Below 22% at Week 8
Day 7 retention	~12%	22%	Below 15% at Week 10
Placement abandonment	~35%	20%	Above 40% at Week 4
30-day active rate	~8%	15%	Below 10% at Week 12
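
The targets and kill conditions above could be encoded as config and checked at each checkpoint. The sketch below is one possible shape, not an existing tool. The metric keys and the `status` function are hypothetical. It assumes a kill condition applies at or after its checkpoint week, which the table does not spell out.

```python
# Targets and kill thresholds copied from the table above, as fractions.
METRICS = {
    "lesson_1_completion":   {"target": 0.60, "kill": 0.50, "week": 6,  "higher_is_better": True},
    "day_3_return":          {"target": 0.30, "kill": 0.22, "week": 8,  "higher_is_better": True},
    "day_7_retention":       {"target": 0.22, "kill": 0.15, "week": 10, "higher_is_better": True},
    "placement_abandonment": {"target": 0.20, "kill": 0.40, "week": 4,  "higher_is_better": False},
    "active_30_day":         {"target": 0.15, "kill": 0.10, "week": 12, "higher_is_better": True},
}


def status(metric, observed, week):
    """Return 'target met', 'kill', or 'continue' for one metric reading."""
    rule = METRICS[metric]
    if rule["higher_is_better"]:
        at_least_as_good = lambda value, bar: value >= bar
    else:
        at_least_as_good = lambda value, bar: value <= bar

    if at_least_as_good(observed, rule["target"]):
        return "target met"
    # Assumption: the kill condition only applies once its checkpoint week is reached.
    if week >= rule["week"] and not at_least_as_good(observed, rule["kill"]):
        return "kill"
    return "continue"


print(status("lesson_1_completion", 0.48, 6))    # kill
print(status("lesson_1_completion", 0.48, 4))    # continue (checkpoint not reached)
print(status("placement_abandonment", 0.18, 4))  # target met
```

Placement abandonment is the only metric where lower is better, so the direction is stored per metric rather than assumed.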

Secondary signal: App Store review sentiment. Tracked qualitatively, not a hard target. If we're moving the numbers but reviews are still describing the same experience, something is off that the data isn't capturing.

03 -- Guardrails

In Scope

Intent capture screen (one screen before placement)
Adaptive placement quiz (5 questions, intent-matched)
Intent-matched Lesson 1 question banks
Skill-level result screen (replaces grade-level output)

Out of Scope

Curriculum changes or new math verticals
Evaluation or testing system (planned for Phase 2)
Changes to the core lesson engine
New infrastructure -- content is strings and config

Hard Guardrails

Sign-up conversion must not drop by more than 3% -- we stop if it does
All experiments use the existing lesson engine
Content is config, not code -- engineering lift is minimal

Phase 2 -- Long-Term Bets (if Phase 1 validates)

Voice Mode

After a wrong answer, the user explains their reasoning out loud. The app corrects the misconception conversationally.

Camera Math

Point your phone at a bill, a spreadsheet, or a homework problem. The app turns it into a lesson.

Skill Map

After 2 weeks, the user sees a personalized map of where they're strong and where the gaps are.

Identity Shift

A 30-day arc that takes someone from "I'm not a math person" to "look what I can do now" -- celebrating skill milestones with identity-based language, not just XP.

04 -- The Experiments

Three experiments, rolled out in stages and running together from Week 5. We run all three, read the signal, and go where the data leads.

Experiment 1

Intent Anchor

One screen before placement begins, asking "What's your goal?" This is the same onboarding pattern Language already uses -- applied to Math. Adults who name a goal before their first lesson have a clearer reason to stay.

Everyday life (bills, budgets, tips)
Work and career (spreadsheets, data, percentages)
Helping my kid with homework
Preparing for an exam

Hypothesis

Adults who pick a goal before their first lesson complete Lesson 1 at a higher rate than those who go straight to placement.

Measure

Lesson 1 completion rate, broken down by intent cohort vs. control group.

Kill Condition

No statistically significant difference in Lesson 1 completion between the intent cohorts and control at the 4-week mark.
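
One common way to judge whether a completion-rate difference is significant is a two-proportion z-test. The sketch below is an illustration only: the proposal does not name a test, the function name is hypothetical, and the counts are made up.

```python
import math


def two_proportion_z_test(success_a, n_a, success_b, n_b):
    """Two-sided z-test for the difference between two completion rates.

    Returns (z, p_value). Uses the pooled-proportion standard error.
    """
    p_a = success_a / n_a
    p_b = success_b / n_b
    pooled = (success_a + success_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        return 0.0, 1.0  # both groups are all-success or all-failure
    z = (p_a - p_b) / se
    # Two-sided p-value: 2 * (1 - Phi(|z|)) == erfc(|z| / sqrt(2))
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Made-up counts: 520 of 1,000 intent-cohort users vs. 450 of 1,000 control
# users complete Lesson 1.
z, p_value = two_proportion_z_test(520, 1000, 450, 1000)
print(f"z = {z:.2f}, p = {p_value:.4f}")
significant = p_value < 0.05
```

Comparing four intent cohorts against one control means several tests run at once, so the threshold may need a multiple-comparison correction. That choice belongs with the analysis plan.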

Experiment 2

Adaptive Placement

Replace the current placement quiz with 5 real-world scenario questions, matched to the user's chosen intent. Replace the grade-level result ("Grade 5") with a skill-level result ("Level 3: Percentages & Everyday Math"). The level system runs from Level 1 (Foundations) to Level 6 (Advanced Applications).

Hypothesis

Skill-level framing reduces placement abandonment. Real-world scenarios place adults more accurately than abstract arithmetic.

Measure

Placement completion rate and Day 1 return rate, compared to current quiz.

Kill Condition

Placement abandonment increases vs. control at any checkpoint.

Experiment 3

Real World First

Redesign Lesson 1 so the very first question comes from the user's chosen intent -- not a generic arithmetic drill. Each intent path has its own 5-question Lesson 1 bank. The goal is for users to feel like the app was built for their situation from the first question.

Hypothesis

Feeling seen in question 1 drives stronger Day 3 return than a generic opening question.

Measure

Day 3 return rate by cohort, compared to control.

Kill Condition

No improvement at 8 weeks, or one cohort severely underperforms the others.

05 -- The Roadmap

12 weeks, 4 phases. Small bets, clear checkpoints, no surprises.

Weeks 1-2

Align and build

Finalize question banks -- 5 questions for each of the 4 intents. Agree on level names and copy for Levels 1 through 6. Set up event tracking so we can measure each experiment cleanly. Ship the intent anchor to 10% of new adult users to get early signal before a wider rollout.
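
A 10% rollout is typically done by assigning each user to a stable bucket. The sketch below shows one generic way to do that with a hash. It is not a description of Duolingo's experimentation platform, and the function name and experiment key are hypothetical.

```python
import hashlib


def in_rollout(user_id, experiment="intent_anchor", percent=10):
    """Deterministically place a user in or out of a percentage rollout.

    Hashing the experiment name with the user ID keeps assignment stable
    across sessions and independent between experiments.
    """
    key = f"{experiment}:{user_id}".encode("utf-8")
    digest = hashlib.sha256(key).hexdigest()
    bucket = int(digest[:8], 16) % 100  # bucket in 0..99
    return bucket < percent


# The same user always gets the same answer.
assert in_rollout("user-123") == in_rollout("user-123")

# Across many synthetic IDs the share lands close to the requested 10%.
sample = [f"user-{i}" for i in range(10_000)]
share = sum(in_rollout(u) for u in sample) / len(sample)
print(f"{share:.1%} of sample users are in the rollout")
```

Because assignment depends only on the user ID and the experiment key, widening the rollout later keeps everyone from the original 10% in the treatment group.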

Weeks 3-4

Placement redesign goes live

The new 5-question adaptive placement quiz launches. The skill-level result screen replaces the grade-level output. We watch placement completion rate and Day 1 return rate closely -- these are the first real signals on whether the reframe is working.

Weeks 5-8

Lesson 1 reframe goes live

Intent-matched question banks go live for each cohort. The full loop is now running: pick a goal, take placement, get a matched first lesson, see a streak prompt. We review the data at Week 6 and decide whether to continue each experiment or adjust the approach.

Weeks 9-12

Read the data, decide

Identify which cohorts performed best. Revise any underperforming question banks. Ship the winning variant to all new adult users. Use what we learned to define Phase 2 scope.

06 -- Decision Framework

We don't know which experiment will win. Here's how we decide:

Lesson 1 completion reaches 60% at Week 6 -- we keep the intent anchor and expand it to all new users.
Placement abandonment stays high -- we shorten the quiz to 3 questions. Per-question drop-off tracking will show where users are leaving.
One intent cohort clearly outperforms the others -- we double down on that path first and use it to inform the others.
No metric moves by Week 8 -- we stop the experiments and run a qualitative research sprint instead. The problem may be something we haven't identified yet.
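
The per-question drop-off check behind the placement decision could look roughly like this. It is a sketch: the function names are hypothetical and the counts are made up.

```python
def drop_off_by_question(started, answered):
    """Drop-off rate at each question, relative to users who reached it.

    started: users who began the placement quiz
    answered: list of users who answered question 1, 2, ... in order
    """
    rates = []
    previous = started
    for count in answered:
        rates.append((previous - count) / previous if previous else 0.0)
        previous = count
    return rates


def worst_question(rates):
    """1-based number of the question with the highest drop-off rate."""
    return max(range(len(rates)), key=rates.__getitem__) + 1


# Made-up counts for a 5-question quiz.
started = 1000
answered = [950, 900, 700, 680, 650]
rates = drop_off_by_question(started, answered)
overall_abandonment = 1 - answered[-1] / started

for number, rate in enumerate(rates, start=1):
    print(f"Question {number}: {rate:.1%} drop-off")
print(f"Largest drop-off at question {worst_question(rates)}")
```

In this made-up data most of the loss happens at question 3. That would point to fixing that question before shortening the whole quiz.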

Risks

Risk	Likelihood	Mitigation
Intent screen adds friction, fewer sign-ups	Medium	A/B test vs. full control group. Stop if conversion drops more than 3%.
5-question placement feels too long	Medium	Track drop-off by question number. Shorten to 3 questions if we see a clear falloff.
Question bank quality is uneven across intents	High	Test with 5 adult users per intent path before launch. Fix before we ship.
Adults skip the intent screen	Low	Make selection required in V1. Test optional in V2 once we have baseline data.
4 parallel content paths overwhelm engineering	Medium	All paths use the same lesson engine. Content is config, not code -- the engineering lift is minimal.
