The Opportunity
Math is one of Duolingo's biggest growth bets. Adults are the intended audience, but most of them leave before the end of Day 3. The window to turn a download into a habit is narrow -- and right now, we're missing it.
"Can we get it so that your average 35-year-old decides to spend five to ten minutes a day getting better at math? I don't know. That is what the product team is tasked with."
The question is framed exactly right. Adults who want to learn math are a real audience with real motivation. But motivation doesn't survive a first session that feels like it wasn't made for them. Day 1 through Day 3 is where we currently lose most of them -- and that's the gap this proposal is designed to close.
Meet Maya
👩
Maya, 34
Marketing Manager - Two kids - Lives in Chicago
Maya downloaded Duolingo Math after making a budget error at work. She knows her math is rusty and wants to fix it. She opens the app at 11pm after putting her kids to bed -- ten minutes, maybe fifteen if she's lucky.
What she sees
Grade-level questions with no context. No sense of where she is, how she got there, or why any of it is relevant to her life.
What she needs
Questions that feel connected to her actual situation -- budgets, percentages, the kind of math that shows up at work and at home.
What We're Hearing
Across platforms, adult learners describe a consistent pattern:
"Felt like it was made for a 10-year-old."
App Store
"Did one lesson, had no idea what level I was or why that mattered."
App Store
"Too basic and boring for anyone past middle school."
App Store
"I wanted to brush up on real-world math, not drill multiplication tables."
Reddit
"The placement quiz was 2 questions and put me in Grade 3. I have a finance degree."
Reddit
"Duolingo Spanish asked me why I was learning. Duolingo Math just... started. Felt off."
Reddit r/duolingo
"Dropped off after day 2. Just didn't feel like it was built for me."
App Store
1 Group PM (Math)
Owns the initiative end to end. Sets priorities, tracks metrics, coordinates across functions.
1 Growth Designer
Owns the intent screen, placement result screen, and Lesson 1 question UX.
2 Mobile Engineers
Build the intent capture flow and wire up the adaptive placement logic. Content is config -- no new infrastructure.
1 Data Scientist
Sets up event tracking before launch. Runs the experiment analysis at Week 6 and Week 8 checkpoints.
Cross-functional support: Content team (question banks for 4 intent paths), Localization (adapting intent labels), QA. Small team, high leverage, 12-week scope.
North Star Metric
Adults who complete 3 or more lessons in their first 7 days
This is a proxy for early habit formation. A single-day retention number misses too much -- on any one day, someone who completes one lesson and never returns looks the same as someone who was interrupted and comes back later. Three lessons in seven days means the behavior is starting to stick. This metric is adult-cohort specific, not rolled up with all Math users.
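To make the definition concrete, here is a minimal sketch of how the metric could be computed from lesson-completion timestamps. The function names, the input shape, and the choice to start the 7-day window at the user's first session are all assumptions for illustration, not the existing analytics pipeline.

```python
from datetime import datetime, timedelta

def hit_north_star(first_seen, lesson_completions, min_lessons=3, window_days=7):
    """True if the user completed at least min_lessons within window_days of first_seen.

    Assumption: the window opens at first_seen and is half-open [start, start + 7 days).
    """
    window_end = first_seen + timedelta(days=window_days)
    in_window = [t for t in lesson_completions if first_seen <= t < window_end]
    return len(in_window) >= min_lessons

def north_star_rate(users):
    """Share of users who hit the metric.

    users: list of (first_seen, lesson_completions) tuples, adult cohort only.
    """
    if not users:
        return 0.0
    hits = sum(hit_north_star(first_seen, done) for first_seen, done in users)
    return hits / len(users)

# Hypothetical users, for illustration only.
start = datetime(2025, 1, 6, 23, 0)
returning = (start, [start, start + timedelta(days=1), start + timedelta(days=4)])
one_and_done = (start, [start])
too_late = (start, [start, start + timedelta(days=2), start + timedelta(days=8)])
rate = north_star_rate([returning, one_and_done, too_late])
```

Only the first hypothetical user counts: the second stops after one lesson, and the third finishes their third lesson outside the window.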
| Metric | Baseline (est.) | Target | Kill Condition |
| --- | --- | --- | --- |
| Lesson 1 completion | ~45% | 60% | Below 50% at Week 6 |
| Day 3 return rate | ~20% | 30% | Below 22% at Week 8 |
| Day 7 retention | ~12% | 22% | Below 15% at Week 10 |
| Placement abandonment | ~35% | 20% | Above 40% at Week 4 |
| 30-day active rate | ~8% | 15% | Below 10% at Week 12 |
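The kill conditions can be written down as data so that each checkpoint review is mechanical. This is a sketch only: the metric keys, the `KillCondition` type, and the rule that a condition is evaluated only in its named week are assumptions. The thresholds and weeks are taken from the table.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class KillCondition:
    metric: str       # hypothetical metric key
    threshold: float  # rate as a fraction, e.g. 0.50 for 50%
    week: int         # checkpoint week named in the table
    kill_if: str      # "below" or "above"

KILL_CONDITIONS = [
    KillCondition("lesson_1_completion", 0.50, 6, "below"),
    KillCondition("day_3_return_rate", 0.22, 8, "below"),
    KillCondition("day_7_retention", 0.15, 10, "below"),
    KillCondition("placement_abandonment", 0.40, 4, "above"),
    KillCondition("thirty_day_active_rate", 0.10, 12, "below"),
]

def triggered_kills(week, observed):
    """Return the metric keys whose kill condition is met at this checkpoint.

    observed: dict of metric key -> observed rate for the adult cohort.
    Metrics missing from observed are skipped rather than treated as failures.
    """
    triggered = []
    for kc in KILL_CONDITIONS:
        if kc.week != week or kc.metric not in observed:
            continue
        value = observed[kc.metric]
        if kc.kill_if == "below" and value < kc.threshold:
            triggered.append(kc.metric)
        elif kc.kill_if == "above" and value > kc.threshold:
            triggered.append(kc.metric)
    return triggered
```

Keeping the thresholds in one list means a change to a kill condition is a one-line edit that is easy to review.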
Secondary signal: App Store review sentiment. Tracked qualitatively, not a hard target. If we're moving the numbers but reviews are still describing the same experience, something is off that the data isn't capturing.
In Scope
- Intent capture screen (one screen before placement)
- Adaptive placement quiz (5 questions, intent-matched)
- Intent-matched Lesson 1 question banks
- Skill-level result screen (replaces grade-level output)
Out of Scope
- Curriculum changes or new math verticals
- Evaluation or testing system (planned for Phase 2)
- Changes to the core lesson engine
- New infrastructure -- content is strings and config
Hard Guardrails
- Sign-up conversion must not drop by more than 3% -- we stop if it does
- All experiments use the existing lesson engine
- Content is config, not code -- engineering lift is minimal
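The sign-up guardrail can be checked with a few lines. One caveat: "3%" could mean a relative drop (3% of the control rate) or an absolute drop (3 percentage points), and this proposal does not say which. The sketch below supports both and defaults to relative. That default, the function name, and the example rates are assumptions to be settled with the Data Scientist before launch.

```python
def signup_guardrail_breached(control_rate, variant_rate, max_drop=0.03, relative=True):
    """True if sign-up conversion in the variant dropped by more than max_drop.

    relative=True  -> drop measured as a fraction of the control rate
    relative=False -> drop measured in absolute terms (percentage points / 100)
    """
    if control_rate <= 0:
        raise ValueError("control_rate must be positive")
    drop = control_rate - variant_rate
    if relative:
        drop = drop / control_rate
    return drop > max_drop

# Hypothetical rates: 40.0% sign-up conversion in control, 38.5% in the variant.
breached_relative = signup_guardrail_breached(0.40, 0.385, relative=True)
breached_absolute = signup_guardrail_breached(0.40, 0.385, relative=False)
```

With these example rates the two readings disagree: the relative drop is about 3.75%, which breaches the guardrail, while the absolute drop is 1.5 points, which does not. That gap is the reason to pin the definition down before any experiment ships.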
Phase 2 -- Long-Term Bets (if Phase 1 validates)
Voice Mode
After a wrong answer, the user explains their reasoning out loud. The app corrects the misconception conversationally.
Camera Math
Point your phone at a bill, a spreadsheet, or a homework problem. The app turns it into a lesson.
Skill Map
After 2 weeks, the user sees a personalized map of where they're strong and where the gaps are.
Identity Shift
A 30-day arc that takes someone from "I'm not a math person" to "look what I can do now" -- celebrating skill milestones with identity-based language, not just XP.
Three parallel experiments. We run all three, read the signal, and go where the data leads.
Experiment 1
Intent Anchor
One screen before placement begins, asking "What's your goal?" This is the same onboarding pattern Language already uses -- applied to Math. Adults who name a goal before their first lesson have a clearer reason to stay.
Everyday life (bills, budgets, tips)
Work and career (spreadsheets, data, percentages)
Helping my kid with homework
Preparing for an exam
Hypothesis
Adults who pick a goal before their first lesson complete Lesson 1 at a higher rate than those who go straight to placement.
Measure
Lesson 1 completion rate, broken down by intent cohort vs. control group.
Kill Condition
No significant difference between cohorts and control at the 4-week mark.
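"Significant" needs a test behind it. One common choice for comparing two completion rates is a two-sided two-proportion z-test, sketched below with the standard library only. The choice of test, the 0.05 cutoff, and the example counts are assumptions. The final analysis method belongs to the Data Scientist.

```python
import math

def two_proportion_z_test(successes_a, n_a, successes_b, n_b):
    """Two-sided z-test for a difference between two proportions.

    Returns (z, p_value). Uses the pooled proportion for the standard error.
    """
    if n_a <= 0 or n_b <= 0:
        raise ValueError("both groups need at least one user")
    p_a = successes_a / n_a
    p_b = successes_b / n_b
    pooled = (successes_a + successes_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        # Every user in both groups completed, or none did: no detectable difference.
        return 0.0, 1.0
    z = (p_a - p_b) / se
    # Two-sided p-value from the standard normal: 2 * (1 - Phi(|z|)) = erfc(|z| / sqrt(2)).
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value

# Hypothetical counts: Lesson 1 completions out of users who started Lesson 1.
z, p = two_proportion_z_test(600, 1000, 450, 1000)  # intent cohort vs. control
is_significant = p < 0.05  # assumed cutoff
```

With four intent cohorts each compared against control, a correction for multiple comparisons is worth considering. It is not shown here.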
Experiment 2
Adaptive Placement
Replace the current placement quiz with 5 real-world scenario questions, matched to the user's chosen intent. Replace the grade-level result ("Grade 5") with a skill-level result ("Level 3: Percentages & Everyday Math"). The level system runs from Level 1 (Foundations) to Level 6 (Advanced Applications).
Hypothesis
Skill-level framing reduces placement abandonment. Real-world scenarios place adults more accurately than abstract arithmetic.
Measure
Placement completion rate and Day 1 return rate, compared to current quiz.
Kill Condition
Placement abandonment increases vs. control at any checkpoint.
Experiment 3
Real World First
Redesign Lesson 1 so the very first question comes from the user's chosen intent -- not a generic arithmetic drill. Each intent path has its own 5-question Lesson 1 bank. The goal is for users to feel like the app was built for their situation from the first question.
Hypothesis
Feeling seen in question 1 drives stronger Day 3 return than a generic opening question.
Measure
Day 3 return rate by cohort, compared to control.
Kill Condition
No improvement at 8 weeks, or one cohort severely underperforms the others.
12 weeks, 4 phases. Small bets, clear checkpoints, no surprises.
Weeks 1-2
Align and build
Finalize question banks -- 5 questions for each of the 4 intents. Agree on level names and copy for Levels 1 through 6. Set up event tracking so we can measure each experiment cleanly. Ship the intent anchor to 10% of new adult users to get early signal before a wider rollout.
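A 10% rollout is usually done with deterministic bucketing, so a user stays in the same group across sessions. The sketch below hashes the user ID together with an experiment name. The function name, the experiment key, and the hashing scheme are illustrative assumptions. If an existing experimentation platform already handles assignment, it should be used instead.

```python
import hashlib

BUCKETS = 10_000  # resolution of the rollout: 0.01% steps

def in_rollout(user_id, experiment, rollout_fraction=0.10):
    """Deterministically decide whether a user is in the rollout.

    The same (user_id, experiment) pair always gets the same answer, and
    different experiment names give independent-looking assignments.
    """
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") % BUCKETS
    return bucket < round(rollout_fraction * BUCKETS)

# Hypothetical IDs and experiment key, to show the share landing near 10%.
sample = [f"user-{i}" for i in range(10_000)]
share = sum(in_rollout(uid, "intent_anchor") for uid in sample) / len(sample)
```

Widening the rollout later only raises `rollout_fraction`. Users already in the 10% group stay in it, because their bucket number does not change.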
Weeks 3-4
Placement redesign goes live
The new 5-question adaptive placement quiz launches. The skill-level result screen replaces the grade-level output. We watch placement completion rate and Day 1 return rate closely -- these are the first real signals on whether the reframe is working.
Weeks 5-8
Lesson 1 reframe goes live
Intent-matched question banks go live for each cohort. The full loop is now running: pick a goal, take placement, get a matched first lesson, see a streak prompt. We review the data at Week 6 and decide whether to continue each experiment or adjust the approach.
Weeks 9-12
Read the data, decide
Identify which cohorts performed best. Revise any underperforming question banks. Ship the winning variant to all new adult users. Use what we learned to define Phase 2 scope.
Decision Framework
We don't know which experiment will win. Here's how we decide:
- Lesson 1 completion reaches 60% at Week 6 -- we keep the intent anchor and expand it to all new users.
- Placement abandonment stays high -- we shorten the quiz to 3 questions. Per-question drop-off tracking will show which question users leave on.
- One intent cohort clearly outperforms the others -- we double down on that path first and use it to inform the others.
- No metric moves by Week 8 -- we stop the experiments and run a qualitative research sprint instead. The problem may be something we haven't identified yet.
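The placement rule above depends on knowing where users leave the quiz. A minimal sketch of per-question drop-off follows, assuming event tracking can report how many users started placement and how many answered each question. The function names and the counts are hypothetical.

```python
def dropoff_by_question(started, answered):
    """Drop-off rate at each question.

    started:  number of users who began the placement quiz
    answered: answered[i] is the number of users who answered question i + 1
    Each rate is the share of users who reached a question but did not answer it.
    """
    rates = []
    reached = started
    for count in answered:
        rates.append(0.0 if reached == 0 else (reached - count) / reached)
        reached = count
    return rates

def worst_question(started, answered):
    """Return (question_number, drop_off_rate) for the largest drop-off, or None."""
    rates = dropoff_by_question(started, answered)
    if not rates:
        return None
    worst_index = max(range(len(rates)), key=rates.__getitem__)
    return worst_index + 1, rates[worst_index]

# Hypothetical counts for a 5-question quiz.
question, rate = worst_question(1000, [950, 900, 700, 680, 650])
```

In this made-up example the largest drop is at question 3, where 200 of the 900 users who reached it leave. A result like that would point to fixing or cutting that question before shortening the whole quiz.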
Risks
| Risk | Likelihood | Mitigation |
| --- | --- | --- |
| Intent screen adds friction, fewer sign-ups | Medium | A/B test vs. full control group. Stop if conversion drops more than 3%. |
| 5-question placement feels too long | Medium | Track drop-off by question number. Shorten to 3 questions if we see a clear falloff. |
| Question bank quality is uneven across intents | High | Test with 5 adult users per intent path before launch. Fix before we ship. |
| Adults skip the intent screen | Low | Make selection required in V1. Test optional in V2 once we have baseline data. |
| 4 parallel content paths overwhelm engineering | Medium | All paths use the same lesson engine. Content is config, not code -- the engineering lift is minimal. |