Measured boost from personalization
A study of 770 Taiwanese high-school students found that AI-personalized selection of predefined exercises raised final exam scores by about 0.15 standard deviations, with the largest gains for weaker students (0.17 SD). The authors also framed the effect in economic terms, translating the improvement into an estimate of higher lifetime earnings. (x.com)
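To put an effect size such as 0.15 SD into more familiar terms, it can be converted into a percentile shift. The sketch below is illustrative only and assumes exam scores follow a normal distribution; the function names are hypothetical and do not come from the study. It shows two readings: the shift for a student starting at a given percentile, and the average shift across all students, which uses the identity E[Φ(Z + d)] = Φ(d / √2) for a standard normal Z.

```python
from statistics import NormalDist

# Standard normal curve (mean 0, SD 1); normality of scores is an assumption.
STD_NORMAL = NormalDist()


def shifted_percentile(start_percentile: float, effect_sd: float) -> float:
    """Percentile reached by a student who starts at start_percentile
    and improves by effect_sd standard deviations."""
    z = STD_NORMAL.inv_cdf(start_percentile / 100.0)
    return 100.0 * STD_NORMAL.cdf(z + effect_sd)


def average_percentile_gain(effect_sd: float) -> float:
    """Mean percentile gain across the whole distribution.

    Uses E[Phi(Z + d)] = Phi(d / sqrt(2)) for Z ~ N(0, 1).
    """
    return 100.0 * (STD_NORMAL.cdf(effect_sd / 2 ** 0.5) - 0.5)


if __name__ == "__main__":
    print(f"Median student after +0.15 SD: {shifted_percentile(50.0, 0.15):.1f}")
    print(f"Average gain for +0.15 SD: {average_percentile_gain(0.15):.1f} points")
```

Under these assumptions, a student at the median moves to roughly the 56th percentile, while the gain averaged over all students is about 4 percentile points. The shift is smaller for students who start far from the median.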
AI-personalized exercise selection boosted Taiwanese high school students' final exam scores by 0.15 standard deviations in a study of 770 participants. (arxiv.org) Researchers at National Taiwan University tested the system on 10th graders preparing for math exams over one semester in 2023. The AI chose from 2,000 predefined problems based on each student's past performance. (arxiv.org)
Students in the treatment group used the AI tutor for 20-minute sessions three times a week, while controls followed standard teacher-assigned homework. Final scores came from official school exams. (arxiv.org)
Weaker students saw the biggest lift at 0.17 standard deviations, with effects consistent across genders and school types. No gains appeared for top performers. (arxiv.org) Effect sizes are expressed in standard deviations: 0.15 SD corresponds to about 4 percentile points on a normal test score curve. This modest bump persisted after adjusting for baseline measures such as prior grades. (arxiv.org)
The study was a randomized controlled trial, the gold standard for causal claims, with 385 students per group for statistical power. The AI personalized practice by adaptively recommending harder or easier problems, much as a GPS reroutes a driver around traffic. (arxiv.org)
The authors converted the gain to economic value using data on Taiwan's returns to schooling. They estimate it is equivalent to 0.6 months of additional education, worth $1,065 in lifetime earnings per student at present value. (arxiv.org)
Taiwan's education system stresses high-stakes exams like the Comprehensive Assessment Program for Junior High School Students. Adaptive tutoring fills gaps left by traditional one-size-fits-all homework. (ntu.edu.tw) Past AI education studies show mixed results; Duolingo's adaptive language app lifted scores by 0.1-0.3 SD in smaller trials. This math study extends the evidence to secondary schools at larger scale. (arxiv.org)
Critics note that the exercises were preselected by humans, so the AI only sorted content and did not generate it.
Real-world scaling could be limited by the teacher time needed to curate exercise banks. (arxiv.org) The authors call for larger trials to test generative AI tutors that create exercises on the fly. They project that a broad rollout could add $3.4 billion a year to Taiwan's economy through student gains. (arxiv.org) This measured effect offers a baseline for AI edtech investors eyeing markets beyond Taiwan's exam grind. (arxiv.org)
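The kind of selection described above, in which the system picks harder or easier problems from a fixed, human-curated bank based on past performance, can be illustrated with a toy rule. The text does not specify the study's actual algorithm, so this is only a minimal sketch under assumed simplifications: the `Problem` class, the 0-to-1 difficulty scale, and the "recent share correct" skill estimate are all hypothetical.

```python
from dataclasses import dataclass
from typing import Collection, Sequence


@dataclass(frozen=True)
class Problem:
    problem_id: str
    difficulty: float  # hypothetical scale: 0.0 (easiest) to 1.0 (hardest)


def estimate_skill(recent_results: Sequence[bool], default: float = 0.5) -> float:
    """Share of recent attempts answered correctly; default if no history."""
    if not recent_results:
        return default
    return sum(recent_results) / len(recent_results)


def pick_next_problem(
    bank: Sequence[Problem],
    recent_results: Sequence[bool],
    already_seen: Collection[str] = (),
) -> Problem:
    """Choose the unseen problem whose difficulty is closest to the skill estimate.

    Nothing is generated here: the function only sorts through a predefined bank.
    """
    skill = estimate_skill(recent_results)
    candidates = [p for p in bank if p.problem_id not in already_seen]
    if not candidates:
        raise ValueError("no unseen problems left in the bank")
    return min(candidates, key=lambda p: abs(p.difficulty - skill))


if __name__ == "__main__":
    bank = [Problem("easy", 0.2), Problem("mid", 0.5), Problem("hard", 0.8)]
    # A student who got one of the last four problems right is routed to easier work.
    print(pick_next_problem(bank, [False, False, False, True]).problem_id)
```

A rule like this depends entirely on the quality and size of the curated bank, which is the scaling concern raised by critics.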