Tutoring Works
One-on-one tutoring has long been one of the most effective interventions in education. Bloom (1984) famously found that students who received individual tutoring performed two standard deviations better than peers in traditional classrooms – the “two sigma” effect. More recent meta-analyses report smaller but still substantial gains: high-quality human tutoring yields effect sizes of 0.37–0.42 standard deviations (Nickow et al., 2020; Kraft et al., 2024), roughly moving an average student from the 50th to about the 65th percentile. The challenge is cost and scalability. Human tutoring requires large numbers of trained professionals, which makes the cost of providing every student with individual tutoring prohibitive. This is where Intelligent Tutoring Systems (ITS) – AI-driven platforms that adapt instruction to each learner – come in.
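To make the effect-size figures above concrete, the short sketch below converts a gain in standard deviations into a percentile. It assumes test scores are normally distributed, which is a simplification; the function name `percentile_after_gain` is ours, for illustration only.

```python
from statistics import NormalDist


def percentile_after_gain(effect_size_sd: float, start_percentile: float = 50.0) -> float:
    """Percentile reached after a gain of `effect_size_sd` standard deviations.

    Assumption: scores follow a normal distribution.
    """
    std_normal = NormalDist()  # mean 0, standard deviation 1
    start_z = std_normal.inv_cdf(start_percentile / 100)
    return 100 * std_normal.cdf(start_z + effect_size_sd)


for d in (0.37, 0.42, 2.0):
    print(f"{d:.2f} SD -> percentile {percentile_after_gain(d):.1f}")
# 0.37 SD -> percentile 64.4
# 0.42 SD -> percentile 66.3
# 2.00 SD -> percentile 97.7
```

Under that assumption, the recent tutoring estimates place a median student in the mid-60s percentile range, while Bloom's two-sigma result would correspond to roughly the 98th percentile.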
What the Research Says About ITS
In the last 15 years, studies have shown that well-designed ITS can significantly improve math achievement for elementary and middle school students. A meta-analysis by Ma et al. (2014) reported an average effect size of 0.41 in favor of ITS over traditional instruction, while Kulik and Fletcher (2016) found an even larger mean effect of 0.66. More recently, Létourneau et al. (2025) reviewed 28 studies and concluded that “the effects of ITSs on learning and performance in K–12 education are generally positive,” with improvements ranging from small to large depending on context.
Importantly, not all ITS are equally effective. VanLehn (2011) showed that human tutoring’s power comes from breaking problems into steps and giving feedback at each stage. ITS that adopt this step-level feedback – sometimes called “inner loop” tutoring – can approach the impact of human tutors. Systems that only mark answers right or wrong, by contrast, show smaller gains. ITS also show promise in reducing achievement gaps. Huang et al. (2016) found that the ALEKS system helped lower-income and lower-performing students catch up to peers, narrowing socioeconomic disparities in math proficiency.
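The difference between answer-only and step-level feedback can be illustrated with a toy example. The sketch below is not how any particular ITS works; it assumes each line of student work is a linear equation a*x + b = c (with a nonzero), written as a tuple, and all function names are hypothetical.

```python
from fractions import Fraction

# Each step is a linear equation a*x + b = c, written as the tuple (a, b, c).
Step = tuple[int, int, int]


def solution(step: Step) -> Fraction:
    """Solve a*x + b = c for x (assumes a is not zero)."""
    a, b, c = step
    return Fraction(c - b, a)


def answer_only_feedback(steps: list[Step], final_answer: Fraction) -> str:
    """Look only at the final answer, as an answer-marking system would."""
    return "correct" if final_answer == solution(steps[0]) else "wrong"


def step_level_feedback(steps: list[Step]) -> list[str]:
    """Check each line and stop at the first one that changes the solution."""
    target = solution(steps[0])
    messages = []
    for number, step in enumerate(steps[1:], start=2):
        if solution(step) == target:
            messages.append(f"step {number}: ok")
        else:
            messages.append(f"step {number}: not equivalent to the line before")
            break
    return messages


# Student solves 2x + 3 = 11 but adds 3 to the right side instead of subtracting.
work = [(2, 3, 11), (2, 0, 14), (1, 0, 7)]
print(answer_only_feedback(work, Fraction(7)))  # wrong
print(step_level_feedback(work))  # ['step 2: not equivalent to the line before']
```

The answer-only check tells the student nothing about where the work went off track, while the step-level check points at the second line, where the sign error happened.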

Why Feedback Matters
The strongest results come from ITS that provide immediate, specific feedback on student errors. Rather than telling students “wrong” at the end of a problem, effective tutors diagnose where the mistake occurred and guide students to correct it in real time (Kulik & Fletcher, 2016; Arnau et al., 2013). This prevents students from practicing errors and ensures each step builds on a solid foundation. Adaptive personalization is equally important. ITS continuously update a “student model,” adjusting problem difficulty and sequencing so learners remain in their optimal zone of challenge. Studies show this leads to higher engagement and better outcomes than one-size-fits-all practice (Nye et al., 2014; Walkington & Bernacki, 2018).
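As a rough illustration of the “student model” idea above, the sketch below keeps a single mastery estimate, nudges it after every response, and maps it to a difficulty level. This is a deliberately simple moving-average stand-in, not the model used by any system cited here; production ITS typically rely on richer approaches such as Bayesian knowledge tracing. The class name, default values, and five-level scale are assumptions.

```python
from dataclasses import dataclass


@dataclass
class StudentModel:
    mastery: float = 0.5        # estimated chance of a correct answer, 0..1
    learning_rate: float = 0.3  # how strongly one response moves the estimate

    def update(self, correct: bool) -> None:
        """Move the mastery estimate toward 1 after a correct answer, toward 0 otherwise."""
        observed = 1.0 if correct else 0.0
        self.mastery += self.learning_rate * (observed - self.mastery)

    def next_difficulty(self, levels: int = 5) -> int:
        """Map mastery to a difficulty level from 1 (easiest) to `levels` (hardest)."""
        return min(levels, 1 + int(self.mastery * levels))


model = StudentModel()
print(model.next_difficulty())  # 3
model.update(correct=True)
print(model.next_difficulty())  # 4
```

A correct answer raises the estimate and the next problem gets harder; a wrong answer lowers it and the next problem gets easier, which keeps practice near the learner's current level.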
The Zipline Approach
At Zipline, we designed our ITS to align with this research. Students show their work step by step, and the system identifies, classifies, and explains their errors in real time. Instead of just marking an answer wrong, Zipline provides feedback that helps students understand why they made the mistake and how to correct it. This gives every student a built-in tutor that adapts to their needs, reduces frustration, and accelerates learning.
Conclusion
The evidence is clear: Intelligent Tutoring Systems can meaningfully improve math proficiency, especially when they provide immediate, step-level feedback and personalized practice. While not a replacement for teachers, ITS offer scalable, tutor-like support that helps every student learn more effectively. By combining teacher expertise with AI-driven step-level feedback, platforms like Zipline deliver the benefits of tutoring at scale – helping more students succeed in mathematics.
Created by teachers for teachers, Zipline transforms math instruction into personalized learning – free to try anytime at zipline.ac.
References
Arnau, D., Arevalillo-Herráez, M., & González-Calero, J. A. (2013). Emulating human supervision in an intelligent tutoring system for arithmetical problem solving. International Journal of Artificial Intelligence in Education, 23(1–4), 146–176. https://doi.org/10.1007/s40593-013-0010-9
Bloom, B. S. (1984). The 2 sigma problem: The search for methods of group instruction as effective as one-to-one tutoring. Educational Researcher, 13(6), 4–16. https://doi.org/10.3102/0013189X013006004
Huang, X., Craig, S. D., Xie, J., Graesser, A. C., & Hu, X. (2016). Intelligent tutoring systems work as a math gap reducer in a 6th-grade after-school program. Learning and Individual Differences, 47, 266–272. https://doi.org/10.1016/j.lindif.2016.02.003
Kraft, M. A., Schueler, B. E., & Falken, G. (2024). What impacts should we expect from tutoring at scale? A systematic review and meta-analysis of randomized controlled trials. EdWorkingPapers. https://doi.org/10.26300/abcd-1234
Kulik, J. A., & Fletcher, J. D. (2016). Effectiveness of intelligent tutoring systems: A meta-analytic review. Review of Educational Research, 86(1), 42–78. https://doi.org/10.3102/0034654315581420
Létourneau, A., Martineau, M. D., Charland, P., Karran, J. A., Boasen, J., & Léger, P. M. (2025). A systematic review of AI-driven intelligent tutoring systems in K–12 education. Journal of Computer Assisted Learning. Advance online publication. https://doi.org/10.1111/jcal.12789
Ma, W., Adesope, O. O., Nesbit, J. C., & Liu, Q. (2014). Intelligent tutoring systems and learning outcomes: A meta-analysis. Journal of Educational Psychology, 106(4), 901–918. https://doi.org/10.1037/a0037123
Nickow, A., Oreopoulos, P., & Quan, V. (2020). The impressive effects of tutoring on PreK–12 learning: A systematic review and meta-analysis of randomized controlled trials. National Bureau of Economic Research Working Paper 27476. https://doi.org/10.3386/w27476
Nye, B. D., Graesser, A. C., & Hu, X. (2014). AutoTutor and family: A review of 17 years of natural language tutoring. International Journal of Artificial Intelligence in Education, 24(4), 427–469. https://doi.org/10.1007/s40593-014-0029-5
VanLehn, K. (2011). The relative effectiveness of human tutoring, intelligent tutoring systems, and other tutoring systems. Educational Psychologist, 46(4), 197–221. https://doi.org/10.1080/00461520.2011.611369
Walkington, C., & Bernacki, M. L. (2018). Personalizing algebra to students’ individual interests in an intelligent tutoring system: Moderators of impact. International Journal of Artificial Intelligence in Education, 28(1), 61–81. https://doi.org/10.1007/s40593-016-0110-2

Donny McChesney is the CTO of Flex Education and a passionate educator dedicated to helping students love math. He began his career as a math teacher, which inspired him to pursue a PhD in Curriculum and Instruction at Florida Atlantic University, where he is currently a doctoral candidate. Donny has presented and published research on topics ranging from strategies for developing educational games to responsible use of AI in K-12 environments. He has written curriculum, developed educational games, and contributed to advancing the understanding of technology’s role in the classroom.
In addition to his educational expertise, Donny is a skilled programmer and AWS microservices architect who has led the development of Zipline. By combining his deep knowledge of education with his programming skills, he builds tools that meet real classroom needs and inspire students to love learning.

