A Review on Prediction of Academic Performance of Students at-Risk Using Data Mining Techniques

  • Preet Kamal CURIN, Chitkara University, Punjab, India
  • Sachin Ahuja sachin.ahuja@chitkara.edu.in
Keywords: Data Mining (DM), Educational data mining (EDM), Education System


Educational data mining is the process of converting raw data collected from educational databases into useful information. It can help in framing and answering research questions such as predicting students’ academic performance and identifying the factors that affect it. It can also help teachers understand the difficulties students face with the course content and the complexity of the subject, so that they can take timely action to control the dropout rate. This includes improving the teaching-learning process so that interventions can be made at the right time to improve student performance. This paper reviews the research work done in the field of educational data mining for the prediction of students’ performance. It examines the factors that influence student performance, i.e. the type of classroom attended (traditional or on-line), the socio-economic and educational background of the family, attitude toward studies, and challenges faced by the students during course progress. These factors lead to the categorization of students into three groups: “Low-Risk”, who have a high probability of succeeding; “Medium-Risk”, who may succeed in their examination; and “High-Risk”, who have a high probability of failing or dropping out. The paper elaborates on different ways to improve the teaching-learning process by providing students with personal assistance, notes, class assignments and special class tests. The most efficient techniques used in educational data mining, such as classification, regression, clustering and prediction, are also reviewed.
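As an illustration only, the three-group categorization described in the abstract could be sketched as a mapping from a predicted probability of success to a risk label. The cut-off values and the function name `risk_group` below are assumptions made for this sketch; the paper does not specify numeric thresholds or a particular implementation.

```python
# Hypothetical cut-offs chosen for illustration; the review gives no numeric thresholds.
LOW_RISK_MIN = 0.7    # at or above this probability of success: "Low-Risk"
HIGH_RISK_MAX = 0.4   # below this probability of success: "High-Risk"


def risk_group(p_success: float) -> str:
    """Map a predicted probability of success to one of the three risk groups."""
    if not 0.0 <= p_success <= 1.0:
        raise ValueError("p_success must lie in [0, 1]")
    if p_success >= LOW_RISK_MIN:
        return "Low-Risk"      # high probability of succeeding
    if p_success < HIGH_RISK_MAX:
        return "High-Risk"     # high probability of failing or dropping out
    return "Medium-Risk"       # may succeed in the examination


if __name__ == "__main__":
    # The probabilities would normally come from a classifier's output.
    for p in (0.9, 0.55, 0.2):
        print(p, risk_group(p))
```

In practice the probabilities would come from whichever classification or regression model is used, and the cut-offs would be tuned to the institution's own data.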



Razzaq, Leena, and Neil T. Heffernan. “Scaffolding vs. hints in the Assistment System” International Conference on Intelligent Tutoring Systems. Springer Berlin Heidelberg, 2006.

Han, J., Kamber, M. (2006). Data Mining: Concepts and Techniques, Morgan Kaufmann Publisher.

Feng, Mingyu, and Neil T. Heffernan. “Informing teachers live about student learning: Reporting in the assistment system” Technology Instruction Cognition and Learning 3.1/2 pp. 63 (2006).

Tahir, Syed, and S. R. Naqvi. “FACTORS AFFECTING STUDENTS’ PERFORMANCE.” Bangladesh e-journal of sociology 3, no. 1 pp. 2 (2006).

Mierswa, Ingo, Michael Wurst, Ralf Klinkenberg, Martin Scholz, and Timm Euler. “Yale: Rapid prototyping for complex data mining tasks.” In Proceedings of the 12th ACM SIGKDD international conference on Knowledge discovery and data mining, pp. 935-940. ACM, 2006.

Nghe, Nguyen Thai, Paul Janecek, and Peter Haddawy. “A comparative analysis of techniques for predicting academic performance.” In 2007 37th Annual Frontiers In Education Conference-Global Engineering: Knowledge Without Borders, Opportunities Without Passports, pp. T2G-7. IEEE, 2007. http://dx.doi.org/10.1109/FIE.2007.4417993.

Superby, Juan-Francisco, J. P. Vandamme, and N. Meskens. “Determination of factors influencing the achievement of the first-year university students using data mining methods.” In Workshop on Educational Data Mining, vol. 32, p. 234. 2006.

Romero, Cristóbal, Sebastián Ventura, Pedro G. Espejo, and César Hervás. “Data mining algorithms to classify students.” In Educational Data Mining 2008.

Feng, Mingyu, Joseph Beck, Neil Heffernan, and Kenneth Koedinger. “Can an Intelligent Tutoring System Predict Math Proficiency as Well as a Standardized Test?.” In Educational Data Mining 2008.

Antunes, Cláudia. “Acquiring background knowledge for intelligent tutoring systems.” In Educational Data Mining 2008.

Amershi, Saleema, and Cristina Conati. “Combining Unsupervised and Supervised Classification to Build User Models for Exploratory.” JEDM-Journal of Educational Data Mining 1, no. 1 pp.18-71 (2009).

Baker, Ryan SJD, and Kalina Yacef. “The state of educational data mining in 2009: A review and future visions.” JEDM-Journal of Educational Data Mining 1, no. 1 pp. 3-17 (2009).

Ayers, Elizabeth, Rebecca Nugent, and Nema Dean. “A Comparison of Student Skill Knowledge Estimates.” International Working Group on Educational Data Mining (2009).

Hall, Mark, Eibe Frank, Geoffrey Holmes, Bernhard Pfahringer, Peter Reutemann, and Ian H. Witten. “The WEKA data mining software: an update.” ACM SIGKDD explorations newsletter 11, no. 1 pp. 10-18 (2009).

Barnes, T., M. Desmarais, C. Romero, and S. Ventura. “Educational Data Mining 2009: 2nd International Conference on Educational Data Mining.” Proceedings Cordoba, Spain (2009).

Romero, Cristóbal, and Sebastián Ventura. “Educational data mining: a review of the state of the art.” IEEE Transactions on Systems, Man, and Cybernetics, Part C (Applications and Reviews) 40, no. 6 pp. 601-618 (2010).

Razzaq, Leena, Jozsef Patvarczki, Shane F. Almeida, Manasi Vartak, Mingyu Feng, Neil T. Heffernan, and Kenneth R. Koedinger. “The Assistment Builder: Supporting the life cycle of tutoring system content creation.” IEEE Transactions on Learning Technologies 2, no. 2 pp. 157-166. (2009). http://dx.doi.org/10.1109/TLT.2009.23.

Ramaswami, M., and R. Bhaskaran. “A CHAID based performance prediction model in educational data mining.” arXiv preprint arXiv:1002.1144 (2010).

Kumar, S. Anupama, and M. N. Vijayalakshmi. “Efficiency of decision trees in predicting students’ academic performance.” In First International Conference, on Computer Science, Engineering and Applications, CS and IT, vol. 2, pp. 335-343. (2011).

Kumar, Varun, and Anupama Chadha. “An empirical study of the applications of data mining techniques in higher education.” International Journal of Advanced Computer Science and Applications 2, no. 3 (2011). http://dx.doi.org/10.14569/IJACSA.2011.020314.

Shih, Benjamin, Kenneth R. Koedinger, and Richard Scheines. “A response time model for bottom-out hints as worked examples.” Handbook of educational data mining pp. 201-212 (2011).

Yadav, Surjeet Kumar, Brijesh Bharadwaj, and Saurabh Pal. “Data mining applications: A comparative study for predicting students’ performance.” arXiv preprint arXiv:1202.4815 (2012).

Kabakchieva, Dorina. “Student performance prediction by using data mining classification algorithms.” International Journal of Computer Science and Management Research 1, no. 4 pp. 686-690 (2012).

Tair, Mohammad M. Abu, and Alaa M. El-Halees. “Mining educational data to improve students’ performance: a case study.” International Journal of Informational 2, no.2 (2012).

Asiri, Mahdi, and Behrouz Minaei. “Predicting GPA and academic dismissal in LMS using educational data mining: A case mining.” In 6th National and 3rd International conference of e-Learning and e-Teaching, pp. 53-58. IEEE, 2012.

Wolff, Annika, Zdenek Zdrahal, Andriy Nikolov, and Michal Pantucek. “Improving retention: predicting at-risk students by analysing clicking behaviour in a virtual learning environment.” In Proceedings of the third international conference on learning analytics and knowledge, pp. 145-149. ACM, 2013.

Demšar, Janez, and Blaž Zupan. “Orange: Data Mining Fruitful and Fun-A Historical Perspective.” Informatica 37, no. 1 (2013).

Wolff, Annika, Zdenek Zdrahal, Drahomira Herrmannova, and Petr Knoth. “Predicting student performance from combined data sources.” In Educational Data Mining, pp. 175-202. Springer International Publishing, 2014.

Pandey, Mrinal, and Vivek Kumar Sharma. “A decision tree algorithm pertaining to the student performance analysis and prediction.” International Journal of Computer Applications 61, no. 13 (2013).

Huang, Shaobo, and Ning Fang. “Predicting student academic performance in an engineering dynamics course: A comparison of four types of predictive mathematical models.” Computers & Education 61 pp. 133-145. (2013). https://doi.org/10.1016/j.compedu.2012.08.015.

Romero, Cristóbal, Manuel-Ignacio López, Jose-María Luna, and Sebastián Ventura. “Predicting students’ final performance from participation in on-line discussion forums.” Computers & Education 68 pp. 458-472. (2013). https://doi.org/10.1016/j.compedu.2013.06.009.

Marquez-Vera, Carlos, Cristóbal Romero Morales, and Sebastián Ventura Soto. “Predicting school failure and dropout by using data mining techniques.” IEEE Revista Iberoamericana de Tecnologias del Aprendizaje 8, no. 1 pp. 7-14. (2013). https://doi.org/10.1109/RITA.2013.2244695.

Hlosta, Martin, Drahomira Herrmannova, Lucie Vachova, Jakub Kuzilek, Zdenek Zdrahal, and Annika Wolff. “Modelling student online behaviour in a virtual learning environment.” (2014).

Wolff, Annika, Zdenek Zdrahal, Drahomira Herrmannova, Jakub Kuzilek, and Martin Hlosta. “Developing predictive models for early detection of at-risk students on distance learning modules.” (2014).

Patil, Priyanka Anandrao, and R. V. Mane. “Prediction of Students Performance Using Frequent Pattern Tree.” In Computational Intelligence and Communication Networks (CICN), 2014 International Conference on, pp. 1078-1082. IEEE, 2014.

Ahmed, Abeer Badr El-Din, and Ibrahim Sayed Elaraby. “Data Mining: A prediction for Students’ Performance Using Classification Method.” World Journal of Computer Application and Technology 2, no. 2 pp. 43-47 (2014).

Kuzilek, Jakub, Martin Hlosta, Drahomira Herrmannova, Zdenek Zdrahal, and Annika Wolff. “OU Analyse: analysing at-risk students at The Open University.” Learning Analytics Review pp. 1-16 (2015).

Ahmad, Fadhilah, Nur Hafieza Ismail, and Azwa Abdul Aziz. “The Prediction of Students’ Academic Performance Using Classification Data Mining Techniques.” Applied Mathematical Sciences 9, no.129 pp. 6415-6426. (2015).

Saxena, Ritika. “Educational Data Mining: Performance Evaluation of Decision Tree and Clustering Techniques using WEKA Platform.” International Journal of Computer Science and Business Informatics (2015).

Deshpande, Akshay, Prashant Pimpare, Shashank Bhujbal, Abhishek Kommwar, and Jagruti Wagh. “Student Performance Analysis, Visualization and Prediction Using Data Mining Techniques.” Imperial Journal of Interdisciplinary Research 2, no. 5 (2016).

Puyalnithi, Thendral, V. Madhu Viswanatham, and Ashmeet Singh. “Comparison of Performance of Various Data Classification Algorithms with Ensemble Methods Using RAPIDA MINER.” International Journal 6, no. 5 (2016).

Rana, Shiwani, and Roopali Garg. “Evaluation of Students’ Performance of an Institute Using Clustering Algorithms.” International Journal of Applied Engineering Research 11, no. 5 pp. 3605-3609. (2016).

Ostrow, Korinn S., and Neil T. Heffernan. “Studying Learning at Scale with the ASSISTments TestBed.” In Proceedings of the Third (2016) ACM Conference on Learning@ Scale, pp. 333-334. ACM, 2016.

Asif, R., Haider, N. G., & Ali, S. A. (2016). Prediction of Undergraduate Students’ Performance using Data Mining Methods. International Journal of Computer Science and Information Security, 14(5), 374.

Kumar, M., Shambhu, S., & Aggarwal, P. (2016). Recognition of Slow Learners Using Classification Data Mining Techniques. Imperial Journal of Interdisciplinary Research, 2(12).

Ferrara, S., & Way, D. (2016). Design and Development of End-of-Course Tests for Student Assessment and Teacher Evaluation. In Meeting the Challenges to Measurement in an Era of Accountability (p. 11). Routledge.

Alcala-Fdez, J., Garcia, S., Fernandez, A., Luengo, J., Gonzalez, S., Saez, J. A., ... & Herrera, F. (2016). Comparison of KEEL versus open source Data Mining tools: Knime and Weka software.

Read, J., Reutemann, P., Pfahringer, B., & Holmes, G. (2016). Meka: a multi-label/multi-target extension to weka. Journal of Machine Learning Research, 17(21), 1-5.

Atinaf, W., & Petros, P. (2016). Socio-Economic Factors Affecting Female Students Academic Performance at Higher Education. Health Care: Current Reviews, 4(163), 2.

Feng, M., & Roschelle, J. (2016, April). Predicting Students’ Standardized Test Scores Using Online Homework. In Proceedings of the Third (2016) ACM Conference on Learning@ Scale (pp. 213-216). ACM.

How to Cite
Preet Kamal, & Sachin Ahuja. (2017). A Review on Prediction of Academic Performance of Students at-Risk Using Data Mining Techniques. Journal on Today’s Ideas - Tomorrow’s Technologies, 5(1), 30-39. https://doi.org/10.15415/jotitt.2017.51002