A Comparison between Random Forest and Mixed Effects Random Forest to Predict Students' Math Performance in Indonesia
Keywords: Hierarchical structure, Mixed effects random forest, Random forest, Prediction accuracy
The Center for Assessment and Learning in Indonesia developed a national assessment system called the Indonesian Student Competency Assessment (AKSI/Asesmen Kompetensi Siswa Indonesia) to measure the competence of school graduates. The results of AKSI 2019 showed that Indonesian students had poor math performance. However, student scores depend not only on individual students but also on the quality of the school they attend, which indicates that the data have a hierarchical structure. This study therefore compares the mixed effects random forest (MERF) method with random forest (RF) for predicting students’ math performance on AKSI 2019. The data set was divided into a training set (85% of the original data) and a test set (15% of the original data). The MERF model was built on the training data by estimating both the fixed part and the random-effects part of the model, whereas RF ignores the random effects. The response variable on the test data was then predicted using the fitted MERF model. These steps were repeated 10 times to obtain more reliable results. The models were then evaluated by calculating the mean square error of prediction (MSEP), root mean square error of prediction (RMSEP), and mean absolute error of prediction (MAEP) across the 10 fitted models. The results showed that the MERF model produced higher prediction accuracy than the RF model. This means that school-level variables affect student performance, and that including these effects in the model can increase prediction accuracy.
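The evaluation protocol described above (repeated 85%/15% hold-out splits scored by MSEP, RMSEP, and MAEP) can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' implementation: the data are simulated, all function names are hypothetical, and the two predictors are deliberately simple stand-ins rather than RF or MERF. The grand-mean predictor ignores school membership, while the school-mean predictor is a crude proxy for a model that accounts for a school-level effect.

```python
import math
import random
from statistics import mean


def prediction_errors(y_true, y_pred):
    """Return (MSEP, RMSEP, MAEP) for paired observed and predicted values."""
    residuals = [t - p for t, p in zip(y_true, y_pred)]
    msep = mean(r * r for r in residuals)
    maep = mean(abs(r) for r in residuals)
    return msep, math.sqrt(msep), maep


def holdout_split(n, test_fraction, rng):
    """Shuffle indices 0..n-1 and split them into (train, test) index lists."""
    indices = list(range(n))
    rng.shuffle(indices)
    n_test = round(n * test_fraction)
    return indices[n_test:], indices[:n_test]


def repeated_holdout(x, y, fit_predict, repeats=10, test_fraction=0.15, seed=0):
    """Average (MSEP, RMSEP, MAEP) over repeated random hold-out splits."""
    rng = random.Random(seed)
    runs = []
    for _ in range(repeats):
        train, test = holdout_split(len(y), test_fraction, rng)
        y_pred = fit_predict(
            [x[i] for i in train], [y[i] for i in train], [x[i] for i in test]
        )
        runs.append(prediction_errors([y[i] for i in test], y_pred))
    # zip(*runs) groups the per-run values of each metric together.
    return tuple(mean(values) for values in zip(*runs))


def grand_mean_predictor(schools_train, y_train, schools_test):
    """Stand-in for a model that ignores the school level (not RF itself)."""
    overall = mean(y_train)
    return [overall for _ in schools_test]


def school_mean_predictor(schools_train, y_train, schools_test):
    """Stand-in for a model with a school-level effect (not MERF itself)."""
    overall = mean(y_train)
    by_school = {}
    for school, score in zip(schools_train, y_train):
        by_school.setdefault(school, []).append(score)
    school_means = {s: mean(v) for s, v in by_school.items()}
    # Fall back to the overall mean for schools unseen in training.
    return [school_means.get(s, overall) for s in schools_test]


# Simulated hierarchical data (assumption): 20 schools, 30 students each.
data_rng = random.Random(42)
schools, scores = [], []
for school_id in range(20):
    school_effect = data_rng.gauss(0, 10)  # shared by all students in a school
    for _ in range(30):
        schools.append(school_id)
        scores.append(50 + school_effect + data_rng.gauss(0, 5))

pooled = repeated_holdout(schools, scores, grand_mean_predictor)
clustered = repeated_holdout(schools, scores, school_mean_predictor)
print("ignoring schools  (MSEP, RMSEP, MAEP):", pooled)
print("using school mean (MSEP, RMSEP, MAEP):", clustered)
```

Because both predictors are called with the same seed, they are scored on identical splits, so any difference in the averaged errors reflects the predictors rather than the partitioning. In this simulation the school-level variance is set larger than the student-level noise, so the predictor that uses school membership should show lower errors; real data need not behave this way.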