Random Forests for Poverty Classification

Authors

  • Ruben Thoplan Department of Economics and Statistics, Faculty of Social Studies and Humanities, University of Mauritius, R

Keywords:

data mining, classification, poverty, random forests, poverty-gender gap

Abstract

This paper applies a relatively novel data mining method to the problem of poverty classification in Mauritius. The random forests algorithm is applied to census data with a view to improving the classification accuracy of poverty status. The analysis shows that the number of hours worked, age, education and sex are the most important variables in classifying an individual's poverty status. In addition, a clear poverty-gender gap is identified: women are more likely than men to be classified as poor.
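As a rough illustration of the approach summarised above, the following sketch fits a random forest and ranks predictors by importance. It is not the paper's code or data: the data are synthetic, the variable names (`hours_worked`, `age`, `education_years`, `is_female`), the label-generating rule and the choice of scikit-learn are all assumptions made for this example, and the impurity-based importance that scikit-learn reports may differ from the importance measure used in the study.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
n = 2000

# Hypothetical, synthetic stand-ins for the four predictors named in the abstract.
hours_worked = rng.uniform(0, 60, n)
age = rng.uniform(16, 80, n)
education_years = rng.integers(0, 18, n)
is_female = rng.integers(0, 2, n)

# Made-up rule for generating a "poor" label; it is NOT estimated from census data.
score = -0.08 * hours_worked - 0.15 * education_years + 0.6 * is_female + 0.01 * age
prob_poor = 1.0 / (1.0 + np.exp(-(score + 3.0)))
poor = rng.random(n) < prob_poor

X = np.column_stack([hours_worked, age, education_years, is_female])
feature_names = ["hours_worked", "age", "education_years", "is_female"]

# oob_score=True estimates accuracy on the samples each tree did not see
# during its bootstrap draw, so no separate test set is needed here.
forest = RandomForestClassifier(n_estimators=200, oob_score=True, random_state=0)
forest.fit(X, poor)

print(f"Out-of-bag accuracy: {forest.oob_score_:.3f}")

# Rank predictors by impurity-based importance (values sum to 1).
ranked = sorted(zip(feature_names, forest.feature_importances_), key=lambda p: -p[1])
for name, importance in ranked:
    print(f"{name}: {importance:.3f}")
```

Because the labels here come from an invented rule, the resulting ranking says nothing about poverty in Mauritius; the sketch only shows the mechanics of fitting the classifier and reading off variable importances.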

Author Biography

Ruben Thoplan, Department of Economics and Statistics, Faculty of Social Studies and Humanities, University of Mauritius, R

Ruben Thoplan has been a lecturer in Statistics at the University of Mauritius since 2009. He is currently research-active in the field of data mining.

Published

2014-08-12

How to Cite

Thoplan, R. (2014). Random Forests for Poverty Classification. International Journal of Sciences: Basic and Applied Research (IJSBAR), 17(2), 252–259. Retrieved from https://gssrr.org/index.php/JournalOfBasicAndApplied/article/view/2574

Section

Articles