Random Forest to Predict Students' Performance (in progress...)


Can we predict a student’s performance based on their lifestyle?

As an economist, I have seen many papers that analyze students’ performance using exclusively economic and social indicators, but this is the first dataset I have analyzed that also includes lifestyle data, such as alcohol consumption and relationship status. The data was obtained from school reports on approximately 700 high school students in Portugal and can be found in the UCI Repository. This data holds valuable information, such as trends and patterns, that can be used to improve decision-making and help students succeed.

The goal of this project is to create an easy-to-use tool that lets decision-makers simulate whether a student will pass or fail based on the student’s characteristics, so that they can offer help well before the exam.

As depicted in the figure below, 85% of students pass, meaning they had grades equal to or higher than 10. Here, we are interested in investigating the factors that may be leading the other 15% of students to fail.

[Figure of grades distribution]
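The pass/fail rule above can be sketched in a few lines. This is only an illustration: the grades in the list are made up (they are not taken from the dataset), the names `final_grades`, `PASS_THRESHOLD`, and `label_outcome` are hypothetical, and I am assuming grades on a 0 to 20 scale.

```python
# Hypothetical final grades on an assumed 0-20 scale; not real data from the dataset.
final_grades = [15, 9, 12, 10, 7, 18, 11, 13, 0, 14]

PASS_THRESHOLD = 10  # a grade equal to or higher than 10 counts as a pass


def label_outcome(grade, threshold=PASS_THRESHOLD):
    """Return "pass" when the grade reaches the threshold, otherwise "fail"."""
    return "pass" if grade >= threshold else "fail"


# Turn each numeric grade into the binary target used for classification.
labels = [label_outcome(g) for g in final_grades]

# Share of students labelled "pass" in this toy sample.
pass_rate = labels.count("pass") / len(labels)
print(f"Pass rate: {pass_rate:.0%}")
```

On the real data, the same rule would be applied to the final grade column to build the target the model learns to predict.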


For this project, since the objective is to identify students who might fail, I’m focusing on increasing the recall for the class “fail”, that is, the share of students who actually fail that the model correctly flags.
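To see why recall on “fail” matters more than accuracy here, consider a toy sketch. The labels below are invented to mirror the 85%/15% split, and `recall_for_class` is a hypothetical helper written for illustration, not the code used in the project. A baseline that predicts “pass” for everyone looks accurate but never catches a failing student.

```python
def recall_for_class(y_true, y_pred, target="fail"):
    """Share of true `target` cases that were also predicted as `target`."""
    true_positives = sum(
        1 for t, p in zip(y_true, y_pred) if t == target and p == target
    )
    actual_positives = sum(1 for t in y_true if t == target)
    if actual_positives == 0:
        return 0.0  # no cases of this class, so recall is reported as 0 here
    return true_positives / actual_positives


# Toy labels mirroring the 85% / 15% split: 17 passing and 3 failing students.
y_true = ["pass"] * 17 + ["fail"] * 3

# A naive baseline that predicts "pass" for every student.
y_pred_baseline = ["pass"] * 20

accuracy = sum(t == p for t, p in zip(y_true, y_pred_baseline)) / len(y_true)
fail_recall = recall_for_class(y_true, y_pred_baseline, target="fail")

print(f"Accuracy: {accuracy:.2f}")
print(f"Recall for 'fail': {fail_recall:.2f}")
```

The baseline reaches 85% accuracy while its recall for “fail” is zero, which is exactly the failure mode this project tries to avoid.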



Conclusions & Next Steps