Final Project | Shane Staret

Link to final project video

Project Overview

This dataset poses the problem of predicting student performance in secondary school from data gathered on several hundred students. With 32 input variables, it is a fairly high-dimensional dataset. The goal is to identify which of these variables, if any, help predict how well a student will perform academically. This is an important problem: knowing which variables contribute to student achievement or failure can inform the methods schools use to help students succeed. It can also identify variables that may be irrelevant, allowing those focused on raising student achievement to set them aside and concentrate on the impactful ones.

The dataset describes student achievement in secondary education at two Portuguese schools. The data was collected from school reports and questionnaires. There are 33 attributes in total (one being the target value, which is the final grade of each student). The attributes include student grades and demographic, social, and school-related features. Two datasets are provided, covering performance in two distinct subjects: Mathematics and Portuguese language. The two datasets were modeled under binary classification, five-level classification, and regression tasks. The idea is to predict student performance from the values of the other 32 attributes. The data comes directly from the University of Minho and is hosted by the ICS School at the University of California, Irvine (UCI).
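The three task framings above differ only in how the final grade (G3) is encoded. Below is a minimal sketch of the two classification encodings. It assumes a 0-20 grading scale, a pass mark of 10, and five bands (16-20, 14-15, 12-13, 10-11, 0-9). These cut-offs and the function names are assumptions for illustration and should be checked against the research paper; for regression, G3 would be used as-is.

```python
# Hypothetical helpers: the thresholds below are assumptions, not taken from this README.

def to_binary(g3: int) -> str:
    """Pass/fail label: 'pass' when the final grade is at least 10 out of 20."""
    return "pass" if g3 >= 10 else "fail"


def to_five_level(g3: int) -> str:
    """Five-level label on the 0-20 scale, from 'I' (best) to 'V' (worst)."""
    if not 0 <= g3 <= 20:
        raise ValueError(f"grade out of range: {g3}")
    if g3 >= 16:
        return "I"
    if g3 >= 14:
        return "II"
    if g3 >= 12:
        return "III"
    if g3 >= 10:
        return "IV"
    return "V"
```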

Challenges

Problems arose when modeling the data, specifically when using a Keras neural network (NN). No matter the activation function, the optimizer, the number of hidden nodes, the batch size, or the number of folds, the Keras NN model never performed better than the multiple linear regression model. There may be issues with how the Keras NN model was set up; however, I experimented with it exhaustively and saw minimal improvement.
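For a comparison like this to be fair, both models need to be scored on the same held-out folds. The sketch below shows that k-fold procedure in plain Python. It is not the code from the notebooks: to stay dependency-free it swaps in a one-feature least-squares line and a mean-only baseline in place of the multiple linear regression and Keras models, and all function names and the toy numbers are made up for illustration.

```python
import math
import random


def k_fold_indices(n, k, seed=0):
    """Shuffle the indices 0..n-1 and split them into k nearly equal folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def rmse(y_true, y_pred):
    """Root mean squared error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def fit_mean(xs, ys):
    """Baseline model: always predict the mean of the training targets."""
    m = sum(ys) / len(ys)
    return lambda x: m


def fit_line(xs, ys):
    """Ordinary least squares with a single feature: y = a + b * x."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx if sxx else 0.0
    a = my - b * mx
    return lambda x: a + b * x


def cross_val_rmse(fit, xs, ys, k=5, seed=0):
    """Average held-out RMSE of the model returned by `fit` over k folds."""
    scores = []
    for fold in k_fold_indices(len(xs), k, seed):
        held_out = set(fold)
        train = [i for i in range(len(xs)) if i not in held_out]
        model = fit([xs[i] for i in train], [ys[i] for i in train])
        scores.append(rmse([ys[i] for i in fold], [model(xs[i]) for i in fold]))
    return sum(scores) / k


# Made-up, perfectly linear toy data; not taken from the dataset.
xs = list(range(20))
ys = [2 * x + 1 for x in xs]
line_score = cross_val_rmse(fit_line, xs, ys)
mean_score = cross_val_rmse(fit_mean, xs, ys)
```

Using the same `seed` for every model keeps the folds identical, so a difference in score reflects the model rather than the split.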

Conclusions

Overall, the conclusions of this project are very similar to those of the research paper that evaluated this dataset. Unless the first and second period grades are included, it is difficult to generate any meaningful predictions or a list of variables that are relevant or irrelevant to a student's final scores. In other words, previous student performance is the best predictor of future student performance. Of course, other variables contribute negatively (previous failures, high weekday drinking, etc.) and positively (good family relationship, school support, etc.), but their influence does not appear to be as great as that of the previous performance variables (G1 & G2).

While few strong relationships could be established between the input variables and student performance, many irrelevant variables were found. These results could be used to focus on the specific variables that appear to have a relevant influence on student performance, while putting less emphasis on attributes that appear to have little effect.
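One simple way to compare the influence of variables, with and without the earlier grades, is to rank numeric columns by their absolute Pearson correlation with G3. The sketch below is an illustration under assumptions rather than the analysis from the notebooks: the helper names are hypothetical, `failures` is an assumed column name for previous failures, and the rows are made up.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient; returns 0.0 when either input is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def rank_by_correlation(columns, target="G3", exclude=()):
    """Sort numeric columns by |correlation| with the target, strongest first."""
    y = columns[target]
    scores = {
        name: pearson(values, y)
        for name, values in columns.items()
        if name != target and name not in exclude
    }
    return sorted(scores.items(), key=lambda kv: abs(kv[1]), reverse=True)


# Made-up rows for illustration only; not taken from the dataset.
toy = {
    "G2": [8, 10, 12, 14, 16],
    "failures": [2, 1, 1, 0, 0],
    "G3": [7, 10, 12, 15, 16],
}
ranking = rank_by_correlation(toy)
ranking_without_g2 = rank_by_correlation(toy, exclude=("G2",))
```

Correlation only captures linear, one-variable-at-a-time relationships, so a low score here suggests, but does not prove, that a variable is irrelevant.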

Important Resources

Link to dataset

Link to research paper using this dataset

shane-staret/Student-Peformance-Prediction-Bucknell-CSCI-349

Folders and files

Latest commit

History

Repository files navigation

Final Project | Shane Staret

Link to final project video

Project Overview

Challenges

Conclusions

Important Resources

Link to dataset

Link to research paper using this dataset

About

Topics

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages