Salary Prediction using Machine Learning and Feature Engineering

Salary Prediction Project

Companies like Glassdoor and Paysa give potential employees and recruiters access to a wide range of information about a company's profile. This information ranges from reviews to job listings and the salaries for the jobs a given company has posted.

Since not all employees are willing to share how much they really make, and those who do share only because they can stay anonymous, we cannot fully see what companies pay for the jobs they hire for.

The problem, then, is missing salary data, even though we do have some insight into what companies hire for.

We can tackle this problem by using Machine Learning to estimate salaries for the jobs we don't have any salary data for.

Here are the files that might be of importance to you...

The Jupyter notebook exploratory_data_analysis.ipynb contains the EDA. It explores the data to show what the distributions look like and how the data is structured.

The notebook called 'Modeling' is a breakdown of the script.py file, executed in chunks to show the progress at each step.

(These two files reside in the notebooks folder.)

The script.py file contains the entire code for the Modeling.py file. This script cleans the data, encodes it, standardizes it, creates 4 different models, and outputs the results and feature importances to a txt file and a csv file respectively. (This file resides in the src folder, where you can see all the source code.)
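To make the preprocessing steps above concrete, here is a minimal sketch of cleaning, encoding, and standardizing using only the Python standard library. It is not the code in script.py: the column names (jobType, yearsExperience, salary), the sample rows, and the cleaning rule (drop non-positive salaries) are all assumptions made for illustration, and the modeling step is left out.

```python
import statistics

# Hypothetical rows; the real column names and values in the project may differ.
rows = [
    {"jobType": "JUNIOR", "yearsExperience": 2, "salary": 60},
    {"jobType": "SENIOR", "yearsExperience": 10, "salary": 130},
    {"jobType": "JUNIOR", "yearsExperience": 4, "salary": 0},  # invalid salary
    {"jobType": "MANAGER", "yearsExperience": 6, "salary": 110},
]


def clean(rows):
    """Drop rows whose salary is missing or not positive (an assumed rule)."""
    return [r for r in rows if r.get("salary") is not None and r["salary"] > 0]


def one_hot(rows, column):
    """Replace a categorical column with one 0/1 column per category."""
    categories = sorted({r[column] for r in rows})
    encoded = []
    for r in rows:
        new_row = {k: v for k, v in r.items() if k != column}
        for cat in categories:
            new_row[f"{column}_{cat}"] = 1 if r[column] == cat else 0
        encoded.append(new_row)
    return encoded


def standardize(rows, column):
    """Rescale a numeric column to zero mean and unit (population) variance."""
    values = [r[column] for r in rows]
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    scale = std if std > 0 else 1.0  # avoid dividing by zero for constant columns
    return [{**r, column: (r[column] - mean) / scale} for r in rows]


prepared = standardize(one_hot(clean(rows), "jobType"), "yearsExperience")
```

The target column (salary) is deliberately left unscaled here; whether the project scales it is not stated.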

All the charts you see in the EDA notebook can be found in the reports section.

The feature importances as ranked by the model can be found in the feature importance csv file in the root folder, and also as an image named feature_importances.png.
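As a rough illustration of how a ranked feature importance csv could be produced, the sketch below sorts importances from highest to lowest and writes them with the standard csv module. The feature names, the importance values, and the header row are made up for the example and are not results or file layouts from this project.

```python
import csv
import io

# Made-up importances for illustration only; not results from this project.
importances = {
    "yearsExperience": 0.3,
    "jobType": 0.5,
    "milesFromMetropolis": 0.2,
}


def write_importances(importances, handle):
    """Write features as CSV rows, sorted from most to least important."""
    ranked = sorted(importances.items(), key=lambda item: item[1], reverse=True)
    writer = csv.writer(handle)
    writer.writerow(["feature", "importance"])  # assumed header names
    writer.writerows(ranked)
    return ranked


# StringIO stands in for a real file opened with open(path, "w", newline="").
buffer = io.StringIO()
ranked = write_importances(importances, buffer)
csv_text = buffer.getvalue()
```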

The predicted salaries are in a csv file in the root folder named predictions_salaries.csv.

