A showcase for using machine learning in human resources.
The goal of the project is to showcase how machine learning can be used to make salary structures fairer. This may involve, for example, identifying and eliminating potential gender pay disparities, or supporting fair decisions on pay raises and new hires.
A live demo of the model in action can be visited at ijusttyped.github.io/fair-compensation-frontend.
The project is structured as follows:
├── .dvc                <- Metadata for DVC.
├── .github             <- GitHub Actions workflows for CI/CD.
├── artefacts           <- Artefacts produced when executing the model training.
├── data
│   └── interim         <- Intermediate data that has been transformed.
├── src
│   ├── api             <- Package exposing the project's functionality as a callable API.
│   ├── data_loading    <- Functionality to load data.
│   ├── modelling       <- Functionality to build and use ML models.
│   ├── preprocessing   <- Functionality to preprocess data for modelling.
│   └── utils           <- Common functions used throughout the project.
├── docker-compose.yaml <- Docker Compose file for testing the API locally.
├── Dockerfile          <- File to build the Docker container for the API.
├── dvc.yaml            <- DVC pipeline to execute the model training.
├── LICENSE
├── README.md           <- The top-level README for developers using this project.
└── requirements.txt    <- The project dependencies.
The recommended way of installing the project's dependencies is to use a virtual environment. To create a virtual environment, run the venv module inside the repository:
python3 -m venv venv
Once you have created the virtual environment, you can activate it by running:
source ./venv/bin/activate
To install the dependencies, run:
python3 -m pip install -r requirements.txt
For code execution, make sure that the src directory is part of your PYTHONPATH:
export PYTHONPATH=$PWD:$PWD/src/
The data used in the project can be downloaded from Kaggle. It has to be placed in the data/raw folder.
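For example, assuming the Kaggle download is a ZIP archive (the archive name below is a hypothetical placeholder; use the file you actually downloaded), the data can be placed like this:
# <kaggle-dataset>.zip is a placeholder for the downloaded archive
mkdir -p data/raw
unzip ~/Downloads/<kaggle-dataset>.zip -d data/raw/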
We use DVC to version the raw data, interim results and artefacts.
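If you have access to a configured DVC remote (not part of this repository), the versioned files can be fetched and checked with the standard DVC commands:
dvc pull      # download raw data, interim results and artefacts from the remote
dvc status    # check whether your workspace is in sync with the pipeline outputs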
We use the pipelining functionality of DVC to streamline the model training. The pipeline is structured as follows:
                 +--------------+
                 | data/raw.dvc |
                 +--------------+
                        *
                        *
                        *
                    +------+
                    | load |
                    +------+
                 ***        ***
               **              **
             **                  **
  +----------------+          +---------------+
  | clean-features |          | clean-targets |
  +----------------+          +---------------+
           *                          *
           *                          *
           *                          *
+--------------------+      +-------------------+
| transform-features |      | transform-targets |
+--------------------+      +-------------------+
                 ***        ***
                    **    **
                      ****
                    +-------+
                    | train |
                    +-------+
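The graph of the pipeline can be printed locally at any time with DVC's built-in command:
dvc dag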
To reproduce the results, run:
dvc repro
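DVC caches the outputs of each stage, so only stages whose dependencies have changed are re-executed. To reproduce a single stage (together with its changed upstream stages), pass its name, for example:
dvc repro train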
To show the resulting model metrics, run:
dvc metrics show
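To compare the metrics of your current workspace against the last committed version, you can also run:
dvc metrics diff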
To test the API locally, you can spin up the container by running:
docker-compose up
The documentation of the API can then be seen at localhost:8000/docs.
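As a quick smoke test (assuming the container is running and the port mapping from docker-compose.yaml is unchanged), you can check that the service responds:
curl -i http://localhost:8000/docs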
The API is hosted on Render. The documentation of the live API can be seen at:
https://fair-compensation-backend.onrender.com/docs
NOTE: To save resources, the service is shut down automatically when it has not been used for some time. It might therefore take a moment to start up again the first time you open the link.
Please be aware that this project serves as a showcase. For additional information, suggestions for improvement, or collaboration, feel free to contact me.