Job_Ad_Detection using Random Tree

This project separates fake job postings from real ones using a random tree model (Random Forest).

Dataset

The dataset can be downloaded from here

About Dataset

[Real or Fake]: Fake Job Description Prediction. This dataset contains 18K job descriptions, of which about 800 are fake. The data consists of both textual information and meta-information about the jobs. It can be used to build classification models that learn to identify fraudulent job descriptions.
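The figures above imply a strongly imbalanced dataset. As a rough worked example, using only the approximate counts quoted in the description (18K and 800), the share of fake postings and the accuracy of a trivial "always real" classifier can be computed as follows. The variable names are illustrative.

```python
# Approximate counts taken from the dataset description above.
total_jobs = 18_000  # "18K job descriptions"
fake_jobs = 800      # "about 800 are fake"

# Fraction of postings that are fake.
fake_share = fake_jobs / total_jobs

# Accuracy of a trivial classifier that labels every posting as real.
majority_baseline_accuracy = 1 - fake_share

print(f"fake share: {fake_share:.1%}")
print(f"always-real baseline accuracy: {majority_baseline_accuracy:.1%}")
```

With these approximate counts, fake postings make up roughly 4.4% of the data, so it can be helpful to look at precision and recall on the fake class alongside overall accuracy.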

Acknowledgements

The University of the Aegean | Laboratory of Information & Communication Systems Security http://emscad.samos.aegean.gr/

Inspiration

The dataset is valuable because it supports the following tasks:

  • Create a classification model that uses text features and meta-features to predict which job descriptions are fraudulent and which are real.

  • Identify key traits/features (words, entities, phrases) of job descriptions that are fraudulent in nature.

  • Run a contextual embedding model to identify the most similar job descriptions.

  • Perform Exploratory Data Analysis on the dataset to identify interesting insights.
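One simple way to start on the task of identifying key words in fraudulent descriptions is to compare word counts between fake and real postings. The sketch below uses only the Python standard library and a handful of made-up postings; the data and the names `postings`, `tokens`, and `skewed_to_fake` are hypothetical and not taken from the dataset or the notebook.

```python
import re
from collections import Counter

# Hypothetical toy postings as (text, is_fake); not taken from the dataset.
postings = [
    ("Earn cash fast from home, no experience needed", True),
    ("Quick cash for simple data entry from home", True),
    ("Backend engineer with Python experience", False),
    ("Nurse needed at regional hospital", False),
]


def tokens(text):
    """Lowercase the text and return its alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


fake_counts = Counter(
    tok for text, is_fake in postings if is_fake for tok in tokens(text)
)
real_counts = Counter(
    tok for text, is_fake in postings if not is_fake for tok in tokens(text)
)

# Words seen more often in fake postings than in real ones,
# ordered from most skewed to least, ties broken alphabetically.
skewed_to_fake = sorted(
    (tok for tok in fake_counts if fake_counts[tok] > real_counts[tok]),
    key=lambda tok: (real_counts[tok] - fake_counts[tok], tok),
)

print(skewed_to_fake[:5])
```

On real data, raw counts like these would normally be normalized by class size, since fake postings are a small minority.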

Setup instructions


Prerequisites

Experience with Jupyter Notebook or Google Colab.

Knowledge of the following is recommended:

pandas

matplotlib

keras

Steps followed

Step 1. Data Preprocessing.

Step 2. Data visualization.

Step 3. Lemmatization.

Step 4. Applying Random Tree.
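The steps above can be sketched roughly as follows. This is a minimal illustration and not the notebook's actual code: it assumes scikit-learn is installed, uses a few made-up postings in place of the real dataset, skips the visualization step, and replaces a real lemmatizer (for example NLTK's `WordNetLemmatizer`) with a tiny hypothetical lookup table named `LEMMAS`. TF-IDF features are also an assumption; the source does not say how text is vectorized.

```python
import re

from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline

# Hypothetical toy data; the real dataset has about 18K rows.
texts = [
    "Earn money fast working from home, no experience required",
    "Send your bank details to receive payments immediately",
    "Easy cash daily, wire transfer fees required upfront",
    "Software engineer needed to build backend services in Python",
    "Registered nurse position at a regional hospital, license required",
    "Accountant with five years of experience in auditing",
]
labels = [1, 1, 1, 0, 0, 0]  # 1 = fake, 0 = real

# Stand-in for a real lemmatizer: a small word-to-lemma lookup table.
LEMMAS = {
    "working": "work",
    "payments": "payment",
    "details": "detail",
    "services": "service",
    "fees": "fee",
    "years": "year",
    "required": "require",
}


def preprocess(text):
    """Steps 1 and 3: lowercase, keep alphabetic tokens, map tokens to lemmas."""
    found = re.findall(r"[a-z]+", text.lower())
    return " ".join(LEMMAS.get(tok, tok) for tok in found)


# Step 4: TF-IDF features (an assumption) fed into a Random Forest.
model = make_pipeline(
    TfidfVectorizer(preprocessor=preprocess),
    RandomForestClassifier(n_estimators=100, random_state=0),
)
model.fit(texts, labels)

# Predictions on the training texts, only to show the call shape.
train_predictions = model.predict(texts)
print(train_predictions)
```

In practice the data would be split with something like `train_test_split` before fitting, so that the reported accuracy is measured on postings the model has not seen.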

Conclusion

  • Random Forest gives the best accuracy, 96%.

Author

Paritosh Tripathi