Copyright (c) 2024, ECOLS - All rights reserved.
The code for BERT vs. Autoencoder in Analyzing Depression from Text Data was written in Python by the student Anuraag Raj, with Zain Ali contributing the BERT code and providing the encoded vectors.
Paper: Depression Detection using BERT on Social Media Platforms. Paper presented at the IEEE International Conference on Artificial Intelligence in Engineering and Technology (IICAIET 2024).
Authors: Anuraag Raj, Zain Ali, Shonal Chaudhary, Kavitesh Bali and Anuraganand Sharma.
This repository contains code for detecting depression in social media posts, using BERT for feature extraction and an autoencoder for dimensionality reduction. The project identifies signs of depression in text data by combining BERT's strength at capturing contextual information with an autoencoder's ability to reduce feature dimensionality.
- BERT Model: Utilizes a pre-trained BERT model for effective feature extraction from text data.
- Autoencoder: Reduces the dimensionality of extracted features to improve classification performance.
- Data Preprocessing: Includes steps for data cleaning, tokenization, and preparation.
- Model Training: Scripts for training the autoencoder and fine-tuning the BERT model.
- Evaluation: Performance metrics such as accuracy, precision, recall, and F1 score for model assessment.
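The feature-extraction step above can be sketched as follows. This is a minimal, illustrative example using the Hugging Face `transformers` library with `bert-base-uncased`; the repository's actual preprocessing and pooling strategy may differ (here the [CLS] token embedding is taken as the 768-dimensional feature vector).

```python
# Sketch of BERT feature extraction (assumed setup, not the repo's exact code).
# Each input text is mapped to the 768-d [CLS] embedding of bert-base-uncased.
import torch
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")
model.eval()

def extract_features(texts):
    """Return one 768-d [CLS] vector per input text."""
    batch = tokenizer(texts, padding=True, truncation=True,
                      max_length=128, return_tensors="pt")
    with torch.no_grad():
        out = model(**batch)
    return out.last_hidden_state[:, 0, :]  # [CLS] token embeddings

features = extract_features(["I feel hopeless today", "Great day outside"])
print(features.shape)  # → torch.Size([2, 768])
```

Vectors produced this way are what `merged_tensors_with_labels.csv` stores, paired with the class labels.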
- Python 3.x
- PyTorch
- Transformers (Hugging Face)
- Scikit-learn
- Matplotlib
- Clone the repository:
git clone https://github.com/anuraag165/Depression_BERT_Autoencoder.git
- Navigate to the project directory:
cd Depression_BERT_Autoencoder
- Install the required dependencies:
pip install -r requirements.txt
- Prepare your dataset and place it in the `data/` directory. The dataset should include:
  - `depression_dataset_reddit_cleaned.csv`: Original data retrieved from Kaggle.
  - `merged_tensors_with_labels.csv`: Data encoded using BERT.
- Run the training script located in the `py/` directory:
python py/bert_v_autoencoder.py
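The training script pairs BERT features with a reconstruction autoencoder. A minimal sketch of such an autoencoder in PyTorch is shown below; the layer sizes and latent dimension (64 here) are assumptions for illustration, not necessarily those used in `bert_v_autoencoder.py`.

```python
# Minimal autoencoder sketch (assumed architecture; the script's actual
# layers and latent size may differ). Compresses 768-d BERT vectors to a
# smaller latent code by minimizing reconstruction error.
import torch
import torch.nn as nn

class Autoencoder(nn.Module):
    def __init__(self, in_dim=768, latent_dim=64):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Linear(in_dim, 256), nn.ReLU(),
            nn.Linear(256, latent_dim))
        self.decoder = nn.Sequential(
            nn.Linear(latent_dim, 256), nn.ReLU(),
            nn.Linear(256, in_dim))

    def forward(self, x):
        z = self.encoder(x)          # reduced representation
        return self.decoder(z), z    # reconstruction and latent code

model = Autoencoder()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()

x = torch.randn(32, 768)             # stand-in for a batch of BERT vectors
for _ in range(5):                   # a few reconstruction steps
    recon, z = model(x)
    loss = loss_fn(recon, x)
    opt.zero_grad()
    loss.backward()
    opt.step()

print(z.shape)  # latent features: torch.Size([32, 64])
```

After training, the encoder's latent codes replace the raw 768-d vectors as inputs to the downstream classifier.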
To run the code in Google Colab:
- Open the notebook file in the `ipynb/` folder: `Colab_Notebook.ipynb`
- Follow these steps to set up the notebook in Colab:
  - Upload the dataset files (`depression_dataset_reddit_cleaned.csv` and `merged_tensors_with_labels.csv`) to the Colab environment.
  - Ensure all required libraries are installed in the Colab environment by running:
!pip install torch transformers scikit-learn matplotlib
  - Run the cells in the notebook to preprocess data, train models, and evaluate performance.
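The evaluation step reports accuracy, precision, recall, and F1 score. A minimal scikit-learn sketch is below; the labels are illustrative stand-ins, not results from the paper or repository (the real script scores held-out predictions from the trained classifier).

```python
# Sketch of the evaluation metrics (illustrative labels only).
from sklearn.metrics import (accuracy_score, precision_score,
                             recall_score, f1_score)

y_true = [1, 0, 1, 1, 0, 0, 1, 0]   # 1 = depressed, 0 = not depressed
y_pred = [1, 0, 1, 0, 0, 1, 1, 0]   # hypothetical classifier output

print("Accuracy: ", accuracy_score(y_true, y_pred))   # 0.75
print("Precision:", precision_score(y_true, y_pred))  # 0.75
print("Recall:   ", recall_score(y_true, y_pred))     # 0.75
print("F1 score: ", f1_score(y_true, y_pred))         # 0.75
```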
This project uses the following datasets:
- `depression_dataset_reddit_cleaned.csv`: Original dataset retrieved from Kaggle.
- `merged_tensors_with_labels.csv`: Data encoded using BERT.
This project is built upon the work of researchers and developers in the fields of NLP and deep learning, particularly those who developed BERT and autoencoders.
- Anuraag Raj
- Zain Ali