Copyright (c) 2024, ECOLS - All rights reserved.
The code for BERT vs. Autoencoder in Analyzing Depression from Text Data was written in Python by the student Anuraag Raj, with Zain Ali contributing the BERT code and providing the encoded vectors.
Paper: Depression Detection using BERT on Social Media Platforms. Paper presented at the IEEE International Conference on Artificial Intelligence in Engineering and Technology (IICAIET 2024).
Authors: Anuraag Raj, Zain Ali, Shonal Chaudhary, Kavitesh Bali and Anuraganand Sharma.
This repository contains code for detecting depression in social media posts, using BERT for feature extraction and an autoencoder for dimensionality reduction. The project identifies signs of depression in text data by combining BERT's strength at capturing contextual information with an autoencoder's ability to reduce feature dimensionality.
- BERT Model: Utilizes a pre-trained BERT model for effective feature extraction from text data.
- Autoencoder: Reduces the dimensionality of extracted features to improve classification performance.
- Data Preprocessing: Includes steps for data cleaning, tokenization, and preparation.
- Model Training: Scripts for training the autoencoder and fine-tuning the BERT model.
- Evaluation: Performance metrics such as accuracy, precision, recall, and F1 score for model assessment.
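The feature-extraction step above can be sketched as follows. This is a minimal, illustrative example using the Hugging Face `transformers` library with `bert-base-uncased`; the repository's actual preprocessing and pooling strategy may differ (here the [CLS] token embedding is taken as the 768-dimensional feature vector).

```python
# Sketch of BERT feature extraction (assumed setup, not the repo's exact code).
# Each input text is mapped to the 768-d [CLS] embedding of bert-base-uncased.
import torch
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")
model.eval()

def extract_features(texts):
    """Return one 768-d [CLS] vector per input text."""
    batch = tokenizer(texts, padding=True, truncation=True,
                      max_length=128, return_tensors="pt")
    with torch.no_grad():
        out = model(**batch)
    return out.last_hidden_state[:, 0, :]  # [CLS] token embeddings

features = extract_features(["I feel hopeless today", "Great day outside"])
print(features.shape)  # → torch.Size([2, 768])
```

Vectors produced this way are what `merged_tensors_with_labels.csv` stores, paired with the class labels.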
- Python 3.x
- PyTorch
- Transformers (Hugging Face)
- Scikit-learn
- Matplotlib
- Clone the repository:
git clone https://github.com/anuraag165/Depression_BERT_Autoencoder.git
- Navigate to the project directory:
cd Depression_BERT_Autoencoder
- Install the required dependencies:
pip install -r requirements.txt
- Prepare your dataset and place it in the `data/` directory. The dataset should include:
  - `depression_dataset_reddit_cleaned.csv`: Original data retrieved from Kaggle.
  - `merged_tensors_with_labels.csv`: Data encoded using BERT.
- Run the training script located in the `py/` directory:
python py/bert_v_autoencoder.py
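The training script pairs BERT features with a reconstruction autoencoder. A minimal sketch of such an autoencoder in PyTorch is shown below; the layer sizes and latent dimension (64 here) are assumptions for illustration, not necessarily those used in `bert_v_autoencoder.py`.

```python
# Minimal autoencoder sketch (assumed architecture; the script's actual
# layers and latent size may differ). Compresses 768-d BERT vectors to a
# smaller latent code by minimizing reconstruction error.
import torch
import torch.nn as nn

class Autoencoder(nn.Module):
    def __init__(self, in_dim=768, latent_dim=64):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Linear(in_dim, 256), nn.ReLU(),
            nn.Linear(256, latent_dim))
        self.decoder = nn.Sequential(
            nn.Linear(latent_dim, 256), nn.ReLU(),
            nn.Linear(256, in_dim))

    def forward(self, x):
        z = self.encoder(x)          # reduced representation
        return self.decoder(z), z    # reconstruction and latent code

model = Autoencoder()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()

x = torch.randn(32, 768)             # stand-in for a batch of BERT vectors
for _ in range(5):                   # a few reconstruction steps
    recon, z = model(x)
    loss = loss_fn(recon, x)
    opt.zero_grad()
    loss.backward()
    opt.step()

print(z.shape)  # latent features: torch.Size([32, 64])
```

After training, the encoder's latent codes replace the raw 768-d vectors as inputs to the downstream classifier.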
To run the code in Google Colab:
- Open the notebook file in the `ipynb/` folder: `Colab_Notebook.ipynb`
- Follow these steps to set up the notebook in Colab:
  - Upload the dataset files (`depression_dataset_reddit_cleaned.csv` and `merged_tensors_with_labels.csv`) to the Colab environment.
  - Ensure all required libraries are installed in the Colab environment by running:
!pip install torch transformers scikit-learn matplotlib
  - Run the cells in the notebook to preprocess data, train models, and evaluate performance.
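The evaluation step reports accuracy, precision, recall, and F1 score. A minimal scikit-learn sketch is below; the labels are illustrative stand-ins, not results from the paper or repository (the real script scores held-out predictions from the trained classifier).

```python
# Sketch of the evaluation metrics (illustrative labels only).
from sklearn.metrics import (accuracy_score, precision_score,
                             recall_score, f1_score)

y_true = [1, 0, 1, 1, 0, 0, 1, 0]   # 1 = depressed, 0 = not depressed
y_pred = [1, 0, 1, 0, 0, 1, 1, 0]   # hypothetical classifier output

print("Accuracy: ", accuracy_score(y_true, y_pred))   # 0.75
print("Precision:", precision_score(y_true, y_pred))  # 0.75
print("Recall:   ", recall_score(y_true, y_pred))     # 0.75
print("F1 score: ", f1_score(y_true, y_pred))         # 0.75
```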
This project uses the following datasets:
- `depression_dataset_reddit_cleaned.csv`: Original dataset retrieved from Kaggle.
- `merged_tensors_with_labels.csv`: Data encoded using BERT.
This project is built upon the work of researchers and developers in the fields of NLP and deep learning, particularly those who developed BERT and autoencoders.
- Anuraag Raj
- Zain Ali