This repository provides the code and datasets used in the thesis: Training Daily Activity Classifiers for Healthcare with Semantic Similarity.
codes: the code for running the models and generating prediction results.
datasets: the data used for building the BOW systems and training the BERT-based systems, along with the predictions of each system in the project.
Prompt engineering: the pipeline that uses GPT-3.5-turbo to generate conversations and label the utterances in those conversations.