This repository provides the code and datasets used in the thesis "Training Daily Activity Classifiers for Healthcare with Semantic Similarity".
- codes: the code for running the models and generating the prediction results.
- datasets: the data used for building the BOW systems and training the BERT-based systems, together with the predictions of each system in the project.
- Prompt engineering: the pipeline that uses GPT-3.5-turbo to generate conversations and label the utterances in those conversations.
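
For readers unfamiliar with the idea behind a BOW (bag-of-words) similarity system, the following is a minimal sketch and not the thesis implementation. It assumes whitespace tokenization and cosine similarity over raw token counts; the names `bow`, `cosine`, `classify`, and `ACTIVITIES`, as well as the activity labels and their descriptions, are hypothetical and chosen only for illustration.

```python
import math
from collections import Counter


def bow(text):
    """Build a bag-of-words vector: lowercase, split on whitespace, count tokens."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two token-count vectors (0.0 if either is empty)."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    norm = norm_a * norm_b
    return dot / norm if norm else 0.0


# Hypothetical activity labels with made-up descriptions (not from the datasets).
ACTIVITIES = {
    "eating": "eat a meal breakfast lunch dinner food",
    "sleeping": "sleep nap go to bed rest at night",
    "bathing": "take a shower bath wash body",
}


def classify(utterance, activities=ACTIVITIES):
    """Return the label whose description is most similar to the utterance."""
    vec = bow(utterance)
    return max(activities, key=lambda label: cosine(vec, bow(activities[label])))


if __name__ == "__main__":
    print(classify("I had lunch and some food"))
    print(classify("I took a nap in bed"))
```

Raw token counts keep the sketch dependency-free; the actual systems in `codes` may use different tokenization, weighting, or label descriptions.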