This dataset consists of Covid-19-related tweets posted by users in six English-speaking countries: Australia, Canada, Ireland, New Zealand, the United Kingdom, and the United States. The goal is to build an NLP model that classifies each tweet by country of origin and to assign topics to the tweets using Latent Dirichlet Allocation (LDA).
Model Implemented: Ensemble of a CNN and a Naïve Bayes model, with 50% accuracy. This improves on Logistic Regression (45.2% accuracy), Linear SVC (48.7% accuracy), and Multinomial Naïve Bayes (49.4% accuracy).
Deliverables: Report, Topic Visualization Dashboard & Jupyter Notebook
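
As a rough illustration of how a CNN and a Naïve Bayes model might be combined, the sketch below averages the per-country probabilities of two models (soft voting). This is an assumption: the summary does not say how the ensemble merges its members. `TinyMultinomialNB`, `soft_vote`, the toy tweets, and the hard-coded `cnn_probs` values are all hypothetical and stand in for the real models and data.

```python
import math
from collections import Counter, defaultdict


class TinyMultinomialNB:
    """Minimal multinomial Naive Bayes over whitespace tokens (illustrative only)."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha  # Laplace smoothing constant

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        for text, label in zip(texts, labels):
            self.word_counts[label].update(text.lower().split())
        self.vocab = {w for counts in self.word_counts.values() for w in counts}
        return self

    def predict_proba(self, text):
        total_docs = sum(self.class_counts.values())
        log_scores = {}
        for label, n_docs in self.class_counts.items():
            denom = sum(self.word_counts[label].values()) + self.alpha * len(self.vocab)
            score = math.log(n_docs / total_docs)  # class prior
            for word in text.lower().split():
                if word in self.vocab:  # ignore words never seen in training
                    score += math.log((self.word_counts[label][word] + self.alpha) / denom)
            log_scores[label] = score
        # Normalise log scores into probabilities in a numerically stable way.
        top = max(log_scores.values())
        exp_scores = {label: math.exp(s - top) for label, s in log_scores.items()}
        norm = sum(exp_scores.values())
        return {label: v / norm for label, v in exp_scores.items()}


def soft_vote(probs_a, probs_b, weight_a=0.5):
    """Weighted average of two per-class probability dicts; returns (label, combined)."""
    labels = set(probs_a) | set(probs_b)
    combined = {
        label: weight_a * probs_a.get(label, 0.0)
        + (1.0 - weight_a) * probs_b.get(label, 0.0)
        for label in labels
    }
    # Sort first so ties are broken deterministically by label name.
    best = max(sorted(combined), key=combined.get)
    return best, combined


# Toy data, invented for illustration; not taken from the actual dataset.
nb = TinyMultinomialNB().fit(
    ["lockdown in sydney", "nhs lockdown london"],
    ["Australia", "United Kingdom"],
)
nb_probs = nb.predict_proba("nhs london")

# Placeholder for the CNN's softmax output on the same tweet (hypothetical values).
cnn_probs = {"Australia": 0.4, "United Kingdom": 0.6}

label, combined = soft_vote(cnn_probs, nb_probs)
print(label, combined)
```

Averaging probabilities rather than hard labels lets a confident model outweigh an uncertain one; `weight_a` could be tuned on a validation split if one member is consistently stronger.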