NLP model to classify Tweets by country and allocate topics with Latent Dirichlet Allocation techniques.

mnovovil/CountryClassificationOfCovidTweetsNLP

COVID-19 Tweets Country of Origin Classification - NLP Project

This dataset consists of COVID-19-related tweets posted by users in six English-speaking countries: Australia, Canada, Ireland, New Zealand, the United Kingdom, and the United States. The goal is to build an NLP model that classifies each tweet by country of origin and extracts topics with Latent Dirichlet Allocation (LDA).
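To give a feel for the topic-allocation step, here is a minimal sketch of LDA fitted with collapsed Gibbs sampling in plain Python. It is not the notebook's implementation: the function names `lda_gibbs` and `top_words`, the hyperparameters `alpha` and `beta`, and the four tokenized toy documents are all assumptions made for illustration.

```python
import random

def lda_gibbs(docs, num_topics, iters=200, alpha=0.1, beta=0.01, seed=0):
    """Fit LDA on tokenized docs with collapsed Gibbs sampling (illustrative sketch)."""
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    word_id = {w: i for i, w in enumerate(vocab)}
    vocab_size = len(vocab)

    doc_topic = [[0] * num_topics for _ in docs]                # tokens of doc d in topic k
    topic_word = [[0] * vocab_size for _ in range(num_topics)]  # count of word w in topic k
    topic_total = [0] * num_topics                              # total tokens in topic k
    assignments = []

    # Start from a random topic assignment for every token.
    for d, doc in enumerate(docs):
        z_d = []
        for w in doc:
            k = rng.randrange(num_topics)
            z_d.append(k)
            doc_topic[d][k] += 1
            topic_word[k][word_id[w]] += 1
            topic_total[k] += 1
        assignments.append(z_d)

    for _ in range(iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                wid = word_id[w]
                # Remove the token's current assignment from the counts.
                k = assignments[d][i]
                doc_topic[d][k] -= 1
                topic_word[k][wid] -= 1
                topic_total[k] -= 1
                # Resample its topic from the conditional distribution.
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][wid] + beta)
                    / (topic_total[t] + vocab_size * beta)
                    for t in range(num_topics)
                ]
                k = rng.choices(range(num_topics), weights=weights)[0]
                assignments[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][wid] += 1
                topic_total[k] += 1

    return vocab, doc_topic, topic_word

def top_words(vocab, topic_word, n=3):
    """Return the n most frequent words for each topic."""
    return [
        [vocab[i] for i in sorted(range(len(vocab)), key=lambda i: -row[i])[:n]]
        for row in topic_word
    ]

# Invented, pre-tokenized toy "tweets" (not taken from the dataset).
docs = [
    ["lockdown", "melbourne", "mask", "lockdown"],
    ["vaccine", "nhs", "vaccine", "booster"],
    ["mask", "lockdown", "melbourne"],
    ["nhs", "booster", "vaccine"],
]
vocab, doc_topic, topic_word = lda_gibbs(docs, num_topics=2)
for k, words in enumerate(top_words(vocab, topic_word)):
    print(f"topic {k}: {words}")
```

On real tweets, a library implementation with proper tokenization and stop-word removal would normally be preferable; the sketch only shows the counting and resampling logic behind the technique.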

Model Implemented: Ensemble Methods (CNN & Naive Bayes Model) with 50% accuracy. This improves on Logistic Regression (45.2% accuracy), Linear SVC (48.7% accuracy), and Multinomial Naive Bayes (49.4% accuracy).
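The README does not say how the CNN and Naive Bayes outputs are combined. One common option is soft voting, which averages the per-class probabilities of the models and picks the highest. The sketch below assumes that approach; `soft_vote`, the class ordering in `COUNTRIES`, and the probability vectors are hypothetical values chosen for illustration, not results from the project.

```python
# Assumed class order; the project's actual label encoding may differ.
COUNTRIES = [
    "Australia", "Canada", "Ireland",
    "New Zealand", "United Kingdom", "United States",
]

def soft_vote(prob_sets, weights=None):
    """Average per-class probabilities from several models.

    Returns (index of the winning class, averaged probabilities).
    """
    if weights is None:
        weights = [1.0] * len(prob_sets)
    total = sum(weights)
    n_classes = len(prob_sets[0])
    avg = [
        sum(w * probs[c] for w, probs in zip(weights, prob_sets)) / total
        for c in range(n_classes)
    ]
    best = max(range(n_classes), key=avg.__getitem__)
    return best, avg

# Made-up probabilities for a single tweet from two hypothetical models.
cnn_probs = [0.10, 0.30, 0.05, 0.05, 0.35, 0.15]
nb_probs = [0.05, 0.50, 0.05, 0.05, 0.20, 0.15]

label, avg = soft_vote([cnn_probs, nb_probs])
print(COUNTRIES[label])  # Canada
```

In this toy case the CNN alone would pick the United Kingdom, while the averaged vote picks Canada, which is the kind of disagreement an ensemble is meant to resolve. Passing `weights` lets one model count for more than the other.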

Deliverables: Report, Topic Visualization Dashboard & Jupyter Notebook

Dashboard: [image]
