Binary classification experiments for the Twitter dataset
| Notebook | Link to jupyter nbviewer | Link to Colab |
| --- | --- | --- |
| BiRNN_LSTM_GRU-BestModel.ipynb | | |
| BiRNN_LSTM_GRU-Experiments.ipynb | | |
| FeedForwardNN_GloVe.ipynb | | |
| FeedForwardNN_TfiDf.ipynb | | |
| LogisticRegression.ipynb | | |
Developed a sentiment classifier using logistic regression for the Twitter sentiment classification dataset, again with the Scikit-Learn toolkit.
TF-IDF vectorization of the tweets; no pretrained vectors
Model metrics for evaluation: F1 score, recall, and precision
Visualization: confusion matrices
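A minimal sketch of this pipeline with Scikit-Learn; the tiny in-line dataset and variable names are illustrative placeholders, not the notebook's actual data handling:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import precision_recall_fscore_support, confusion_matrix

# Placeholder tweets and labels, just to make the sketch runnable
train_texts, train_labels = ["good vibes today", "worst day ever"], [1, 0]
test_texts, test_labels = ["pretty good day"], [1]

vectorizer = TfidfVectorizer()                    # plain TF-IDF, no pretrained vectors
X_train = vectorizer.fit_transform(train_texts)
X_test = vectorizer.transform(test_texts)

clf = LogisticRegression(max_iter=1000)
clf.fit(X_train, train_labels)
preds = clf.predict(X_test)

# Precision, recall, and F1 for the binary task, plus the confusion matrix
precision, recall, f1, _ = precision_recall_fscore_support(
    test_labels, preds, average="binary"
)
print(f"P={precision:.3f}  R={recall:.3f}  F1={f1:.3f}")
print(confusion_matrix(test_labels, preds))
```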
Developed two sentiment classifiers using feed-forward neural networks (PyTorch) for the Twitter sentiment analysis dataset.
Experimented with:
- the number of hidden layers, and the number of their units
- the activation functions (only the ones presented in the lectures)
- the loss function
- the optimizer, etc.
One model uses TF-IDF vectorization of the tweets (no pretrained vectors); the other uses GloVe (Stanford's pre-trained word embeddings).
Model metrics for evaluation: F1 score, recall, and precision
Visualization: ROC curves, loss vs. epochs, accuracy vs. epochs, and confusion matrices
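A minimal PyTorch sketch of such a feed-forward classifier; the layer sizes, the choice of activation, and the 300-d mean-pooled GloVe input are illustrative assumptions, not necessarily the notebooks' exact settings:

```python
import torch
import torch.nn as nn

class FeedForwardClassifier(nn.Module):
    """Feed-forward net over a fixed-size tweet vector
    (a TF-IDF row or a mean-pooled GloVe embedding)."""
    def __init__(self, input_dim, hidden_dims=(128, 64), activation=nn.ReLU):
        super().__init__()
        layers, prev = [], input_dim
        for h in hidden_dims:                 # experiment: number/size of hidden layers
            layers += [nn.Linear(prev, h), activation()]
            prev = h
        layers.append(nn.Linear(prev, 1))     # single logit for binary classification
        self.net = nn.Sequential(*layers)

    def forward(self, x):
        return self.net(x).squeeze(-1)

# Hypothetical usage: 300-d mean-pooled GloVe tweet vectors as input
model = FeedForwardClassifier(input_dim=300, hidden_dims=(256, 64), activation=nn.Tanh)
x = torch.randn(4, 300)                       # a batch of 4 placeholder tweet vectors
logits = model(x)                             # shape: (4,)
```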
Developed bidirectional RNN sentiment classifiers with LSTM and GRU cells (PyTorch) for the same dataset. Experimented with:
- the number of stacked RNNs,
- the number of hidden layers,
- the type of cells (LSTM vs. GRU),
- skip connections,
- gradient clipping and
- dropout probability
Used the Adam optimizer with the binary cross-entropy loss function, transforming the predicted logits into probabilities with a sigmoid function.
Pre-trained word embeddings (GloVe) were used to initialize the models' embedding layers.
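A condensed sketch of this architecture and its training step; `glove_weights` and all hyperparameter values are placeholders, and `nn.BCEWithLogitsLoss` folds the sigmoid into the binary cross-entropy as described above:

```python
import torch
import torch.nn as nn

glove_weights = torch.randn(20000, 300)      # placeholder for the real GloVe matrix

class BiRNNClassifier(nn.Module):
    def __init__(self, glove_weights, hidden_dim=128, num_layers=2,
                 cell=nn.LSTM, dropout=0.3):
        super().__init__()
        # initialize the embedding layer from the pre-trained GloVe vectors
        self.embedding = nn.Embedding.from_pretrained(glove_weights, freeze=False)
        self.rnn = cell(glove_weights.size(1), hidden_dim,
                        num_layers=num_layers,        # experiment: stacked RNNs
                        dropout=dropout,              # experiment: dropout probability
                        bidirectional=True, batch_first=True)
        self.fc = nn.Linear(2 * hidden_dim, 1)        # one logit per tweet

    def forward(self, token_ids):
        out, _ = self.rnn(self.embedding(token_ids))
        return self.fc(out[:, -1, :]).squeeze(-1)     # logit from the last time step

model = BiRNNClassifier(glove_weights, cell=nn.GRU)   # swap nn.LSTM/nn.GRU cells
optimizer = torch.optim.Adam(model.parameters())
criterion = nn.BCEWithLogitsLoss()                    # sigmoid + binary cross-entropy

def train_step(batch_ids, batch_labels):
    optimizer.zero_grad()
    loss = criterion(model(batch_ids), batch_labels.float())
    loss.backward()
    torch.nn.utils.clip_grad_norm_(model.parameters(), 1.0)  # gradient clipping
    optimizer.step()
    return loss.item()

# Placeholder batch: 8 tweets of 25 token ids each
train_step(torch.randint(0, 20000, (8, 25)), torch.randint(0, 2, (8,)))
```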
Model metrics for evaluation: F1 score, recall, and precision
Visualization: ROC curves, loss vs. epochs, accuracy vs. epochs, and confusion matrices
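For instance, a ROC curve can be plotted from the predicted probabilities with scikit-learn and matplotlib; `y_true` and `y_prob` are assumed placeholder arrays:

```python
import matplotlib.pyplot as plt
from sklearn.metrics import roc_curve, auc

y_true = [0, 0, 1, 1]             # placeholder gold labels
y_prob = [0.1, 0.4, 0.35, 0.8]    # placeholder sigmoid outputs

fpr, tpr, _ = roc_curve(y_true, y_prob)
plt.plot(fpr, tpr, label=f"ROC (AUC = {auc(fpr, tpr):.3f})")
plt.plot([0, 1], [0, 1], linestyle="--")   # chance diagonal
plt.xlabel("False positive rate")
plt.ylabel("True positive rate")
plt.legend()
plt.show()
```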
© Konstantinos Nikoletos | 2020 - 2021