HamHD

HamHD: Spam Text Detection using Hyperdimensional Computing

Instructions

The SMS ham/spam dataset can be obtained from Kaggle at: https://www.kaggle.com/uciml/sms-spam-collection-dataset
The YouTube comment ham/spam dataset can be obtained from Kaggle at: https://www.kaggle.com/lakshmi25npathi/images
- You can train and test on individual subsets, or concatenate all the CSV files to build one large dataset.
In the Python script, change the f_ variable to the path of the dataset CSV you downloaded.
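If you want to build the one large dataset mentioned above, the following is a minimal sketch using pandas. It assumes that all the subset CSV files share the same columns and are UTF-8 encoded. The function name combine_csvs and the file names in the usage comment are hypothetical and not part of the repository; adjust the encoding argument if your download uses a different one.

```python
import pandas as pd


def combine_csvs(csv_paths, out_path, encoding="utf-8"):
    """Stack several CSV files with identical columns and save the result.

    Assumption: every file in csv_paths has the same header row.
    """
    # Read each subset into its own DataFrame.
    frames = [pd.read_csv(path, encoding=encoding) for path in csv_paths]
    # Concatenate row-wise and renumber the index from 0.
    combined = pd.concat(frames, ignore_index=True)
    # Write one CSV that a single path variable can point to.
    combined.to_csv(out_path, index=False, encoding=encoding)
    return combined


# Hypothetical usage (replace with the files you actually downloaded):
# combine_csvs(["subset_a.csv", "subset_b.csv"], "all_comments.csv")
```

After writing the combined file, f_ can be set to its path in the same way as for a single downloaded CSV.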

This application runs under Python 3. Please have the latest Python version installed.
This application requires the following packages: pandas, numpy, scikit-learn, and matplotlib (only needed if you would like to output figures). Packages can be installed via:

pip install pandas numpy scikit-learn matplotlib

The application can be run by directly executing the Python script, e.g. "python3 HamHD-text.py".
For details about the script (parameters, encoding schemes, etc.), please check the comments inside HamHD-text.py.
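For readers new to hyperdimensional computing, the sketch below illustrates one common text encoding scheme: character n-grams built from random bipolar hypervectors, bundled into one prototype per class and compared by cosine similarity. This is a generic illustration and not necessarily what HamHD-text.py does. The dimensionality, the n-gram size, and all function names here are assumptions chosen for the example.

```python
import numpy as np

DIM = 10_000  # hypervector dimensionality (assumed value)
rng = np.random.default_rng(0)
item_memory = {}  # maps each character to its random hypervector


def char_hv(ch):
    """Return the random bipolar (+1/-1) hypervector for a character."""
    if ch not in item_memory:
        item_memory[ch] = rng.choice([-1, 1], size=DIM)
    return item_memory[ch]


def encode(text, n=3):
    """Encode text as the sum (bundle) of its character n-gram hypervectors."""
    text = text.lower()
    total = np.zeros(DIM)
    for i in range(len(text) - n + 1):
        gram = np.ones(DIM)
        for j, ch in enumerate(text[i:i + n]):
            # Permute (rotate) by position, then bind by elementwise product.
            gram *= np.roll(char_hv(ch), n - 1 - j)
        total += gram
    return total


def train(texts, labels):
    """Build one prototype hypervector per class by bundling its examples."""
    prototypes = {}
    for text, label in zip(texts, labels):
        prototypes[label] = prototypes.get(label, np.zeros(DIM)) + encode(text)
    return prototypes


def cosine(a, b):
    """Cosine similarity, defined as 0.0 when either vector is all zeros."""
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom else 0.0


def predict(prototypes, text):
    """Return the label whose prototype is most similar to the encoded text."""
    query = encode(text)
    return max(prototypes, key=lambda label: cosine(prototypes[label], query))
```

A call such as train(messages, labels) followed by predict(prototypes, new_message) would classify a new message; the actual parameters and encoding used by this repository are documented in the script comments.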
