rahulbshrestha/hash-dataset

hash-dataset

Implementing a hashing technique to compare large-scale, out-of-core machine learning datasets

Instructions

To compare two datasets:

python3 src/hash-dataset.py data/small-dataset-1 data/small-dataset-2
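
The hashing scheme itself is not described here. As a minimal sketch of how such a comparison could work, assuming each dataset is a directory containing one file per sample (an assumption, not confirmed by this repository), every file can be hashed in fixed-size chunks so that no sample has to fit in memory, and the two sets of digests can then be compared. The names `hash_file`, `hash_dataset`, and `compare_datasets` are hypothetical, and SHA-256 is chosen only for illustration.

```python
import hashlib
from pathlib import Path


def hash_file(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # Read chunk by chunk so large samples never need to fit in memory.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def hash_dataset(root):
    """Map each content digest to the relative paths of samples with that content.

    Assumption: a dataset is a directory tree with one file per sample.
    """
    root = Path(root)
    digests = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            rel = str(path.relative_to(root))
            digests.setdefault(hash_file(path), []).append(rel)
    return digests


def compare_datasets(root_a, root_b):
    """Return (matching, only_in_a, only_in_b) as sorted lists of digests."""
    a = hash_dataset(root_a)
    b = hash_dataset(root_b)
    matching = sorted(a.keys() & b.keys())
    only_in_a = sorted(a.keys() - b.keys())
    only_in_b = sorted(b.keys() - a.keys())
    return matching, only_in_a, only_in_b
```

Because only the digests are held in memory, the working set grows with the number of samples rather than with their total size. Samples are matched by content, so a renamed file with identical bytes still counts as a match.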

Available options:

python3 src/hash-dataset.py [OPTIONS] [FILES]

-m : Display the samples that match
-n : Display the samples that do not match
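
For illustration, the two flags could be parsed with Python's standard `argparse` module. This is only a sketch and not the repository's actual code: `build_parser`, the attribute names `show_matching` and `show_non_matching`, and the choice of exactly two positional arguments are all assumptions.

```python
import argparse


def build_parser():
    """Build a hypothetical parser for the -m and -n options."""
    parser = argparse.ArgumentParser(
        description="Compare two datasets by hashing their samples."
    )
    parser.add_argument(
        "-m",
        dest="show_matching",
        action="store_true",
        help="display the samples that match",
    )
    parser.add_argument(
        "-n",
        dest="show_non_matching",
        action="store_true",
        help="display the samples that do not match",
    )
    # Assumption: exactly two datasets are compared per invocation.
    parser.add_argument(
        "files",
        nargs=2,
        metavar="FILES",
        help="the two datasets to compare",
    )
    return parser
```

Calling `build_parser().parse_args(["-m", "data/small-dataset-1", "data/small-dataset-2"])` would set `show_matching` to `True` and leave `show_non_matching` at `False`.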
