Author: UTKARSH KUMAR

Please refer to "requirements.txt" for the libraries required to run the code.
Run "corpusProcess.py" first to generate corpus files.
The default corpus is "wiki_56", but a different corpus or a list of corpora can be given as command-line arguments when running "corpusProcess.py".
To test queries, please provide your query in "query.txt" and run "test_queries.py".
By default, "test_queries.py" uses the files generated by "corpusProcess.py".
"test_queries.py" also accepts command-line arguments, with the file names given in this order:
1. Query file
2. Index file
3. Bigram_Index file
4. Document IDs file
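
The argument order above can be sketched as follows. This is only an illustration of overlaying positional arguments on defaults, not the actual code in "test_queries.py". Apart from "query.txt", the default file names are hypothetical placeholders (the real ones are whatever "corpusProcess.py" writes), and accepting fewer than four arguments is an assumption.

```python
import sys

# Order matches the list above: query, index, bigram index, document IDs.
# Only "query.txt" comes from this README; the other names are hypothetical.
DEFAULTS = ["query.txt", "index.pkl", "bigram_index.pkl", "doc_ids.pkl"]


def resolve_paths(argv, defaults=DEFAULTS):
    """Overlay positional arguments onto the defaults, keeping the documented order."""
    given = list(argv[:len(defaults)])          # ignore any extra arguments
    return given + list(defaults[len(given):])  # fill the rest from the defaults


if __name__ == "__main__":
    query_file, index_file, bigram_file, doc_ids_file = resolve_paths(sys.argv[1:])
    print(query_file, index_file, bigram_file, doc_ids_file)
```

With no arguments this sketch falls back to every default. With one argument, only the query file is overridden.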
Please allow 5-10 minutes for each script to preprocess the data, perform file I/O, and construct the required data structures.

If you want to experiment with how the model performs on other corpora, some corpus files are available here: https://drive.google.com/drive/folders/1ZsnuEm7_N6aUwhjFpv-TZXFt4DiYex4t?usp=sharing
