Author: UTKARSH KUMAR


  1. See "requirements.txt" for the libraries required to run the code.

  2. Run "corpusProcess.py" first to generate corpus files.

  3. The default corpus is "wiki_56", but a different corpus or a list of corpora can be given as command line arguments when running "corpusProcess.py".

  4. To test queries, please provide your query in "query.txt" and run "test_queries.py".

  5. By default, "test_queries.py" uses the files generated by "corpusProcess.py".

  6. "test_queries.py" also accepts command line arguments, with the file names given in this order:

    1. Query file
    2. Index file
    3. Bigram_Index file
    4. Document IDs file
  7. Please allow 5-10 minutes for each script to preprocess the data, perform file I/O, and construct the required data structures.
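
The argument order in step 6 can be illustrated with a minimal sketch of how positional file names might be overlaid on defaults. This is not the code in "test_queries.py": the function name `resolve_paths` and every default file name other than "query.txt" are hypothetical, and whether the real script accepts a partial list of arguments is an assumption.

```python
# Hypothetical defaults. Only "query.txt" is named in this readme; the other
# three names are placeholders for whatever "corpusProcess.py" writes.
DEFAULT_PATHS = ["query.txt", "index.pkl", "bigram_index.pkl", "doc_ids.pkl"]


def resolve_paths(args, defaults=DEFAULT_PATHS):
    """Overlay positional arguments on the defaults.

    The order follows step 6: query file, index file,
    bigram index file, document IDs file.
    """
    if len(args) > len(defaults):
        raise ValueError("expected at most %d file names" % len(defaults))
    # Arguments that were supplied replace the leading defaults;
    # the remaining positions keep their default values.
    return list(args) + list(defaults[len(args):])
```

Under these assumptions, a script would call `resolve_paths(sys.argv[1:])` and unpack the four returned names in the order listed in step 6.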


If you want to experiment with how the model performs on other corpora, some corpus files are available here: https://drive.google.com/drive/folders/1ZsnuEm7_N6aUwhjFpv-TZXFt4DiYex4t?usp=sharing