Big question: BERT ranking needs triples to train, but... #7

Closed
guotong1988 opened this issue Aug 7, 2019 · 4 comments

guotong1988 (Contributor) commented Aug 7, 2019

https://msmarco.blob.core.windows.net/msmarcoranking/collectionandqueries.tar.gz
The data above does not contain the negative documents.

@rodrigonogueira4 Thank you!!!

The data is from the readme here: https://github.com/nyu-dl/dl4ir-doc2query#ms-marco

rodrigonogueira4 (Collaborator) commented

To train the seq2seq model you only need pairs of queries and relevant documents, so you don't need negatives.
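
For illustration, here is a minimal sketch of building those pairs. It is a hypothetical example, not this repo's actual preprocessing; the file names and column layouts are assumed from the MS MARCO release linked above (collection.tsv: pid<TAB>passage, queries.train.tsv: qid<TAB>query, qrels.train.tsv: qid 0 pid 1), so verify them against your download:

```python
# Hypothetical sketch: build (passage, query) pairs for seq2seq training.
# File names and formats are assumed from the MS MARCO passage-ranking release.

def load_tsv(path):
    """Map id -> text for a two-column TSV file."""
    mapping = {}
    with open(path, encoding="utf-8") as f:
        for line in f:
            key, text = line.rstrip("\n").split("\t", 1)
            mapping[key] = text
    return mapping

queries = load_tsv("queries.train.tsv")  # qid -> query text
passages = load_tsv("collection.tsv")    # pid -> passage text

# Each qrels row links a query to one relevant passage; that pair alone is
# a seq2seq training example (input = passage, target = query). No negatives.
with open("qrels.train.tsv", encoding="utf-8") as qrels, \
        open("doc_query_pairs.train.tsv", "w", encoding="utf-8") as out:
    for line in qrels:
        qid, _, pid, _ = line.split()
        out.write(f"{passages[pid]}\t{queries[qid]}\n")
```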

Please post doc2query-related questions in that repository:
https://github.com/nyu-dl/dl4ir-doc2query

guotong1988 (Author) commented

But the BERT ranking step needs negative documents.

rodrigonogueira4 (Collaborator) commented

Please note that this repository only contains the code to train the doc2query (seq2seq) model. If you want to train the BERT re-ranker, please follow the steps in https://github.com/nyu-dl/dl4marco-bert.

Also, please note that training the BERT re-ranker on the expanded documents did not give better results than training on the non-expanded (original) documents. That is, you can use the trained model from https://github.com/nyu-dl/dl4marco-bert to re-rank the expanded documents.
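
As background on where the negatives come from: the MS MARCO ranking release also provides pre-built training triples (query, relevant passage, non-relevant passage), so you typically do not need to build them yourself. Purely as a format illustration, here is a hypothetical sketch of constructing triples by sampling random negatives; file names and formats are assumed as in the pair-building sketch above:

```python
# Hypothetical sketch: build (query, positive, negative) triples for
# re-ranker training by pairing each relevant passage with a randomly
# sampled one. Assumed MS MARCO TSV formats, as in the sketch above.
import random

def load_tsv(path):
    """Map id -> text for a two-column TSV file."""
    mapping = {}
    with open(path, encoding="utf-8") as f:
        for line in f:
            key, text = line.rstrip("\n").split("\t", 1)
            mapping[key] = text
    return mapping

queries = load_tsv("queries.train.tsv")  # qid -> query text
passages = load_tsv("collection.tsv")    # pid -> passage text
pids = list(passages)

with open("qrels.train.tsv", encoding="utf-8") as qrels, \
        open("triples.train.tsv", "w", encoding="utf-8") as out:
    for line in qrels:
        qid, _, pos_pid, _ = line.split()
        neg_pid = random.choice(pids)
        while neg_pid == pos_pid:  # never use the positive as its own negative
            neg_pid = random.choice(pids)
        out.write(f"{queries[qid]}\t{passages[pos_pid]}\t{passages[neg_pid]}\n")
```

Random negatives like these are much easier for the model than the negatives typically used in practice (e.g., non-relevant passages retrieved by BM25), so treat this only as an illustration of the triple format.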

guotong1988 (Author) commented

Thank you.
