[QUESTION] What are the retrieval datasets when evaluating downstream tasks? #1172
Unanswered
ZihaoLin0123 asked this question in Q&A
Replies: 1 comment
Marking as stale. No activity in 60 days.
I read the InstructRetro paper and it said
Does this mean that for each QA dataset, you use only that dataset's own corpus as the retrieval data? For example, when you evaluate NewsQA, do you first collect all the NewsQA articles (ignoring the corpora of other tasks, such as SQuAD and Wikipedia), chunk them, and embed them? Then, given a question, the retriever retrieves the related article chunks from that corpus. Finally, the retrieved chunks are concatenated with the input and fed into the LLM. Is that correct?
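
To make the pipeline described above concrete, here is a minimal sketch of a per-dataset retrieval setup. It is only an illustration of the understanding stated in the question, not the InstructRetro authors' confirmed method. Everything in it is an assumption: the function names (`chunk`, `embed`, `build_index`, `retrieve`, `build_prompt`), the toy bag-of-words embedding that stands in for a real dense retriever, the chunk size, and the prompt layout.

```python
import math
import re
from collections import Counter


def chunk(text, size=8):
    """Split an article into fixed-size word chunks (size is an assumed value)."""
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]


def embed(text):
    """Toy bag-of-words 'embedding'; a stand-in for a real dense encoder."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def build_index(articles, size=8):
    """Chunk and embed ONLY the articles of the dataset being evaluated."""
    chunks = [c for article in articles for c in chunk(article, size)]
    return [(c, embed(c)) for c in chunks]


def retrieve(index, question, k=2):
    """Return the k chunks most similar to the question."""
    q = embed(question)
    ranked = sorted(index, key=lambda item: cosine(q, item[1]), reverse=True)
    return [c for c, _ in ranked[:k]]


def build_prompt(question, retrieved):
    """Concatenate retrieved chunks with the question (assumed prompt layout)."""
    context = "\n".join(retrieved)
    return f"{context}\n\nQuestion: {question}\nAnswer:"


# Hypothetical stand-in for one dataset's article collection (e.g. NewsQA);
# no other task's corpus is added to the index.
dataset_articles = [
    "The city council approved a new bridge over the river on Monday after a long debate.",
    "A local bakery won the national award for best sourdough bread this year.",
]
index = build_index(dataset_articles)
question = "Which bakery won the national award?"
prompt = build_prompt(question, retrieve(index, question, k=1))
```

The resulting `prompt` string is what would be passed to the LLM. Under the alternative reading, where all tasks share one retrieval corpus, the only change would be that `build_index` receives the union of every dataset's articles.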