I am encountering limitations with RAGLite when evaluating using RAGAS. The main issue is that the RAG pipeline configuration differs between inference and evaluation, which impacts consistency and reproducibility.
The differences include:
1. System prompt
2. Instruction prompt (where the context is added to a user prompt)
3. Number of chunks to retrieve
4. Number of chunk spans to retrieve
Questions
Does it make sense to include these differences as part of a unified configuration to ensure alignment?
Or should we explore an alternative mechanism to address this inconsistency?
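To make the inconsistency concrete, here is a minimal sketch (with hypothetical parameter values, not RAGLite's actual defaults) of how inference and evaluation can silently diverge when each code path hardcodes its own settings:

```python
# Hypothetical defaults that the inference and evaluation paths
# might each hardcode independently (illustrative values only).
INFERENCE_PARAMS = {"system_prompt": None, "num_chunks": 5, "num_chunk_spans": 3}
EVALUATION_PARAMS = {"system_prompt": None, "num_chunks": 10, "num_chunk_spans": 5}

# Any key that differs makes the evaluation scores unrepresentative
# of what users actually experience at inference time.
diverging = sorted(
    key for key in INFERENCE_PARAMS if INFERENCE_PARAMS[key] != EVALUATION_PARAMS[key]
)
print(diverging)
```

A unified configuration would make `diverging` empty by construction, since both paths would read from the same object.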
For my understanding, are you concerned about a discrepancy between insert_evals and answer_evals (i.e., a difference between eval generation and inference), or between answer_evals and evaluate/Ragas (i.e., a difference between inference and evaluation of the inferences)?
Yes, and I am working on a PR to add all the shared parameters to RAGLiteConfig. It is still a WIP; I'll wait for your comments once it's ready.
These are the parameters I am adding:
```python
search_method: Literal["hybrid", "vector", "keyword"] = "hybrid"
system_prompt: str | None = None
rag_instruction_template: str | None = RAG_INSTRUCTION_TEMPLATE
num_chunks: int = 5
chunk_neighbors: tuple[int, ...] | None = (-1, 1)  # Neighbors to include in the context.
```