[Tech Question] Query biasing problem #502
Replies: 3 comments 1 reply
-
Hi Gangxin, please correct me if my understanding is wrong, but it seems like you want Khoj to be able to respond to some questions without referencing your personal knowledge base? If so, the two techniques Khoj currently uses to answer general questions are:
-
Thank you for your response!
I want to understand how Khoj distinguishes whether a question is related to my personal knowledge base or not. In more detail, we use keyword-based methods or TF-IDF to determine whether new questions are relevant to our knowledge, and against this backdrop I've built a binary classification system to detect whether a new question is related or not. But I am not sure whether my binary classifier can beat the LLM's own recognition. Does the LLM already handle this well? I'm also unsure how to specify the fields that the LLM identifies. I have checked several materials on this, but found nothing useful.
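For reference, the relevance check I have been experimenting with looks roughly like this (just a sketch with scikit-learn, not Khoj's code; the example documents and the 0.2 threshold are illustrative placeholders):

```python
# Minimal sketch of a keyword / TF-IDF relevance check against the knowledge base.
# The documents and threshold below are placeholders, not real settings.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

knowledge_docs = [
    "Notes on the French Revolution and its causes.",
    "Timeline of the Roman Empire.",
]

vectorizer = TfidfVectorizer(stop_words="english")
doc_matrix = vectorizer.fit_transform(knowledge_docs)

def is_related(question: str, threshold: float = 0.2) -> bool:
    """Return True if the question looks related to the personal knowledge base."""
    query_vec = vectorizer.transform([question])
    best_score = cosine_similarity(query_vec, doc_matrix).max()
    return best_score >= threshold

print(is_related("What caused the French Revolution?"))   # likely True
print(is_related("How do I write a for loop in Python?"))  # likely False
```

My question is whether a check like this adds anything on top of what the LLM already does.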
-
Very tempted to say that having a bigger corpus won't hurt.
-
Hi AI leaders,
I am wondering how you address the issue of query bias.
From what I understand, we build word embeddings of our documents so the system can acquire new knowledge, and when we submit a query it is fed directly into that embedding search. How do we determine the contents of the fields? And how do you enhance the accuracy of the results?
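To make sure I'm describing this correctly, here is roughly what I mean by the query being fed straight into the embedding search (only a sketch with sentence-transformers; the model name, notes, and threshold are placeholders I chose, not Khoj's actual configuration):

```python
# Sketch only: model name and the 0.3 threshold are my own placeholders.
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")
notes = ["My notes on the French Revolution", "My notes on the Roman Empire"]
note_embeddings = model.encode(notes, convert_to_tensor=True)

query = "Tell me a joke"  # a general question, unrelated to the notes
query_embedding = model.encode(query, convert_to_tensor=True)
scores = util.cos_sim(query_embedding, note_embeddings)[0]

# Even the best-matching note scores low, so the query is probably general,
# but without a routing step the retrieved note would still be injected.
print(scores.max().item())
```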
For instance, if I feed "history" knowledge into the model but only want to ask a general question rather than a history-specific one, there seems to be no mechanism in place to manage this distinction. My proposal is to build a binary text classification system to handle the query, and to give the model strong instructions when we prompt it.
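Concretely, the routing I have in mind is something like the sketch below (all names are illustrative placeholders, not Khoj's actual code):

```python
# Sketch of query routing: classify the question first, then build a prompt
# with or without personal notes and a strong instruction either way.
def is_personal_question(question: str) -> bool:
    # Stand-in for the binary classifier (keywords, TF-IDF, or a small model).
    return "history" in question.lower()

def search_notes(question: str) -> list[str]:
    # Stand-in for the embedding search over the knowledge base.
    return [f"(top-matching note for: {question})"]

def build_prompt(question: str, notes: list[str]) -> str:
    if notes:
        context = "\n".join(notes)
        return (
            "Use the notes below if they help answer the question.\n\n"
            f"Notes:\n{context}\n\nQuestion: {question}"
        )
    # Strong instruction when the classifier says the question is general.
    return (
        "Answer from general knowledge; do not reference my personal notes.\n\n"
        f"Question: {question}"
    )

question = "Tell me a joke"
notes = search_notes(question) if is_personal_question(question) else []
print(build_prompt(question, notes))
```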
What are your thoughts on this? Or does the default LLM already handle that?
Many thanks,
Gangxin