Will KoboldCpp consider adding RAG functionality in the future? #1239
Comments
I hope KoboldCpp can keep improving and become the best in the industry. At present, we hope for: 2: (I think this feature is feasible, but I don't need it.) KoboldCpp should work like LM Studio, allowing direct downloads and configuration from several AI model websites, so that several classic models can be downloaded and run with one click. This is meant for beginners. Believe me, this would benefit the software's development, even though I don't need it at all.
I didn't express myself clearly. I am hoping for a larger context; it is currently set at 128K, and running a 128K context is costly. I want a good experience at low cost, since I am an individual, not a corporation. When a user submits a request, koboldcpp would forward the full text to the RAG service, which runs one or more queries and returns the results to koboldcpp based on the provided key, possibly even appending the key automatically. Naturally, this addition cannot be unlimited; there must be a defined limit. For instance, I enjoy writing stories with a context length of 32K to 64K. As the narrative unfolds, the previously written keys need updating, a task that is both tedious and error-prone. If this were implemented, I believe even a 24K context could yield results comparable to a 128K context, and potentially even surpass a 1M context.
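To make the flow described above concrete, here is a minimal sketch of "retrieve, then inject with a hard limit". It is not KoboldCpp code and not a real RAG service: `retrieve`, `build_prompt`, the `[Note]` prefix, the stopword list, and the character budget `max_chars` are all hypothetical names chosen for illustration, and simple word overlap stands in for whatever retrieval a real service would use.

```python
import re

# Hypothetical stopword list so that common words do not count as matches.
STOPWORDS = {"the", "a", "an", "and", "of", "to", "is", "in"}


def tokenize(text):
    """Lowercase the text and return its set of non-stopword tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower())) - STOPWORDS


def retrieve(query, passages, max_chars):
    """Rank passages by word overlap with the query, then keep the
    best-scoring ones whose combined length stays within max_chars."""
    q = tokenize(query)
    scored = []
    for i, passage in enumerate(passages):
        score = len(q & tokenize(passage))
        if score > 0:
            scored.append((score, i, passage))
    # Highest score first; ties keep the original passage order.
    scored.sort(key=lambda t: (-t[0], t[1]))

    picked, used = [], 0
    for _, _, passage in scored:
        if used + len(passage) > max_chars:
            continue  # enforce the limit: skip anything that would overflow
        picked.append(passage)
        used += len(passage)
    return picked


def build_prompt(user_text, passages, max_chars=400):
    """Prepend the retrieved notes to the text that would be submitted."""
    notes = retrieve(user_text, passages, max_chars)
    if not notes:
        return user_text
    header = "\n".join(f"[Note] {n}" for n in notes)
    return f"{header}\n\n{user_text}"
```

The budget is what keeps the injected material from growing without bound: lowering `max_chars` drops the weaker matches first, so a smaller context still receives the most relevant notes.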
RAG feedback is typically delayed, so there is no need to wait for a submission to trigger it. It could run on a schedule or whenever new data is detected, with flexible configuration. What are your thoughts? I believe koboldcpp is simply the epitome of AI reasoning mode. Microsoft's GraphRAG is highly capable, yet excessively slow... I sincerely hope you can integrate the faster LightRAG or LazyGraphRAG. Microsoft has mentioned that LazyGraphRAG will be integrated into GraphRAG. I would find integrating LightRAG, or GraphRAG directly, truly remarkable. Consider this: koboldcpp is unparalleled for writing documents, stories, and bidding proposals. It also has potential applications for both individuals and businesses. I find it hard to believe how incredibly useful and remarkable this tool is.
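As a rough illustration of triggering on data increments rather than on each submission, the sketch below re-indexes only once enough new story text has accumulated. The class name `IncrementalIndex`, the threshold `min_new_chars`, and the plain list used as an "index" are hypothetical; it also assumes the story only grows by appending text.

```python
class IncrementalIndex:
    """Toy index that ingests new story text only after it has grown
    by at least min_new_chars characters since the last update."""

    def __init__(self, min_new_chars=200):
        self.min_new_chars = min_new_chars
        self.indexed_upto = 0  # number of story characters already indexed
        self.chunks = []       # stand-in for a real retrieval index

    def maybe_update(self, story):
        """Index the not-yet-seen tail of the story if it is long enough.
        Returns True when an update happened, False otherwise."""
        new_chars = len(story) - self.indexed_upto
        if new_chars < self.min_new_chars:
            return False
        self.chunks.append(story[self.indexed_upto:])
        self.indexed_upto = len(story)
        return True
```

A scheduled variant would call `maybe_update` from a timer instead of after each generation; the threshold check stays the same either way.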
Thank you for providing such a user-friendly integration tool, koboldcpp. It offers a lot of flexibility and allows for the setting of numerous parameters.
I realize I didn't express one point clearly, sorry. What I mean is that there is no need for koboldcpp to implement what RAG has already implemented. Instead, it should directly integrate RAG as a service or call it as a component.
If you have a service in mind, sure, I can take a look. It sounds like this would probably be for the KoboldAI Lite side rather than koboldcpp.
RecurrentGPT is not a GPT, but rather a story creation architecture that is excellent for creating stories. Currently, I am manually using it to create stories. If there were a more user-friendly and intelligent development framework, wouldn't that be really cool? Integrating frameworks like RAG (Retrieval-Augmented Generation) could also introduce keys to make the story creation with RecurrentGPT more scientific.
I vote for RAG and web search. koboldcpp with the Lite UI is really easy to get working, even the TTS and STT. Currently I am testing SillyTavern because of its RAG and web search, but your UI has most of the features I am looking for, such as type, character cards, and world info. Thanks for making koboldcpp and the new UI. Cheers!
Will KoboldCpp consider adding RAG functionality in the future?
Hello, I am using KoboldCpp for creating long stories.
I have run into a small issue: the keys in the world info have to be updated manually, which is a very large workload, because as the story unfolds the original content corresponding to each key becomes outdated.
May I ask whether you are considering introducing RAG, or opening up an interface that lets us inject content from other RAG systems into the context of each submission?
Alternatively, if you have any good ideas, please let me know. Thank you.
I am currently using 32K-128K for context, but I still feel it is not enough.