
Will KoboldCpp consider adding RAG functionality in the future? #1239

Open
addisjeams opened this issue Nov 26, 2024 · 9 comments

Comments

@addisjeams

Hello, I am using KoboldCpp to create long stories.
I have run into an issue: as the story unfolds, the content associated with each key in the world info becomes outdated, so the keys have to be updated manually, which is a large amount of work.

Could you consider introducing RAG, or opening up an interface that lets us inject content from an external RAG into the context of each submission?

Alternatively, if you have any better ideas, please let me know. Thank you.

I am currently using a 32K-128K context, but it still does not feel like enough.

@addisjeams
Author

addisjeams commented Nov 28, 2024

I hope KoboldCpp keeps improving and becomes the best in the industry.

At the moment, I am hoping for two things:
1. KoboldCpp lacks RAG; I think integrating the open-source LightRAG or LazyGraphRAG would be great.
(This is what I most urgently hope for, ideally as a deep integration, because installing a separate RAG stack would mean giving up KoboldCpp for running my AI models.)

2. (I think this feature is feasible, though I don't need it myself.) KoboldCpp could work like LM Studio, allowing direct downloads and configuration from several AI model sites, enabling one-click download and running of a few classic models. This is for beginners. Believe me, it would benefit the project's adoption, even though I don't need it at all.

@LostRuins
Owner

  1. Certainly a possibility, though the preferred way to add it would be as a text search within the UI rather than requiring an embedding model.

@addisjeams
Author

I didn't express myself clearly.

What I want is a larger effective context. Mine is currently set to 128K, but actually processing a 128K context comes at a considerable cost.

I would like a good experience while minimizing expense, since I am an individual, not a corporation.
So I would appreciate your feedback on whether this idea is feasible:

When a user submits a request, koboldcpp forwards the full text to the RAG service and runs one or more queries, then injects the results back into the context based on the matched keys, possibly even adding keys automatically. Naturally, this injection cannot be unlimited; there must be a defined size limit.

For instance, I like writing stories with a context length of 32K to 64K. As the narrative unfolds, the previously created keys need updating, a task that is both tedious and error-prone.

If implemented well, I believe even a 24K context could yield results comparable to a 128K context, and perhaps even surpass a 1M context.
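The flow I have in mind could be sketched like this. Everything here is hypothetical: `retrieve` stands in for the call to the external RAG service (e.g. an HTTP query against LightRAG or GraphRAG), and `inject_rag_context` is an illustrative helper, not any koboldcpp API:

```python
# Hypothetical sketch: before each generation request, forward the prompt to
# a RAG service, then inject the retrieved passages into the context under a
# hard size limit. `retrieve` is a stand-in for the actual RAG service call.

def inject_rag_context(prompt, retrieve, budget_chars=2000):
    """Prepend retrieved passages to the prompt, never exceeding the budget."""
    passages = retrieve(prompt)           # most relevant first
    selected, used = [], 0
    for p in passages:
        if used + len(p) > budget_chars:  # enforce the defined limit
            break
        selected.append(p)
        used += len(p)
    memory = "\n".join(f"[World Info] {p}" for p in selected)
    return f"{memory}\n\n{prompt}" if memory else prompt

# Example with a fake retriever standing in for the RAG backend:
fake = lambda q: ["Alice is now queen of the northern realm.",
                  "The old capital burned down in chapter 12."]
print(inject_rag_context("Continue the story:", fake, budget_chars=200))
```

The point of the budget parameter is exactly the "defined limit" above: the injected world info can never crowd out the story itself.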

@addisjeams
Author

RAG indexing is typically not real-time anyway, so there is no need to wait for a submission to trigger it. It can run on a schedule, or whenever new data is detected, with flexible configuration.
Likewise, RAG query results do not have to be instantaneous; a slight delay is perfectly tolerable.
Naturally, whether to wait or not is up to you.
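The decoupling described above could be sketched like this. This is a minimal illustration, not an existing API: `build_index` stands in for whatever the RAG backend's indexing call actually is:

```python
# Sketch of "don't block generation": re-indexing is triggered out of band,
# only once the story has grown by some increment. `build_index` is a
# placeholder for the RAG backend's (re)indexing call.

class IncrementalIndexer:
    def __init__(self, build_index, min_growth=1000):
        self.build_index = build_index  # assumed backend hook, e.g. LightRAG insert
        self.min_growth = min_growth    # re-index only after this many new chars
        self.indexed_len = 0

    def maybe_reindex(self, story_text):
        """Re-index only when enough new text has accumulated; cheap to call often."""
        if len(story_text) - self.indexed_len >= self.min_growth:
            self.build_index(story_text)
            self.indexed_len = len(story_text)
            return True   # index refreshed
        return False      # generation proceeds against the slightly stale index
```

Generation never waits on this: queries run against a slightly stale index, which matches the "a slight delay is tolerable" point above.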

What are your thoughts? I believe koboldcpp is an ideal host for this kind of AI workflow.

Microsoft's GraphRAG is highly capable, yet it's excessively slow... I sincerely hope you can integrate the faster LightRAG or LazyGraphRAG.

Microsoft has said that LazyGraphRAG will be merged into GraphRAG. Integrating LightRAG, or GraphRAG directly, would be truly remarkable.

Consider this: koboldcpp would then be unmatched for writing documents, stories, and bid proposals, with potential applications for both individuals and businesses.

It is hard to overstate how useful and remarkable this tool could be.

@addisjeams
Author

Thank you for providing such a user-friendly integration tool, koboldcpp. It offers a lot of flexibility and allows for the setting of numerous parameters.
Excellent!

@addisjeams
Author

I realize I didn't express one point clearly; sorry.

What I mean is that koboldcpp does not need to re-implement what RAG projects have already built. Instead, it could integrate an existing RAG directly as a service, or call it as a component.

@LostRuins
Owner

If you have a service in mind, sure I can take a look. It sounds like this would probably be for KoboldAI Lite side rather than koboldcpp

@addisjeams
Author

addisjeams commented Dec 1, 2024

If you have a service in mind, sure I can take a look. It sounds like this would probably be for KoboldAI Lite side rather than koboldcpp

RecurrentGPT is not a GPT model but a story-writing architecture that is excellent for creating stories. I am currently applying it manually. If there were a more user-friendly, intelligent framework for it, wouldn't that be really cool? Integrating RAG (Retrieval-Augmented Generation) alongside it could also keep the keys up to date and make story creation with RecurrentGPT more systematic.
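A minimal sketch of the RecurrentGPT-style loop: each step the model produces the next paragraph plus an updated short-term memory and a plan for the following step, while long-term memory is fetched by retrieval (the RAG role above). `llm_step` and `retrieve` are hypothetical stand-ins, not real APIs:

```python
# RecurrentGPT-style loop, sketched. `llm_step` stands in for the model call
# (returns next paragraph, updated short-term memory, and next plan);
# `retrieve` stands in for the long-term-memory / RAG lookup.

def write_story(llm_step, retrieve, opening, steps=3):
    paragraphs = [opening]
    memory, plan = opening, "continue the opening scene"
    for _ in range(steps):
        recalled = retrieve(plan)  # long-term memory fetched via retrieval
        paragraph, memory, plan = llm_step(memory, plan, recalled)
        paragraphs.append(paragraph)
    return "\n\n".join(paragraphs)
```

The short-term memory and plan replace most of the raw history in the prompt, which is why a modest context window can go a long way here.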

Currently, a 30B-class AI model is usable but not smart enough, while a 70B model is very effective.

It doesn't have to be integrated into KoboldAI; it could also be a new GitHub project, although I would prefer to see it added to KoboldAI. Personally, I think it could even become a commercial product, as many people in China are willing to pay for it.

Competition in Chinese web-novel writing is fierce: many people will spend $500 to learn how to use AI for writing, or even more to upgrade their hardware for local AI or to rent online GPT services. Using commercial AI to write novels in China is not cheap, so writers tend to buy products that run AI locally and are willing to pay for them.

There are quite a few online adventure role-playing games in China, but most are free and online-only; practical local products are rare, which presents a business opportunity. Online platforms are heavily censored and do not allow NSFW content, so a local application is a very appealing option.

@delebash

I vote for RAG and web search. koboldcpp with the Lite UI is really easy to get working, even the TTS/STT. I am currently testing SillyTavern because of its RAG and web search, but your UI has most of the features I am looking for, such as type, character cards, and world info. Thanks for making koboldcpp and the new UI. Cheers!
