Will KoboldCpp consider adding RAG functionality in the future? #1239

addisjeams · 2024-11-26T11:31:32Z

Will KoboldCpp consider adding RAG functionality in the future?

Hello, I am using KoboldCpp for creating long stories.
I have run into a small issue: the keys in World Info have to be updated manually, which is a very large workload, because as the story unfolds the original content attached to each key becomes outdated.

May I ask whether you are considering adding RAG, or opening up an interface that lets us inject content from an external RAG service into the context on each submission?

Alternatively, if you have any good ideas, please let me know. Thank you.

I am currently using 32K-128K for context, but I still feel it is not enough.

addisjeams · 2024-11-28T01:54:38Z

I hope KoboldCpp can do better and become the best in the industry.

At present, I hope for the following:
1: KoboldCpp lacks RAG, and I think integrating the open-source LightRAG or LazyGraphRAG would be great.
(This is what I most urgently hope for, ideally as a deep integration, because installing a RAG tool currently means I have to give up KoboldCpp for running AI models.)

2: (I think this feature is feasible, but I don't need it.) KoboldCpp could be like LMStudio, allowing direct downloads and configuration from several AI model websites, so that a few classic models can be downloaded and run with one click. This is for beginners. Believe me, it would benefit the development of the software, even though I don't need it at all.

LostRuins · 2024-11-28T11:11:43Z

Certainly a possibility, though the desired way to add it would be as a text-search within the UI instead of requiring an embedding model.
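
As an illustration of what a text search without an embedding model could look like, here is a minimal sketch in Python. It is not KoboldCpp or KoboldAI Lite code; the function names, the stopword list, and the scoring rule (count of shared non-stopword terms) are all assumptions made for the example.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stopword list for the example.
STOPWORDS = {"a", "an", "the", "is", "in", "at", "of", "to", "and", "where"}

def tokenize(text):
    """Lowercase the text and return its words, minus stopwords."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    return [word for word in words if word not in STOPWORDS]

def keyword_search(chunks, query, top_k=2):
    """Rank story chunks by how often they contain the query's terms."""
    query_terms = set(tokenize(query))
    scored = []
    for index, chunk in enumerate(chunks):
        counts = Counter(tokenize(chunk))
        score = sum(counts[term] for term in query_terms)
        if score > 0:
            scored.append((score, index, chunk))
    # Highest score first; earlier chunks win ties.
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [chunk for _, _, chunk in scored[:top_k]]

story_chunks = [
    "Mira found the silver key in the old mill.",
    "The harvest festival began at dawn.",
    "Mira hid the key under the bridge.",
]
results = keyword_search(story_chunks, "Where is the key Mira found?")
```

A search like this needs no extra model to be loaded, at the cost of missing passages that use different wording from the query.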

addisjeams · 2024-11-29T03:20:21Z

I didn't express myself clearly.

I am hoping for a larger context, currently set at 128K. Executing a 128K context would come with a considerable cost.

I aspire to enjoy a commendable experience while minimizing expenses, considering I am an individual, not a corporation.
Therefore, I would appreciate your feedback on whether my idea is feasible.

When a user submits a request, KoboldCpp forwards the full text to the RAG service, which runs one or more queries and returns the results to KoboldCpp based on the provided key, possibly even appending the key automatically. Naturally, this addition cannot be unlimited; there must be a defined limit.
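
The flow described above (send the text to a retrieval service, take the results, and prepend them up to a fixed limit) might be sketched as follows. This is only an illustration under stated assumptions: `retrieve` stands in for whatever RAG service is used, the budget is measured in characters rather than tokens, and the `[Retrieved notes]` / `[Story]` markers are invented for the example.

```python
from typing import Callable, List

def build_context(recent_text: str,
                  retrieve: Callable[[str], List[str]],
                  memory_budget: int = 200) -> str:
    """Prepend retrieved snippets to the story text, within a size limit.

    `retrieve` is a hypothetical callable that returns snippets ordered
    from most to least relevant.
    """
    selected = []
    used = 0
    for snippet in retrieve(recent_text):
        cost = len(snippet) + 1  # +1 for the joining newline
        if used + cost > memory_budget:
            break  # stop at the first snippet that would exceed the limit
        selected.append(snippet)
        used += cost
    if not selected:
        return recent_text
    notes = "\n".join(selected)
    return f"[Retrieved notes]\n{notes}\n[Story]\n{recent_text}"

def fake_retrieve(query: str) -> List[str]:
    """Stand-in for a real RAG service; returns three fixed snippets."""
    return ["A" * 50, "B" * 50, "C" * 50]

context = build_context("The story so far...", fake_retrieve, memory_budget=110)
```

With a budget of 110 characters, only the first two 50-character snippets fit, so the third is left out and the rest of the context window stays available for the story itself.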

For instance, I enjoy crafting stories with a context length ranging from 32K to 64K. As the narrative unfolds, the previously generated keys require updating, a task that is both tedious and prone to errors.

If successfully implemented, I believe even a 24K context could yield results comparable to those achieved with a 128K context, potentially even surpassing the performance of a 1M context.

addisjeams · 2024-11-29T03:28:58Z

The feedback from RAG is typically delayed, so there's no need to wait for a submission to trigger its execution. It can be scheduled or executed upon detecting data increments, allowing for flexible configurations.
Additionally, the query results of RAG don't have to be immediate; it can comfortably tolerate a slight delay.
Naturally, the decision to wait or not to wait rests with you.
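
To make the "execute upon detecting data increments" idea concrete, here is a small sketch. It assumes the story only grows by appending text, and the class name, the threshold, and the in-memory list of chunks are all hypothetical; a real setup would hand the new text to the RAG service instead of storing it locally.

```python
class IncrementalIndexer:
    """Index only newly appended story text, once enough has accumulated."""

    def __init__(self, min_new_chars=500):
        self.min_new_chars = min_new_chars  # threshold that triggers indexing
        self.indexed_upto = 0               # offset of text already indexed
        self.chunks = []                    # stand-in for the RAG index

    def maybe_update(self, story):
        """Return True if new text was indexed, False if it was too little."""
        new_text = story[self.indexed_upto:]
        if len(new_text) < self.min_new_chars:
            return False
        self.chunks.append(new_text)
        self.indexed_upto = len(story)
        return True

indexer = IncrementalIndexer(min_new_chars=10)
first = indexer.maybe_update("short")              # below the threshold
second = indexer.maybe_update("short" + "x" * 10)  # enough new text
third = indexer.maybe_update("short" + "x" * 10)   # nothing new since
```

Because indexing runs separately from submission, a query made right after an update may miss the newest text, which matches the tolerance for slight delay described above.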

What are your thoughts? I believe KoboldCpp is simply the epitome of an AI inference tool.

Among RAG systems, Microsoft's GraphRAG is highly capable, yet it is excessively slow... I sincerely hope you can integrate the faster LightRAG or LazyGraphRAG.

Microsoft has mentioned that LazyGraphRAG will be integrated into GraphRAG. I would find integrating LightRAG, or integrating GraphRAG directly, truly remarkable.

Consider this: KoboldCpp is unparalleled for writing documents, stories, and bidding proposals. Furthermore, it has potential applications for both individuals and businesses.

I find it hard to believe how incredibly useful and remarkable this tool is.

addisjeams · 2024-11-29T03:31:34Z

Thank you for providing such a user-friendly integration tool, koboldcpp. It offers a lot of flexibility and allows for the setting of numerous parameters.
Excellent!

addisjeams · 2024-11-29T03:35:35Z

I realize I didn't express one point clearly, sorry.

What I mean is that there is no need for koboldcpp to implement what RAG has already implemented. Instead, it should directly integrate RAG as a service or call it as a component.

LostRuins · 2024-11-29T11:36:05Z

If you have a service in mind, sure, I can take a look. It sounds like this would probably be for the KoboldAI Lite side rather than koboldcpp.

addisjeams · 2024-12-01T04:26:06Z

> If you have a service in mind, sure I can take a look. It sounds like this would probably be for KoboldAI Lite side rather than koboldcpp

RecurrentGPT is not a GPT, but rather a story creation architecture that is excellent for creating stories. Currently, I am manually using it to create stories. If there were a more user-friendly and intelligent development framework, wouldn't that be really cool? Integrating frameworks like RAG (Retrieval-Augmented Generation) could also introduce keys to make the story creation with RecurrentGPT more scientific.

Currently, a 30B model is usable but not smart enough, while a 70B model is very effective.

It doesn't have to be integrated into KoboldAI; it could also be a new GitHub project. However, I would prefer to add it to KoboldAI. Personally, I think it could even become a commercial product, as many people in China are willing to pay for it. Competition in web novel writing in China is fierce, and many people are willing to spend $500 to learn how to use AI for writing, or even more to upgrade their equipment for running local AI or to rent online GPT services. Using commercial AI to write novels in China is not cheap, so writers tend to buy products that run AI locally and are willing to pay for them. There are currently quite a few adventure role-playing games in China, but most of them are free and online. There are very few practical local products, which presents a business opportunity. Online platforms are heavily censored and do not allow NSFW content, so a local application is a very appealing option.

delebash · 2024-12-14T04:42:46Z

I vote for RAG and web search. KoboldCpp with the Lite UI is really easy to get working, even the TTS and STT. I am currently testing Silly Tavern because of its RAG and web search, but your UI has most of the features I am looking for, such as type, character cards, and world info. Thanks for making KoboldCpp and the new UI. Cheers!
