One thing I can recommend is using promptflow to create an evaluation flow that helps you analyse how accurate your model is. Once you have batch test data and a good evaluation flow with metrics, you can start tuning your prompt and your entire flow. It is almost like Test Driven Development, but for LLMs.

You should also restrict what your model generates as much as possible, using a jinja2 template and possibly the "few-shot" technique described here: https://www.promptingguide.ai/techniques/fewshot It is also a good idea to make your assistant always quote and cite the source (the original PDF) where the information was found.

To sum it up: the task is not trivial, and the tools and techniques required will take you some time to learn, but the benefits are great, so I encourage you to do it :)
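The few-shot-plus-citation idea above can be sketched in plain Python string assembly (the reply suggests jinja2 for the templating; the structure is the same either way). The example questions, answers, and PDF names below are made up for illustration, not from any real corpus.

```python
# Minimal few-shot prompt builder with a citation requirement.
# All example Q/A pairs and source file names are hypothetical.

FEW_SHOT_EXAMPLES = [
    {
        "question": "What is the boiling point of water at sea level?",
        "answer": "100 degrees Celsius [source: physics_basics.pdf]",
    },
    {
        "question": "Who proposed the theory of general relativity?",
        "answer": "Albert Einstein [source: relativity_intro.pdf]",
    },
]

def build_prompt(context: str, question: str) -> str:
    """Assemble a few-shot prompt that instructs the model to cite its source."""
    parts = [
        "Answer ONLY from the context below. If the answer is not in the "
        "context, say you don't know. Always cite the source PDF.",
        f"Context:\n{context}",
    ]
    # Each few-shot example demonstrates the expected answer-with-citation format.
    for ex in FEW_SHOT_EXAMPLES:
        parts.append(f"Q: {ex['question']}\nA: {ex['answer']}")
    parts.append(f"Q: {question}\nA:")
    return "\n\n".join(parts)

prompt = build_prompt(
    "Water boils at 100 C at sea level. [physics_basics.pdf]",
    "At what temperature does water boil?",
)
print(prompt)
```

The same template would typically live in a jinja2 file so the retrieved context and examples can be injected per request.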
Hello,
I am building a RAG system with llama-cpp-python and LangChain's LlamaCpp, with GPU support, over a few hundred PDFs of scientific information.
I am trying to have the AI generate text that avoids hallucination and remains factual, while still explaining things well and in varied ways. I have optimized the LLM's parameters to the best of my knowledge, based on information found online.
I use the Zephyr LLM and LangChain RetrievalQA (I haven't had much luck with memory, so no memory), and FAISS for the vector database.
I was wondering how to avoid hallucination, and whether these parameters seem appropriate for the intended purpose of interrogating a large set of data while sticking to the facts?
Also, any other recommendations? For example on retrieval of references, or on adding other external sources, such as browsing a wiki online.
I load the model with the following parameters:
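The poster's actual parameter block did not survive in the thread, so here is a hedged sketch of the kind of conservative settings often chosen for factual RAG with LangChain's LlamaCpp wrapper. Every value (and the model file name) below is an assumption for illustration, not the poster's real configuration.

```python
# Hypothetical llama-cpp-python / LangChain LlamaCpp settings aimed at
# factual, low-hallucination answers. The model path and all values are
# assumptions, not the original poster's configuration.

llm_params = {
    "model_path": "zephyr-7b-beta.Q5_K_M.gguf",  # hypothetical GGUF file
    "n_gpu_layers": -1,      # offload all layers to the GPU
    "n_ctx": 4096,           # context window large enough for retrieved chunks
    "temperature": 0.1,      # low temperature reduces creative drift
    "top_p": 0.9,            # nucleus sampling cutoff
    "repeat_penalty": 1.1,   # discourage loops without distorting facts
    "max_tokens": 512,       # cap answer length
}

# With langchain installed, the dict would be unpacked into the wrapper:
# from langchain_community.llms import LlamaCpp
# llm = LlamaCpp(**llm_params)
print(llm_params["temperature"])
```

Keeping temperature low and constraining the prompt to the retrieved context tend to matter more for factuality than any single sampling knob.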
PS: llama-cpp-python is great!