Each example uses a `prompt.yaml` file that defines prompts for different contexts. These prompts guide the RAG model in generating appropriate responses. You can tailor these prompts to fit your specific needs and achieve the desired responses from the models.

The prompts are loaded as a Python dictionary within the application. To access this dictionary, you can use the `get_prompts()` function provided by the `utils` module. This function retrieves the complete dictionary of prompts.
Consider the following `prompt.yaml` file:

```yaml
chat_template: |
    You are a helpful, respectful and honest assistant.
    Always answer as helpfully as possible, while being safe.
    Please ensure that your responses are positive in nature.

rag_template: |
    You are a helpful AI assistant named Envie.
    You will reply to questions only based on the context that you are provided.
    If something is out of context, you will refrain from replying and politely decline to respond to the user.
```
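For reference, a file like this deserializes into a plain dictionary keyed by prompt name. The sketch below uses PyYAML directly to show the resulting structure; the chain server's own `get_prompts()` helper handles this for you, as shown next.

```python
# Minimal sketch: load prompt.yaml into a dictionary with PyYAML.
# The actual get_prompts() helper in the chain server may load the file differently;
# this only illustrates the resulting {key: prompt_text} structure.
import yaml

with open("prompt.yaml", "r", encoding="utf-8") as f:
    prompts = yaml.safe_load(f)

print(prompts.keys())            # dict_keys(['chat_template', 'rag_template'])
print(prompts["chat_template"])  # full multi-line prompt text
```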
You can access the `chat_template` using the following code in your chain server:

```python
from RAG.src.chain_server.utils import get_prompts

prompts = get_prompts()
chat_template = prompts.get("chat_template", "")
```
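Once retrieved, the template string can be wired into a prompt object the same way the chains in these examples do. The snippet below is an illustrative sketch that assumes the same LangChain imports used later in `chains.py`:

```python
# Illustrative only: combine the retrieved chat_template with a user turn,
# mirroring the pattern used in chains.py. Assumes langchain-core is installed.
from langchain_core.prompts import ChatPromptTemplate

prompt_template = ChatPromptTemplate.from_messages(
    [("system", chat_template), ("user", "{query_str}")]
)
print(prompt_template.format(query_str="What can you help me with?"))
```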
After you update the prompt, you can restart the service by performing the following steps:
- Move to the example directory:

  ```bash
  cd RAG/examples/basic_rag/llamaindex
  ```

- Start the chain server microservice:

  ```bash
  docker compose down
  docker compose up -d --build
  ```

- Go to `http://<ip>:<port>` to interact with the example.
Let's create a prompt that makes the LLM respond as if the response is coming from a pirate.
- Add the prompt to `prompt.yaml`:

  ```yaml
  pirate_prompt: |
      You are a pirate and for every question you are asked you respond in the same way.
  ```
- Update the `llm_chain` method in `chains.py` and use `pirate_prompt` to generate responses (a quick check that the new prompt is loaded is shown after these steps):

  ```python
  def llm_chain(self, query: str, chat_history: List["Message"], **kwargs) -> Generator[str, None, None]:
      """Execute a simple LLM chain using the components defined above.

      It's called when the `/generate` API is invoked with `use_knowledge_base` set to `False`.

      Args:
          query (str): Query to be answered by llm.
          chat_history (List[Message]): Conversation history between user and chain.
      """
      logger.info("Using llm to generate response directly without knowledge base.")
      prompt = prompts.get("pirate_prompt", "")
      logger.info(f"Prompt used for response generation: {prompt}")
      system_message = [("system", prompt)]
      user_input = [("user", "{query_str}")]
      prompt_template = ChatPromptTemplate.from_messages(system_message + user_input)
      llm = get_llm(**kwargs)

      # Simple langchain chain to generate response based on user's query
      chain = prompt_template | llm | StrOutputParser()
      return chain.stream({"query_str": query}, config={"callbacks": [self.cb_handler]})
  ```
- Change directory to the example:

  ```bash
  cd RAG/examples/basic_rag/llamaindex
  ```

- Start the chain server microservice:

  ```bash
  docker compose down
  docker compose up -d --build
  ```

- Go to `http://<ip>:<port>` to interact with the example.
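If responses do not pick up the pirate persona, a quick way to confirm the new key was loaded is to inspect the prompts dictionary inside the chain server environment. This is a minimal check that assumes the same import path shown earlier:

```python
# Minimal sanity check (run inside the chain server environment):
# confirm the pirate_prompt key from prompt.yaml is present in the loaded dictionary.
from RAG.src.chain_server.utils import get_prompts

prompts = get_prompts()
assert "pirate_prompt" in prompts, "pirate_prompt not found; check prompt.yaml and rebuild"
print(prompts["pirate_prompt"])
```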