updated query processing llm and ollama documentation
SubhadityaMukherjee committed Jul 22, 2024
1 parent 3171f9e commit 5398f9e
Showing 12 changed files with 43 additions and 25 deletions.
File renamed without changes.
4 changes: 4 additions & 0 deletions docs/Ollama server/index.md
@@ -0,0 +1,4 @@
# Ollama Server

- This component runs an Ollama server: essentially an optimized runtime for serving a local LLM. It does nothing by itself, but runs as a background service so that the other components can query the LLM.
- You can start it by running `cd ollama && ./get_ollama.sh &`
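- A quick way to confirm the background service is reachable before starting the query-processing component is sketched below. It assumes Ollama's default local API on port 11434 and that a `llama3` model has already been pulled; adjust both if your setup differs.

```python
# Minimal check that the Ollama background service is reachable.
# Assumes the default local API on port 11434 and a pulled "llama3" model.
import requests

OLLAMA_URL = "http://localhost:11434"

# List the locally available models (GET /api/tags).
models = requests.get(f"{OLLAMA_URL}/api/tags", timeout=10).json()
print([m["name"] for m in models.get("models", [])])

# Send a single, non-streamed prompt to the model (POST /api/generate).
payload = {"model": "llama3", "prompt": "Say hello in one word.", "stream": False}
reply = requests.post(f"{OLLAMA_URL}/api/generate", json=payload, timeout=120).json()
print(reply.get("response"))
```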
3 changes: 3 additions & 0 deletions docs/Query processing LLM/api_reference.md
@@ -0,0 +1,3 @@
:::llm_service

:::llm_service_utils
27 changes: 27 additions & 0 deletions docs/Query processing LLM/index.md
@@ -0,0 +1,27 @@
# LLM Query parsing

- This page is only an overview. Please refer to the API reference for more detailed information.
- The query parsing LLM reads the user query and parses it into a list of filters based on a prompt. The expected result is a JSON object containing the query and the list of filters to be applied to the metadata.
- This is done by prompting the RAG to extract the filters and, depending on the implementation, to return them in a structured form or not.
- The implementation is served as a FastAPI service that can be queried quite easily.

## Unstructured Implementation
- This implementation is independent of `langchain` and takes a more manual approach to parsing the filters. At the moment it does not separate the query from the filters either (the structured query implementation attempts to do that).
- The response of the LLM parser does not take into account how to apply the filters; it just provides the UI with a list of the filters that the LLM considered relevant.
- This component is the one that runs the query-processing-with-LLMs module. It uses the Ollama server to run queries and process them.
- You can start it by running `cd llm_service && uvicorn llm_service:app --host 0.0.0.0 --port 8081 &`
- Curl Example : `curl http://0.0.0.0:8081/llmquery/find%20me%20a%20mushroom%20dataset%20with%20less%20than%203000%20classes`
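- The same request can be issued from Python. The sketch below assumes the service is running locally on port 8081; the exact shape of the returned JSON (a dict of parsed filters) may differ in your version.

```python
# Same request as the curl example above, issued from Python.
import requests
from urllib.parse import quote

query = "find me a mushroom dataset with less than 3000 classes"
url = f"http://0.0.0.0:8081/llmquery/{quote(query)}"  # quote() encodes spaces as %20

response = requests.get(url, timeout=120)
response.raise_for_status()
print(response.json())  # e.g. a dict of filters extracted from the query
```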

### llm_service.py
- A prompt template is used to tell the RAG what to do.
- The `prompt_dict` defines a list of filters and their respective prompts for the LLM. This is concatenated with the prompt template.
- The response is parsed quite simply: since the LLM is asked to provide its answers line by line, each line is matched against a list of provided patterns to extract the required information.
- Thus, if you want to add a new type of answer, add a corresponding pattern to the patterns list and it will be picked up automatically. A simplified illustration of this parsing step is sketched below.
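```python
# Hypothetical illustration of the line-by-line pattern matching; the real
# patterns list and return structure are defined in llm_service_utils.py.
import re

patterns = {
    "size_of_dataset": re.compile(r"size.*?:\s*(.+)"),
    "classification_type": re.compile(r"classification.*?:\s*(.+)"),
}

# Example LLM response with one answer per line (as requested by the prompt).
response = "Size of dataset: small\nClassification type: multi-class"

answers = {}
for line in response.lower().splitlines():
    for name, pattern in patterns.items():
        match = pattern.search(line)
        if match:
            answers[name] = match.group(1).strip()

print(answers)  # {'size_of_dataset': 'small', 'classification_type': 'multi-class'}
```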

### llm_service_utils.py
- The main logic of the above (creating the langchain chain and parsing the answers) is defined here.
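- The sketch below shows how the two utilities might fit together (run from inside `llm_service/`). The prompt template, `patterns`, and `prompt_dict` values are placeholders; the real ones are defined in `llm_service.py`.

```python
# Sketch only: placeholder prompt, patterns and prompt_dict; the real values
# live in llm_service.py.
from llm_service_utils import create_chain, parse_answers_initial

prompt_template = "Answer each of the following questions on its own line.\n{query}"
patterns = []      # one pattern per expected answer type
prompt_dict = {}   # filter name -> prompt snippet, concatenated into the template

chain = create_chain(prompt_template, model="llama3", temperature=0)
raw_response = chain.invoke({"query": "mushroom dataset with fewer than 3000 classes"})
answers = parse_answers_initial(raw_response, patterns, prompt_dict)
print(answers)
```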

## Structured Query Implementation

## Additional information
- In the process of testing this implementation, a blog post was written about how the temperature parameter affects the results of the model. It can be [found here](https://openml-labs.github.io/blog/posts/Experiments-with-temperature/experiments_with_temp.html).
5 changes: 0 additions & 5 deletions docs/Query processing LLM/query_llm.md

This file was deleted.

8 changes: 0 additions & 8 deletions docs/Query processing LLM/readme.md

This file was deleted.

2 changes: 1 addition & 1 deletion docs/Rag Pipeline/Developer Tutorials/README.md
@@ -2,7 +2,7 @@

- Hello there, future OpenML contributor! It is nice meeting you here. This page is a collection of tutorials that will help you get started with contributing to the OpenML RAG pipeline.
- The tutorials show you how to perform common tasks and should make it a lot easier to get started with contributing to this project.
- Note that you need to have set up the project before you begin. If you missed this step, please refer to [](../../readme.md)
- Note that you need to have set up the project before you begin. If you missed this step, please refer to [](../../index)

## How to use them
- Once you have set up the project, just navigate to the tutorial you are interested in and open it in your IDE.
File renamed without changes.
2 changes: 2 additions & 0 deletions docs/UI/frontend.md
@@ -2,6 +2,8 @@
- This page is only an overview. Please refer to the API reference for more detailed information.
- Currently the frontend is based on Streamlit. The hope is to integrate it with the OpenML website in the future.
- This is what it looks like at the moment : ![](../images/search_ui.png)
- This component runs the Streamlit frontend. It is the UI that you see when you navigate to `http://localhost:8501`.
- You can start it by running `cd frontend && streamlit run ui.py &`
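- For orientation, a stripped-down sketch of such a Streamlit page is shown below. It is not the project's `ui.py`; the service URL and response handling are assumptions.

```python
# Minimal sketch of a Streamlit page that forwards the user's query to the
# LLM service. Not the project's ui.py; URL and response handling are assumed.
import requests
import streamlit as st
from urllib.parse import quote

st.title("OpenML dataset search")
query = st.text_input("Describe the dataset you are looking for")

if query:
    url = f"http://0.0.0.0:8081/llmquery/{quote(query)}"
    filters = requests.get(url, timeout=120).json()
    st.json(filters)  # the real UI post-processes and renders these results
```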

## Design Methodology
- The main point to note here is that the UI is responsible for all the post-processing of the results, including the metadata information that is displayed.
File renamed without changes.
4 changes: 1 addition & 3 deletions llm_service/llm_service.py
@@ -35,9 +35,7 @@
@retry(stop=stop_after_attempt(3), retry=retry_if_exception_type(ConnectTimeout))
async def get_llm_query(query: str):
"""
Description: Get the query, replace %20 with space and invoke the chain to get the answers based on the prompt
Description: Get the query, replace %20 (url spacing) with space and invoke the chain to get the answers based on the prompt
"""
query = query.replace("%20", " ")
response = chain.invoke({"query": query})
13 changes: 5 additions & 8 deletions llm_service/llm_service_utils.py
@@ -4,27 +4,24 @@
from langchain_core.prompts import ChatPromptTemplate


def create_chain(prompt, model="llama3", temperature=0):
def create_chain(prompt, model: str = "llama3", temperature: int = 0):
"""
Description: Create a chain with the given prompt and model
Description: Create a langchain chain with the given prompt, model, and temperature.
The lower the temperature, the less "creative" the model will be.
"""
llm = ChatOllama(model=model, temperature=temperature)
prompt = ChatPromptTemplate.from_template(prompt)

return prompt | llm | StrOutputParser()


def parse_answers_initial(response, patterns, prompt_dict):
def parse_answers_initial(response: str, patterns: list, prompt_dict: dict) -> dict:
"""
Description: Parse the answers from the initial response
- if the response contains a ? and a new line then join the next line with it (sometimes the LLM adds a new line after the ? instead of just printing it on the same line)
"""

answers = []
# if the response contains a ? and a new line then join the next line with it (sometimes the LLM adds a new line after the ? instead of just printing it on the same line)
response = response.replace("?\n", "?")

# convert the response to lowercase and split it into lines
