
This project implements a chat (Vue) to ask questions to a local LLM/Ollama server through a Langchain/OpenAI (Python) backend.

OpenAI langchain Local Ollama chat

Tech-stack: Vue3 -> Python (langchain/openai) -> Ollama

This project implements a local AI chat, with:

  • Response pretty printing: Markdown to HTML & code highlighting
    • Formatted responses for code blocks (through an ability prompt).
  • Backend chat history with persistence (file store)
    • Delete a question/answer from the file history.
    • Delete all history.
    • User history selector.
  • LLM response live streaming: chunked streaming (see the streaming sketch after this list)
    • Stop the current streaming response.
  • Langchain callback logging (with truncated text in logs)
  • File uploader
    • Files are uploaded to the uploads folder.
    • Mention a file in the question by pressing @ and choosing it from the uploaded-files assistant.
  • Full conversation history export:
    • generating a downloadable zip file
    • extracting code blocks into linked files in the README.md (each response organized in a different folder). Export example
  • Advanced prompts
    • Multiple request parametrization:
      • The frontend can trigger several questions sequentially against the LLM. You only need to provide a {variable} placeholder in the question & set the variable values on a single line, f.ex. (see the sketch after this example):

        Generate a full example code with {variable} in python.
        
        variable=Django, Flask, NumPy, Pandas, Matplotlib, Scikit-learn, Requests
        

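For illustration, the expansion of the example above can be sketched in a few lines of Python (hypothetical names, not the project's actual frontend code):

# Minimal sketch (hypothetical): expand a {variable} template into concrete questions.
TEMPLATE = "Generate a full example code with {variable} in python."
VALUES = "Django, Flask, NumPy, Pandas, Matplotlib, Scikit-learn, Requests"

def expand_questions(template: str, values_line: str) -> list[str]:
    """Build one question per comma-separated value, substituting {variable}."""
    return [template.replace("{variable}", value.strip())
            for value in values_line.split(",") if value.strip()]

for question in expand_questions(TEMPLATE, VALUES):
    print(question)  # the frontend sends these questions to the backend one by one
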
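The chunked streaming listed above can be reproduced with Langchain's standard streaming interface. This is only a minimal sketch under assumptions: the endpoint URL, api_key and model name are placeholders, not the project's actual configuration.

from langchain_openai import ChatOpenAI

# Sketch only: base_url, api_key and model are placeholders for a local
# Ollama OpenAI-compatible endpoint.
llm = ChatOpenAI(base_url="http://localhost:11434/v1", api_key="ollama", model="<model-name>")

for chunk in llm.stream("Explain chunked streaming in one paragraph."):
    # Each chunk carries a partial answer; a backend would forward chunk.content
    # to the browser as it arrives (stopping the iteration cuts the stream short).
    print(chunk.content, end="", flush=True)
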
Watch the YouTube demo

Ollama installation & setup (required)

See README_OLLAMA.md

Project run

Run Ollama and load a model

sudo service ollama start

Run backend & frontend

See the respective README.md docs: backend & frontend

References

Langchain

OpenAI

Ollama

TODO

  1. User persistence

    1. Options: model, history, question.
    2. Save generation time in history metadata.
  2. Allow multiple questions separated by -------------------, f.ex.

  3. Collapse all responses.

  4. Prompt: @openai-local-ollama-chat Explain the project to me

     ServiceException: op not found, upload it first!
     RuntimeError: Error loading uploads/openai-local-ollama-chat
     IsADirectoryError: [Errno 21] Is a directory: 'uploads/openai-local-ollama-chat'
    
  5. K-shift error (see Known-issues):

    1. Continue doesn't generate the K-shift error; check out how.
    2. Option (front/back) to disable passing the full history to the LLM (see the sketch under Known issues below).

Known issues (todo)

Ollama throws "Deepseek2 does not support K-shift"

When the context grows too large to fit into VRAM (after several questions/answers, f.ex.), Ollama throws the following error because Deepseek2 does not support K-shift:

ollama[17132]: /go/src/github.com/ollama/ollama/llm/llama.cpp/src/llama.cpp:15110: Deepseek2 does not support K-shift
ollama[17132]: Could not attach to process.  If your uid matches the uid of the target
ollama[17132]: process, check the setting of /proc/sys/kernel/yama/ptrace_scope, or try
ollama[17132]: again as the root user.  For more details, see /etc/sysctl.d/10-ptrace.conf
ollama[17132]: ptrace: Inappropriate ioctl for device.
ollama[17132]: No stack.
ollama[17132]: The program is not being run.
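
A possible mitigation, in line with TODO item 5.2 and shown here only as a hypothetical sketch (not the project's implementation), is to cap how much history is sent to the model so the context does not grow until it no longer fits in VRAM:

# Hypothetical sketch of TODO 5.2: keep only the last N question/answer pairs
# before passing the conversation to the LLM.
def trim_history(messages: list[dict], max_pairs: int = 4) -> list[dict]:
    """Keep only the last `max_pairs` question/answer pairs (2 messages per pair)."""
    return messages[-2 * max_pairs:]

history = [{"role": "user", "content": "q"}, {"role": "assistant", "content": "a"}] * 10
print(len(trim_history(history)))  # 8 messages kept out of 20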
