Merge branch 'arc53:main' into main

arc53 · Oct 23, 2023 · 6f544f5
2 parents fe866b2 + 465c4af
commit 6f544f5
Showing 11 changed files with 111 additions and 66 deletions.
diff --git a/README.md b/README.md
@@ -7,7 +7,7 @@
 </p>
 
 <p align="left">
-  <strong><a href="https://docsgpt.arc53.com/">DocsGPT</a></strong> is a cutting-edge open-source solution that streamlines the process of finding information in project documentation. With its integration of the powerful <strong>GPT</strong> models, developers can easily ask questions about a project and receive accurate answers.
+  <strong><a href="https://docsgpt.arc53.com/">DocsGPT</a></strong> is a cutting-edge open-source solution that streamlines the process of finding information in the project documentation. With its integration of the powerful <strong>GPT</strong> models, developers can easily ask questions about a project and receive accurate answers.
 
 Say goodbye to time-consuming manual searches, and let <strong><a href="https://docsgpt.arc53.com/">DocsGPT</a></strong> help you quickly find the information you need. Try it out and see how it revolutionizes your project documentation experience. Contribute to its development and be a part of the future of AI-powered assistance.
 </p>
@@ -21,61 +21,56 @@ Say goodbye to time-consuming manual searches, and let <strong><a href="https://
 
 </div>
 
-### Production Support / Help for companies: 
+### Production Support / Help for companies:
 
 We're eager to provide personalized assistance when deploying your DocsGPT to a live environment.
+
 - [Book Demo 👋](https://airtable.com/appdeaL0F1qV8Bl2C/shrrJF1Ll7btCJRbP)
 - [Send Email ✉️](mailto:[email protected]?subject=DocsGPT%20support%2Fsolutions)
-  
+
 ### [🎉 Join the Hacktoberfest with DocsGPT and Earn a Free T-shirt! 🎉](https://github.com/arc53/DocsGPT/blob/main/HACKTOBERFEST.md)
 
 ![video-example-of-docs-gpt](https://d3dg1063dc54p9.cloudfront.net/videos/demov3.gif)
 
-
 ## Roadmap
 
 You can find our roadmap [here](https://github.com/orgs/arc53/projects/2). Please don't hesitate to contribute or create issues, it helps us improve DocsGPT!
 
 ## Our Open-Source models optimized for DocsGPT:
 
-| Name              | Base Model | Requirements (or similar)                        |
-|-------------------|------------|----------------------------------------------------------|
-| [Docsgpt-7b-falcon](https://huggingface.co/Arc53/docsgpt-7b-falcon)  | Falcon-7b  |  1xA10G gpu   |
-| [Docsgpt-14b](https://huggingface.co/Arc53/docsgpt-14b)              | llama-2-14b    | 2xA10 gpu's   |
-| [Docsgpt-40b-falcon](https://huggingface.co/Arc53/docsgpt-40b-falcon)       | falcon-40b     | 8xA10G gpu's  |
-
+| Name                                                                  | Base Model  | Requirements (or similar) |
+| --------------------------------------------------------------------- | ----------- | ------------------------- |
+| [Docsgpt-7b-falcon](https://huggingface.co/Arc53/docsgpt-7b-falcon)   | Falcon-7b   | 1xA10G gpu                |
+| [Docsgpt-14b](https://huggingface.co/Arc53/docsgpt-14b)               | llama-2-14b | 2xA10 gpu's               |
+| [Docsgpt-40b-falcon](https://huggingface.co/Arc53/docsgpt-40b-falcon) | falcon-40b  | 8xA10G gpu's              |
 
 If you don't have enough resources to run it, you can use bitsnbytes to quantize.
 
-
 ## Features
 
 ![Group 9](https://user-images.githubusercontent.com/17906039/220427472-2644cff4-7666-46a5-819f-fc4a521f63c7.png)
 
-
 ## Useful links
 
- - 🔍🔥 [Live preview](https://docsgpt.arc53.com/)
-
- - 💬🎉[Join our Discord](https://discord.gg/n5BX8dh8rU)
-
- - 📚😎 [Guides](https://docs.docsgpt.co.uk/)
-
- - 👩‍💻👨‍💻 [Interested in contributing?](https://github.com/arc53/DocsGPT/blob/main/CONTRIBUTING.md)
+- 🔍🔥 [Live preview](https://docsgpt.arc53.com/)
 
- - 🗂️🚀 [How to use any other documentation](https://docs.docsgpt.co.uk/Guides/How-to-train-on-other-documentation)
+- 💬🎉[Join our Discord](https://discord.gg/n5BX8dh8rU)
 
- - 🏠🔐  [How to host it locally (so all data will stay on-premises)](https://docs.docsgpt.co.uk/Guides/How-to-use-different-LLM)
+- 📚😎 [Guides](https://docs.docsgpt.co.uk/)
 
+- 👩‍💻👨‍💻 [Interested in contributing?](https://github.com/arc53/DocsGPT/blob/main/CONTRIBUTING.md)
 
+- 🗂️🚀 [How to use any other documentation](https://docs.docsgpt.co.uk/Guides/How-to-train-on-other-documentation)
 
+- 🏠🔐 [How to host it locally (so all data will stay on-premises)](https://docs.docsgpt.co.uk/Guides/How-to-use-different-LLM)
 
 ## Project structure
+
 - Application - Flask app (main application).
 
 - Extensions - Chrome extension.
 
-- Scripts - Script that creates similarity search index for other libraries. 
+- Scripts - Script that creates similarity search index for other libraries.
 
 - Frontend - Frontend uses Vite and React.
 
@@ -92,30 +87,30 @@ It will install all the dependencies and allow you to download the local model o
 Otherwise, refer to this Guide:
 
 1. Download and open this repository with `git clone https://github.com/arc53/DocsGPT.git`
-2. Create a `.env` file in your root directory and set the env variable `OPENAI_API_KEY` with your [OpenAI API key](https://platform.openai.com/account/api-keys) and  `VITE_API_STREAMING` to true or false, depending on if you want streaming answers or not.
+2. Create a `.env` file in your root directory and set the env variable `OPENAI_API_KEY` with your [OpenAI API key](https://platform.openai.com/account/api-keys) and `VITE_API_STREAMING` to true or false, depending on whether you want streaming answers or not.
    It should look like this inside:
-   
+
    ```
    API_KEY=Yourkey
    VITE_API_STREAMING=true
    ```
+
    See optional environment variables in the [/.env-template](https://github.com/arc53/DocsGPT/blob/main/.env-template) and [/application/.env_sample](https://github.com/arc53/DocsGPT/blob/main/application/.env_sample) files.
+
 3. Run [./run-with-docker-compose.sh](https://github.com/arc53/DocsGPT/blob/main/run-with-docker-compose.sh).
 4. Navigate to http://localhost:5173/.
 
 To stop, just run `Ctrl + C`.
 
-
-
-
-
 ## Development environments
 
 ### Spin up mongo and redis
-For development, only two containers are used from [docker-compose.yaml](https://github.com/arc53/DocsGPT/blob/main/docker-compose.yaml) (by deleting all services except for Redis and Mongo). 
+
+For development, only two containers are used from [docker-compose.yaml](https://github.com/arc53/DocsGPT/blob/main/docker-compose.yaml) (by deleting all services except for Redis and Mongo).
 See file [docker-compose-dev.yaml](./docker-compose-dev.yaml).
 
 Run
+
 ```
 docker compose -f docker-compose-dev.yaml build
 docker compose -f docker-compose-dev.yaml up -d
@@ -131,53 +126,62 @@ Make sure you have Python 3.10 or 3.11 installed.
 (check out [`application/core/settings.py`](application/core/settings.py) if you want to see more config options.)
 
 2. (optional) Create a Python virtual environment:
-You can follow the [Python official documentation](https://docs.python.org/3/tutorial/venv.html) for virtual environments.
+   You can follow the [Python official documentation](https://docs.python.org/3/tutorial/venv.html) for virtual environments.
 
 a) On Mac OS and Linux
+
 ```commandline
 python -m venv venv
 . venv/bin/activate
 ```
+
 b) On Windows
+
 ```commandline
 python -m venv venv
  venv/Scripts/activate
 ```
 
 3. Change to the `application/` subdir by the command `cd application/` and install dependencies for the backend:
+
 ```commandline
 pip install -r requirements.txt
 ```
+
 4. Run the app using `flask run --host=0.0.0.0 --port=7091`.
 5. Start worker with `celery -A application.app.celery worker -l INFO`.
 
-### Start frontend 
+### Start frontend
 
 Make sure you have Node version 16 or higher.
 
 1. Navigate to the [/frontend](https://github.com/arc53/DocsGPT/tree/main/frontend) folder.
-2. Install required packages `husky` and `vite` (ignore if installed).
+2. Install the required packages `husky` and `vite` (ignore if already installed).
+
 ```commandline
 npm install husky -g
 npm install vite -g
 ```
+
 3. Install dependencies by running `npm install --include=dev`.
 4. Run the app using `npm run dev`.
 
-
 ## Contributing
-Please refer to the [CONTRIBUTING.md](CONTRIBUTING.md) file for information about how to get involved. We welcome issues, questions, and pull requests. 
+
+Please refer to the [CONTRIBUTING.md](CONTRIBUTING.md) file for information about how to get involved. We welcome issues, questions, and pull requests.
 
 ## Code Of Conduct
+
 We as members, contributors, and leaders, pledge to make participation in our community a harassment-free experience for everyone, regardless of age, body size, visible or invisible disability, ethnicity, sex characteristics, gender identity and expression, level of experience, education, socio-economic status, nationality, personal appearance, race, religion, or sexual identity and orientation. Please refer to the [CODE_OF_CONDUCT.md](CODE_OF_CONDUCT.md) file for more information about contributing.
 
 ## Many Thanks To Our Contributors
 
-<a href="[https://github.com/arc53/DocsGPT/graphs/contributors](https://docsgpt.arc53.com/)">
-  <img src="https://contrib.rocks/image?repo=arc53/DocsGPT" />
+<a href="[https://github.com/arc53/DocsGPT/graphs/contributors](https://docsgpt.arc53.com/)" alt="View Contributors">
+  <img src="https://contrib.rocks/image?repo=arc53/DocsGPT" alt="Contributors" />
 </a>
 
 ## License
+
 The source code license is [MIT](https://opensource.org/license/mit/), as described in the [LICENSE](LICENSE) file.
 
 Built with [🦜️🔗 LangChain](https://github.com/hwchase17/langchain)
diff --git a/application/api/user/routes.py b/application/api/user/routes.py
@@ -84,6 +84,19 @@ def api_feedback():
     )
     return {"status": http.client.responses.get(response.status_code, "ok")}
 
+@user.route("/api/delete_by_ids", methods=["get"])
+def delete_by_ids():
+    """Delete by ID. These are the IDs in the vectorstore"""
+
+    ids = request.args.get("path")
+    if not ids:
+        return {"status": "error"}
+
+    if settings.VECTOR_STORE == "faiss":
+        result = vectors_collection.delete_index(ids=ids)
+        if result:
+            return {"status": "ok"}
+    return {"status": "error"}
 
 @user.route("/api/delete_old", methods=["get"])
 def delete_old():
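
Note on the `delete_by_ids` hunk above: the endpoint reads the `path` query parameter as a single string and passes it straight to `delete_index`. If the underlying store expects a list of IDs, the string would need to be split first. The helper below is a minimal sketch of that step; the name `parse_ids` and the comma-separated format are assumptions for illustration and are not part of this commit.

```python
def parse_ids(raw):
    """Split a comma-separated query-string value into a list of IDs.

    Hypothetical helper: the comma-separated format is an assumption,
    not something the commit defines.
    """
    if not raw:
        return []
    # Drop surrounding whitespace and skip empty fragments such as "a,,b".
    return [part.strip() for part in raw.split(",") if part.strip()]


ids = parse_ids("doc-1, doc-2,,doc-3")
```

With a helper like this, the route could return the error status when the parsed list is empty, which covers both a missing parameter and one that contains only separators.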

diff --git a/application/vectorstore/faiss.py b/application/vectorstore/faiss.py
@@ -27,6 +27,9 @@ def add_texts(self, *args, **kwargs):
     def save_local(self, *args, **kwargs):
         return self.docsearch.save_local(*args, **kwargs)
 
+    def delete_index(self, *args, **kwargs):
+        return self.docsearch.delete(*args, **kwargs)
+
     def assert_embedding_dimensions(self, embeddings):
         """
         Check that the word embedding dimension of the docsearch index matches
@@ -40,5 +43,4 @@ def assert_embedding_dimensions(self, embeddings):
             docsearch_index_dimension = self.docsearch.index.d
             if word_embedding_dimension != docsearch_index_dimension:
                 raise ValueError(f"word_embedding_dimension ({word_embedding_dimension}) " +
-                                 f"!= docsearch_index_word_embedding_dimension ({docsearch_index_dimension})")
-
+                                 f"!= docsearch_index_word_embedding_dimension ({docsearch_index_dimension})")
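
Note on the `delete_index` hunk above: the new method only forwards its arguments to `self.docsearch.delete`. The sketch below shows that delegation pattern in isolation. `InMemoryDocsearch` and `VectorStoreWrapper` are hypothetical stand-ins invented for this example; they are not the classes touched by the commit, and the return values are an assumption about how a store might report success.

```python
class InMemoryDocsearch:
    """Hypothetical stand-in for the wrapped store (not the real FAISS class)."""

    def __init__(self, docs):
        self.docs = dict(docs)

    def delete(self, ids=None):
        # Refuse to delete when no IDs are given or any ID is unknown.
        if not ids or any(doc_id not in self.docs for doc_id in ids):
            return False
        for doc_id in ids:
            self.docs.pop(doc_id, None)
        return True


class VectorStoreWrapper:
    """Hypothetical wrapper that mirrors the shape of the added method."""

    def __init__(self, docsearch):
        self.docsearch = docsearch

    def delete_index(self, *args, **kwargs):
        # Pure delegation: the wrapper adds no logic of its own.
        return self.docsearch.delete(*args, **kwargs)


store = VectorStoreWrapper(InMemoryDocsearch({"a": "doc A", "b": "doc B"}))
deleted = store.delete_index(ids=["a"])
remaining = sorted(store.docsearch.docs)
```

Because the wrapper is a pass-through, any validation of the IDs has to happen either in the caller or in the wrapped store.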
diff --git a/docs/README.md b/docs/README.md
@@ -50,4 +50,4 @@ yarn dev
 
 - Now, you should be able to view the docs on your local environment by visiting `http://localhost:5000`. You can explore the different markdown files and make changes as you see fit.
 
-- Footnotes: This guide assumes you have Node.js and npm installed. The guide involves running a local server using yarn, and viewing the documentation offline. If you encounter any issues, it may be worth verifying your Node.js and npm installations and whether you have installed yarn correctly.
+- **Footnotes:** This guide assumes you have Node.js and npm installed. The guide involves running a local server using yarn, and viewing the documentation offline. If you encounter any issues, it may be worth verifying your Node.js and npm installations and whether you have installed yarn correctly.
diff --git a/docs/pages/Extensions/react-widget.md b/docs/pages/Extensions/react-widget.md
@@ -14,9 +14,9 @@ import "docsgpt/dist/style.css";
 Then you can use it like this: `<DocsGPTWidget />`
 
 DocsGPTWidget takes 3 props:
-- `apiHost` — URL of your DocsGPT API.
-- `selectDocs` — documentation that you want to use for your widget (e.g. `default` or `local/docs1.zip`).
-- `apiKey` — usually it's empty.
+1. `apiHost` — URL of your DocsGPT API.
+2. `selectDocs` — documentation that you want to use for your widget (e.g. `default` or `local/docs1.zip`).
+3. `apiKey` — usually it's empty.
 
 ### How to use DocsGPTWidget with [Nextra](https://nextra.site/) (Next.js + MDX)
 Install your widget as described above and then go to your `pages/` folder and create a new file `_app.js` with the following content:

diff --git a/docs/pages/Guides/Customising-prompts.md b/docs/pages/Guides/Customising-prompts.md
@@ -1,4 +1,27 @@
-## To customize a main prompt, navigate to `/application/prompt/combine_prompt.txt`
+# Customizing the Main Prompt
 
-You can try editing it to see how the model responses.
+To customize the main prompt for DocsGPT, follow these steps:
+
+1. Navigate to `/application/prompt/combine_prompt.txt`.
+
+2. Edit the `combine_prompt.txt` file to modify the prompt text. You can experiment with different phrasings and structures to see how the model responds.
+
+## Example Prompt Modification
+
+**Original Prompt:**
+```markdown
+You are a DocsGPT, friendly and helpful AI assistant by Arc53 that provides help with documents. You give thorough answers with code examples if possible.
+Use the following pieces of context to help answer the users question. If its not relevant to the question, provide friendly responses.
+You have access to chat history, and can use it to help answer the question.
+When using code examples, use the following format:
+
+(code)
+{summaries}
+```
+
+
+
+## Conclusion
+
+Customizing the main prompt for DocsGPT allows you to tailor the AI's responses to your unique requirements. Whether you need in-depth explanations, code examples, or specific insights, you can achieve it by modifying the main prompt. Remember to experiment and fine-tune your prompts to get the best results.
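
Note on the prompt guide above: the template ends with a `{summaries}` placeholder, which suggests that retrieved context is substituted into the prompt before it is sent to the model. The sketch below illustrates that substitution with plain string replacement. The function name `render_prompt` and the way chunks are joined are assumptions for illustration; the application may fill the template differently.

```python
def render_prompt(template, summaries):
    """Insert retrieved context into a prompt template.

    Hypothetical helper: joins the context chunks with blank lines and
    replaces the literal "{summaries}" placeholder.
    """
    context = "\n\n".join(summaries)
    # str.replace avoids errors if the template contains other braces.
    return template.replace("{summaries}", context)


template = "Use the following pieces of context.\n{summaries}"
prompt = render_prompt(template, ["First chunk.", "Second chunk."])
```

Rendering an edited template this way, with a few sample chunks, is a quick way to check how the final prompt reads before running the full application.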
 
diff --git a/docs/pages/Guides/How-to-train-on-other-documentation.md b/docs/pages/Guides/How-to-train-on-other-documentation.md
@@ -12,28 +12,28 @@ It currently uses OPEN_AI to create the vector store, so make sure your document
 You can usually find documentation on Github in `docs/` folder for most open-source projects.
 
 ### 1. Find documentation in .rst/.md and create a folder with it in your scripts directory
-- Name it `inputs/`  
-- Put all your .rst/.md files in there  
-- The search is recursive, so you don't need to flatten them
+- Name it `inputs/`.
+- Put all your .rst/.md files in there.  
+- The search is recursive, so you don't need to flatten them.
 
-If there are no .rst/.md files just convert whatever you find to .txt and feed it. (don't forget to change the extension in script)
+If there are no .rst/.md files just convert whatever you find to .txt file and feed it. (don't forget to change the extension in script)
 
 ### 2. Create .env file in `scripts/` folder
 And write your OpenAI API key inside
-`OPENAI_API_KEY=<your-api-key>`
+`OPENAI_API_KEY=<your-api-key>`.
 
 ### 3. Run scripts/ingest.py
 
 `python ingest.py ingest`
 
-It will tell you how much it will cost
+It will tell you how much it will cost.
 
 ### 4. Move `index.faiss` and `index.pkl` generated in `scripts/output` to `application/` folder. 
 
 
 ### 5. Run web app
-Once you run it will use new context that is relevant to your documentation  
-Make sure you select default in the dropdown in the UI
+Once you run it will use new context that is relevant to your documentation.  
+Make sure you select default in the dropdown in the UI.
 
 ## Customization 
 You can learn more about options while running ingest.py by running:

diff --git a/docs/pages/Guides/How-to-use-different-LLM.md b/docs/pages/Guides/How-to-use-different-LLM.md
@@ -1,10 +1,10 @@
-Fortunately, there are many providers for LLM's and some of them can even be run locally
+Fortunately, there are many providers for LLMs, and some of them can even be run locally.
 
 There are two models used in the app:
 1. Embeddings.
 2. Text generation.
 
-By default, we use OpenAI's models but if you want to change it or even run it locally, it's very simple!
+By default, we use OpenAI's models, but if you want to change it or even run it locally, it's very simple!
 
 ### Go to .env file or set environment variables:
 
@@ -31,6 +31,6 @@ Alternatively, if you wish to run Llama locally, you can run `setup.sh` and choo
 That's it!
 
 ### Hosting everything locally and privately (for using our optimised open-source models)
-If you are working with important data and don't want anything to leave your premises.
+If you are working with critical data and don't want anything to leave your premises.
 
-Make sure you set `SELF_HOSTED_MODEL` as true in your `.env` variable and for your `LLM_NAME` you can use anything that's on Hugging Face.
+Make sure you set `SELF_HOSTED_MODEL` as true in your `.env` variable, and for your `LLM_NAME`, you can use anything that is on Hugging Face.
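
Note on the self-hosting guide above: the two settings it names, `SELF_HOSTED_MODEL` and `LLM_NAME`, are ordinary environment variables. The sketch below shows one way such values could be read and normalised. The function `load_llm_settings`, its defaults, and the model name used in the example are assumptions for illustration; the application's own handling lives in its settings module and may differ.

```python
import os


def load_llm_settings(env=None):
    """Read the two variables named in the guide from a mapping.

    Hypothetical helper: defaults and parsing rules are assumptions.
    """
    env = os.environ if env is None else env
    return {
        # Treat "true" in any letter case as enabled; anything else as disabled.
        "self_hosted": env.get("SELF_HOSTED_MODEL", "false").strip().lower() == "true",
        "llm_name": env.get("LLM_NAME", ""),
    }


# "example-org/example-model" is a placeholder, not a real model name.
settings = load_llm_settings(
    {"SELF_HOSTED_MODEL": "True", "LLM_NAME": "example-org/example-model"}
)
```

Passing a plain dictionary instead of `os.environ` makes it easy to try different combinations without editing the `.env` file.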
diff --git a/frontend/src/About.tsx b/frontend/src/About.tsx
@@ -4,7 +4,7 @@
 export default function About() {
   return (
     <div className="mx-5 grid min-h-screen md:mx-36">
-      <article className="place-items-left mx-auto my-auto mt-20 flex w-full max-w-6xl flex-col gap-6 rounded-3xl bg-gray-100 p-6 text-jet lg:p-10 xl:p-16">
+      <article className="place-items-left mx-auto my-auto flex w-full max-w-6xl flex-col gap-4 rounded-3xl bg-gray-100 p-6 text-jet lg:p-6 xl:p-10">
         <div className="flex items-center">
           <p className="mr-2 text-3xl">About DocsGPT</p>
           <p className="text-[21px]">🦖</p>
@@ -54,7 +54,7 @@ export default function About() {
           Currently It uses{' '}
           <span className="text-blue-950 font-medium">DocsGPT</span>{' '}
           documentation, so it will respond to information relevant to{' '}
-          <span className="text-blue-950 font-medium">DocsGPT</span> . If you
+          <span className="text-blue-950 font-medium">DocsGPT</span>. If you
           want to train it on different documentation - please follow
           <a
             className="text-blue-500"

diff --git a/frontend/src/Navigation.tsx b/frontend/src/Navigation.tsx
@@ -362,7 +362,7 @@ export default function Navigation({ navOpen, setNavOpen }: NavigationProps) {
           </a>
         </div>
       </div>
-      <div className="fixed h-16 w-full border-b-2 bg-gray-50 md:hidden">
+      <div className="fixed z-10 h-16 w-full border-b-2 bg-gray-50 md:hidden">
         <button
           className="mt-5 ml-6 h-6 w-6 md:hidden"
           onClick={() => setNavOpen(true)}