This repository has been archived by the owner on Nov 13, 2024. It is now read-only.

Update README.md #135

Merged: 1 commit, Nov 2, 2023
README.md (21 changes: 10 additions & 11 deletions)
@@ -1,14 +1,14 @@
# Canopy

- **Canopy** is an open-source Retrieval Augmented Generation (RAG) framework built on top of the Pinecone vector database. Canopy enables developers to quickly and easily experiment with and build applications using Retrieval Augmented Generation (RAG).
- Canopy provides a configurable built-in server that allows users to effortlessly deploy a RAG-infused Chatbot web app using their own documents as a knowledge base.
- For advanced use cases, the canopy core library enables building your own custom retrieval-powered AI applications.
+ **Canopy** is an open-source Retrieval Augmented Generation (RAG) framework and context engine built on top of the Pinecone vector database. Canopy enables you to quickly and easily experiment with and build applications using RAG. Start chatting with your documents or text data with a few simple commands.
Contributor commented:

> Start chatting with your documents or text data with a few simple commands.

Sounds like shoppingTV

+ Canopy provides a configurable built-in server so you can effortlessly deploy a RAG-powered chat application to your existing chat UI or interface. Or you can build your own custom RAG application using the Canopy library.

Canopy is designed to be:
* **Easy to implement:** Bring your text data in Parquet or JSONL format (see the example after this list), and Canopy will handle the rest. Canopy makes it easy to incorporate RAG into your OpenAI chat applications.
* **Reliable at scale:** Build fast, highly accurate GenAI applications that are production-ready and backed by Pinecone’s vector database. Seamlessly scale to billions of items with transparent, resource-based pricing.
- * **Open and flexible:** Fully open-source, Canopy is both modular and extensible. You can configure to choose the components you need, or extend any component with your own custom implementation. Easily incorporate it into existing OpenAI applications and connect Canopy to your preferred UI.
- * **Interactive and iterative:** Evaluate your RAG workflow with a CLI based chat tool. With a simple command in the Canopy CLI you can interactively chat with your text data and compare RAG vs. non-RAG workflows side-by-side to evaluate the augmented results before scaling to production.
+ * **Open and flexible:** Fully open-source, Canopy is both modular and extensible. You can choose just the components you need, or extend any component with your own custom implementation.
+ * **Interactive and iterative:** Evaluate your RAG workflow with a CLI-based chat tool. With a simple command in the Canopy CLI you can interactively chat with your text data and compare RAG vs. non-RAG workflows side-by-side.
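
For instance, a JSONL input file holds one document per line. The records below are a hedged illustration: the `id`, `text`, `source`, and `metadata` field names follow Canopy's documented document schema, but all values here are made up.

```jsonl
{"id": "doc1", "text": "Pinecone is a fully managed vector database.", "source": "https://example.com/pinecone", "metadata": {"topic": "databases"}}
{"id": "doc2", "text": "Retrieval Augmented Generation grounds LLM answers in your own data.", "source": "https://example.com/rag", "metadata": {"topic": "rag"}}
```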

## RAG with Canopy

@@ -48,15 +48,15 @@ Learn how Canopy implements the full RAG workflow to prevent hallucinations and

## What's inside the box?

- 1. **Canopy Core Library** - Canopy has 3 API level components that are responsible for different parts of the RAG workflow:
- * **ChatEngine** _`/chat/completions`_ - implements the full RAG workflow and exposes a chat interface to interact with your data. It acts as a wrapper around the Knowledge Base and Context Engine.
- * **ContextEngine** - performs the “retrieval” part of RAG. The `ContextEngine` utilizes the underlying `KnowledgeBase` to retrieve the most relevant document chunks, then formulates a coherent textual context to be used as a prompt for the LLM.
+ 1. **Canopy Core Library** - The library has 3 components or classes that are responsible for different parts of the RAG workflow:
+ * **ChatEngine** (_`/chat/completions`_) implements the full RAG workflow. It rewrites and transforms your queries into query embeddings, generates augmented search results (via the Context Engine), and returns the response to the end user.
+ * **ContextEngine** performs the “retrieval” part of RAG. The `ContextEngine` utilizes the underlying `KnowledgeBase` to retrieve the most relevant document chunks, then formulates a coherent textual context to augment the prompt for the LLM (via an OpenAI API endpoint).

- * **KnowledgeBase** _`/context/{upsert, delete}`_ - prepares your data for the RAG workflow. It automatically chunks and transforms your text data into text embeddings before upserting them into the Pinecone vector database. It also handles Delete operations.
+ * **KnowledgeBase** (_`/context/{upsert, delete}`_) prepares your data for the RAG workflow. It automatically chunks and transforms your text data into text embeddings before upserting them into the Pinecone vector database. It also handles Delete operations.
Comment on lines +52 to +55

Contributor commented:

@gibbs-cullen I've actually already changed these descriptions yesterday, in my own PR.
Can you take a look please? I think the new phrasing is more accurate, conveying the actual responsibilities of each of these components.

Contributor (author) replied:

Ah ok, didn't see that. Feel free to move forward with this.

Contributor (author) replied:

i.e. your other PR


> More information about the Core Library usage can be found in the [Library Documentation](docs/library.md)
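
To make the division of labor concrete, here is a minimal usage sketch in Python. It is hedged, not definitive: the class and module names mirror the components above as they appear in the Canopy SDK, but exact import paths, constructor arguments, and the response schema should be verified against the Library Documentation. It also assumes the Pinecone index already exists and that API keys are set in the environment.

```python
from canopy.tokenizer import Tokenizer
from canopy.knowledge_base import KnowledgeBase
from canopy.context_engine import ContextEngine
from canopy.chat_engine import ChatEngine
from canopy.models.data_models import Document, UserMessage

# Canopy uses one globally initialized tokenizer for chunking and token counting.
Tokenizer.initialize()

# KnowledgeBase: chunks and embeds documents, then upserts them into Pinecone.
kb = KnowledgeBase(index_name="my-index")
kb.connect()  # assumes the Canopy index was already created
kb.upsert([
    Document(id="doc1",
             text="Pinecone is a fully managed vector database.",
             source="https://example.com/pinecone"),
])

# ContextEngine: retrieves the most relevant chunks and formats them
# into a textual context for the LLM.
context_engine = ContextEngine(knowledge_base=kb)

# ChatEngine: wraps the context engine to run the full RAG workflow.
chat_engine = ChatEngine(context_engine=context_engine)
response = chat_engine.chat(messages=[UserMessage(content="What is Pinecone?")])
print(response.choices[0].message.content)
```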

- 2. **Canopy Service** - a webservice that wraps the **Canopy Core** and exposes it as a REST API. The service is built on top of FastAPI, Uvicorn and Gunicorn and can be easily deployed in production. The service also comes with a built in Swagger UI for easy testing and documentation. After you [start the server](#3-start-the-canopy-service), you can access the Swagger UI at `http://host:port/docs` (default: `http://localhost:8000/docs`)
+ 2. **Canopy Service** - This is a webservice that wraps the **Canopy Core** library and exposes it as a REST API. The service is built on top of FastAPI, Uvicorn and Gunicorn and can be easily deployed in production. The service also comes with a built-in Swagger UI for easy testing and documentation. After you [start the server](#3-start-the-canopy-service), you can access the Swagger UI at `http://host:port/docs` (default: `http://localhost:8000/docs`)
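
Once the server is running, the `/chat/completions` route described above accepts OpenAI-style chat requests. A minimal sketch, assuming the default `http://localhost:8000` address and a knowledge base that has already been populated:

```python
import requests

# Assumes a Canopy service running at the default http://localhost:8000
# and documents already upserted via /context/upsert.
resp = requests.post(
    "http://localhost:8000/chat/completions",
    json={"messages": [{"role": "user", "content": "What is Canopy?"}]},
)
resp.raise_for_status()
print(resp.json())
```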

3. **Canopy CLI** - A built-in development tool that allows users to swiftly set up their own Canopy server and test its configuration.
With just three CLI commands, you can create a new Canopy service, upload your documents to it, and then interact with the Chatbot using a built-in chat application directly from the terminal. The built-in chatbot also enables comparison of RAG-infused responses against a native LLM chatbot.
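
Sketched below is what that flow typically looks like; the command names come from the Canopy CLI, while the data file name is a placeholder:

```bash
canopy new                     # create a new Canopy index in Pinecone
canopy upsert ./my-docs.jsonl  # chunk, embed, and upsert your documents
canopy start                   # launch the Canopy service locally
```

Then, in a separate terminal, `canopy chat` opens the built-in chat application against the running service.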
@@ -65,7 +65,6 @@ With just three CLI commands, you can create a new Canopy service, upload your documents

* Canopy is currently only compatible with OpenAI API endpoints for both the embedding model and the LLM. Rate limits and pricing set by OpenAI will apply.


## Setup

0. Set up a virtual environment (optional)
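
For example, a minimal sketch using Python's built-in venv module (`canopy-env` is an arbitrary directory name):

```bash
python3 -m venv canopy-env
source canopy-env/bin/activate
```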