Looks good
@gibbs-cullen Thank you very much for your efforts!!
Please see a few comments and suggestions.
I guess most of the comments probably go back to @miararoy's original phrasing, but some of the new additions might also be a bit inaccurate or misleading.
README.md (Outdated)
* **Easy to implement:** Bring your text data in Parquet or JSONL format, and Canopy will handle the rest. Canopy is currently compatible with any OpenAI API endpoint.
* **Reliable at scale:** Build fast, highly accurate GenAI applications that are production-ready and backed by Pinecone’s vector database. Seamlessly scale to billions of items with transparent, resource-based pricing.
* **Open and flexible:** Fully open-source, Canopy is both modular and extensible. Deploy as a service or a library, and choose the components you need. Easily incorporate it into existing OpenAI applications and connect Canopy to your preferred UI.
* **Interactive and iterative:** Chat with your text data using a simple command in the Canopy CLI. Easily compare RAG vs. non-RAG workflows side-by-side to interactively evaluate the augmented results before scaling to production.
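As a concrete illustration of the "bring your data in Parquet or JSONL format" bullet, preparing an input file might look like the sketch below. The id/text/source/metadata record schema is an assumption based on the docs; verify it against the actual expected data format.

```python
# A minimal sketch of preparing text data for Canopy ingestion.
# Assumption: records carry id/text/source/metadata fields, as described
# in the Canopy docs - verify against the actual expected schema.
import pandas as pd

df = pd.DataFrame([
    {
        "id": "doc-1",
        "text": "Pinecone is a managed vector database for semantic search.",
        "source": "https://example.com/doc-1",  # hypothetical source URL
        "metadata": {"topic": "databases"},
    },
])

# JSONL (one JSON record per line); Parquet works similarly via to_parquet().
df.to_json("data.jsonl", orient="records", lines=True)
```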
I think we should emphasize in our phrasing that, as a development tool, you can use the CLI to experiment with the chat service.
We got feedback from our internal reviewer that they understood Canopy to be a CLI tool, when in fact it isn't (it's a chatbot backend).
+1. We should frame the evaluative element more as "Interactive Evaluation": evaluate your RAG workflow with a CLI-based chat debugging tool.
By enhancing language models with access to unlearned knowledge and infinite memory, we can build AI applications that can answer questions and assist humans without the risk of hallucinating or generating fake content. Let's learn how Canopy executes the RAG pipeline.
Learn how Canopy implements the full RAG workflow to prevent hallucinations and augment your LLM (via an OpenAI endpoint) with your own text data.
![](.readme-content/rag_flow.png)
Personally, I still believe this drawing is too complex for our front page. We should make a much simpler one, highlighting the key features, and save the actual detailed flow to the "advanced" section.
Agreed; we still need to update the diagram.
* **ChatEngine** _`/chat/completions`_ - a complete RAG unit that exposes a chat interface of an LLM augmented with a retrieval engine.
* **ContextEngine** _`/context/query`_ - a proxy between your application and Pinecone. It handles the "R" in the RAG pipeline and returns the snippet of context along with the respective source.
* **KnowledgeBase** _`/context/{upsert, delete}`_ - the data management interface. It handles the processing, chunking, and encoding (embedding) of the data, along with Upsert and Delete operations.
1. **Canopy Core Library** - Canopy has 3 API-level components that are responsible for different parts of the RAG workflow:
We're intermixing classes and APIs here, while the latter doesn't belong. The core library has Python classes like `ContextEngine`. The Canopy Server has API endpoints, like `/context/query`. These don't belong in the section describing the core library.
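To make the class-vs-endpoint split concrete, wiring the core library classes together might look roughly like the sketch below. Import paths and constructor arguments are assumptions based on the library docs; see docs/library.md for the real API.

```python
# A rough sketch of composing the core library classes (not the REST API).
# Import paths and signatures are assumptions - see docs/library.md.
from canopy.tokenizer import Tokenizer
from canopy.knowledge_base import KnowledgeBase
from canopy.context_engine import ContextEngine
from canopy.chat_engine import ChatEngine

Tokenizer.initialize()                     # one-time global tokenizer setup

kb = KnowledgeBase(index_name="my-index")  # hypothetical index name
kb.connect()                               # connect to the Pinecone index

context_engine = ContextEngine(kb)         # the "R" (retrieval) in RAG
chat_engine = ChatEngine(context_engine)   # full RAG chat on top of retrieval
```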
Noted, we can separate those.
So for the server, you can run any of these "classes" via the API endpoints?
I'll make some changes and commit them to this branch, if you don't mind.
README.md (Outdated)
* **ChatEngine** _`/chat/completions`_ - implements the full RAG workflow and exposes a chat interface to interact with your data. It acts as a wrapper around the Knowledge Base and Context Engine.
* **ContextEngine** _`/context/query`_ - performs the “retrieval” part of RAG. It rewrites and transforms your queries into query embeddings before finding the most relevant results (including citations) from Pinecone to pass along to your LLM prompt (via an OpenAI endpoint).
* **KnowledgeBase** _`/context/{upsert, delete}`_ - prepares your data for the RAG workflow. It automatically chunks and transforms your text data into text embeddings before upserting them into the Pinecone vector database. It also handles Delete operations.

> More information about the Core Library usage can be found in the [Library Documentation](docs/library.md)

2. **Canopy Service** - a web service that wraps the **Canopy Core** and exposes it as a REST API. The service is built on top of FastAPI, Uvicorn and Gunicorn and can be easily deployed in production.
I think we should adopt the term "Canopy Server" rather than "Canopy Service".
That's also how it's called in the code.
@miararoy we should probably change the CLI prints accordingly
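For reference, a call against the server's REST API might look like the sketch below, assuming a server running locally on port 8000 and the routes as written in the diff above; the actual host, port, and route prefix may differ in your deployment.

```python
# A sketch of calling the Canopy server's chat endpoint.
# Host, port, and route prefix are assumptions for a local deployment.
import requests

resp = requests.post(
    "http://localhost:8000/chat/completions",
    json={"messages": [{"role": "user", "content": "What is Canopy?"}]},
)
print(resp.json())  # OpenAI-style chat completion response
```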
## Considerations

* Canopy is currently only compatible with OpenAI API endpoints for both the embedding model and the LLM. Rate limits and pricing set by OpenAI will apply.

## Setup
That's further down the README - but I think we need to put a hard stop between steps 3 and 4 of the "Quick start".
After step 3, you have a functioning, ready-made Canopy server. We should stop there and say something like "your server is ready to be deployed as a chatbot backend!".
Then "Chat with your data" should become an optional step (not part of the actual Quickstart), saying something like, "to immediately explore and experiment with your server, you can use the built-in Chat tool".
Otherwise, again, the README might create the impression that Canopy itself is a CLI tool, meant to be used from the CLI (which is counter-productive).
Yes, I'm going to leave the quickstart part to you plus Byron.
Yes - I agree. I think step 3 is when your Canopy server is ready to go. Left this feedback in my notes as well.
The following section can explain how to evaluate; it's not a step 4 in getting started, per se.
Looks great, nothing to add on @igiloh-pinecone's comments.
I approve.
* **ChatEngine** _`/chat/completions`_ - implements the full RAG workflow and exposes a chat interface to interact with your data. It acts as a wrapper around the Knowledge Base and Context Engine.
* **ContextEngine** _`/context/query`_ - performs the “retrieval” part of RAG. It rewrites and transforms your queries into query embeddings before finding the most relevant results (including citations) from Pinecone to pass along to your LLM prompt (via an OpenAI endpoint).
* **KnowledgeBase** _`/context/{upsert, delete}`_ - prepares your data for the RAG workflow. It automatically chunks and transforms your text data into text embeddings before upserting them into the Pinecone vector database. It also handles Delete operations.
The `KnowledgeBase` is also responsible for part of the retrieval. Given a textual query, the `KnowledgeBase` is responsible for retrieving the most relevant document chunks. Then the `ContextEngine` is responsible for aggregating all this retrieved information into one coherent textual context.
(I'm not sure if we should mention that here; that might be too nuanced. But it is also inaccurate to present the `KnowledgeBase` as only doing `upsert`, since it's a crucial part of the `query` process as well.)
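Continuing the earlier library sketch (reusing its `kb` and `context_engine` objects), the retrieval split being described might look like this; the `Query` import and both method signatures are assumptions, so check the docs before relying on them.

```python
# A sketch of the retrieval split: KnowledgeBase returns relevant chunks,
# and ContextEngine aggregates them into one coherent textual context.
# The Query import and both signatures are assumptions - check the docs.
from canopy.models.data_models import Query

queries = [Query(text="What is a vector database?")]

chunks = kb.query(queries)  # most relevant document chunks
context = context_engine.query(queries, max_context_tokens=512)
```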
We can add more of those details here; these blurbs were from the blog, so they're more high-level.
So the chat engine also transforms queries into query embeddings?
Applied the suggestions I made.
Branch updated from 033c646 to 35c437c.