A fast & fun way to build portable cloud-native applications
Query a custom knowledge base using vector similarity queries against a vector database (Supabase).
- Generate embeddings for sections of text in the Nitric documentation, which is written in markdown.
- Store the embeddings in the database.
- Query the embeddings by running a similarity test between the vectorized query and the database entries.
- Process all results with an OpenAI completion model to produce human-friendly output in markdown format.
This sample project is inspired by Supabase's ClippyGPT. It makes it possible to search the Nitric documentation, or your own markdown content, via an API.
Follow the steps in the installation guide
Create the following tables:
page -
Name | Description | Data Type | Format |
---|---|---|---|
id | No description | bigint | int8 |
path | No description | text | text |
page_section -
Name | Description | Data Type | Format |
---|---|---|---|
id | No description | bigint | int8 |
content | No description | text | text |
token_count | No description | bigint | int8 |
embedding | No description | USER-DEFINED | vector |
page_id | No description | bigint | int8 |
heading | No description | text | text |
Name | Type |
---|---|
embedding | |
match_threshold | |
match_count | integer |
min_content_length | integer |
#variable_conflict use_variable
begin
return query
select
page.path,
page_section.content,
(page_section.embedding <#> embedding) * -1 as similarity
from page_section
join page
on page_section.page_id = page.id
-- We only care about sections that have a useful amount of content
where length(page_section.content) >= min_content_length
-- The dot product is negative because of a Postgres limitation, so we negate it
and (page_section.embedding <#> embedding) * -1 > match_threshold
-- OpenAI embeddings are normalized to length 1, so
-- cosine similarity and dot product will produce the same results.
-- Using dot product which can be computed slightly faster.
--
-- For the different syntaxes, see https://github.com/pgvector/pgvector
order by page_section.embedding <#> embedding
limit match_count;
end;
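To make the filtering and ordering in the function above easier to follow, here is a minimal in-memory sketch of the same logic in Python. It is not part of the project: the `Section` type, the `match_sections` helper and the sample vectors are hypothetical, and it assumes unit-length embeddings, as the SQL comments state for OpenAI embeddings.

```python
import math
from dataclasses import dataclass


@dataclass
class Section:
    # Hypothetical stand-in for a page_section row joined with its page.
    path: str
    content: str
    embedding: list


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cosine(a, b):
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))


def match_sections(sections, embedding, match_threshold, match_count, min_content_length):
    # Score only the sections that have a useful amount of content.
    scored = [
        (s.path, s.content, dot(s.embedding, embedding))
        for s in sections
        if len(s.content) >= min_content_length
    ]
    # Keep matches above the threshold, most similar first, capped at match_count.
    kept = [row for row in scored if row[2] > match_threshold]
    kept.sort(key=lambda row: row[2], reverse=True)
    return kept[:match_count]


# Made-up two-dimensional unit vectors; real embeddings have far more dimensions.
sections = [
    Section("docs/a.md", "Nitric is a framework for cloud apps.", [1.0, 0.0]),
    Section("docs/b.md", "Buckets store files.", [0.6, 0.8]),
    Section("docs/c.md", "Hi", [1.0, 0.0]),
    Section("docs/d.md", "Unrelated content about something else.", [0.0, 1.0]),
]
query = [1.0, 0.0]
results = match_sections(
    sections, query, match_threshold=0.5, match_count=2, min_content_length=10
)
```

Here `docs/c.md` is dropped for being too short and `docs/d.md` for falling below the threshold. Because every vector has length 1, `cosine` and `dot` give the same score for each pair, which is why the SQL can use the inner product operator.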
You'll need an OpenAI API key, plus a Supabase database URL and access key. Each can be found in its respective portal.
OPENAI_SECRET_KEY=
NEXT_PUBLIC_SUPABASE_URL=
SUPABASE_SECRET_KEY=
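One way to catch a missing key before running the project is to check the three variables up front. The sketch below assumes they are exposed as environment variables; the `missing_env_vars` helper is hypothetical and not part of the project.

```python
import os

# The three variable names listed above.
REQUIRED_VARS = ("OPENAI_SECRET_KEY", "NEXT_PUBLIC_SUPABASE_URL", "SUPABASE_SECRET_KEY")


def missing_env_vars(env=None):
    """Return the required variable names that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]


if __name__ == "__main__":
    missing = missing_env_vars()
    if missing:
        print("Missing configuration: " + ", ".join(missing))
```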
Refer to the README located in the language specific version of this project.
Populate the 'pages' subdirectory with your documentation. We've left our top level docs in there as a sample.
The code splits each page into sections at its `##` headings.
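The heading-based split can be sketched roughly as follows. This is an illustration under assumptions, not the project's implementation: the `split_sections` helper is hypothetical, it treats any text before the first `##` heading as a section with no heading, and it does not skip `##` lines that appear inside fenced code blocks.

```python
import re


def split_sections(markdown):
    """Split a markdown page into (heading, content) pairs at level-2 headings."""
    sections = []
    heading = None
    lines = []

    def flush():
        # Record the section collected so far, skipping an empty preamble.
        if heading is not None or any(line.strip() for line in lines):
            sections.append((heading, "\n".join(lines).strip()))

    for line in markdown.splitlines():
        # Match "## Heading" but not "# Title" or "### Subheading".
        match = re.match(r"^##\s+(.*)$", line)
        if match:
            flush()
            heading = match.group(1).strip()
            lines = []
        else:
            lines.append(line)
    flush()
    return sections


page = "# Title\nIntro text.\n## Install\nRun the installer.\n## Usage\nCall the API.\n"
sections = split_sections(page)
```

Each resulting pair could then be embedded and stored as one `page_section` row, with the heading kept alongside the content.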
Learn
curl --location 'localhost:4001/learn'
Query
curl --location 'localhost:4001/query' \
--header 'Content-Type: text/plain' \
--data '{
"query" : "what is nitric?"
}'
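The same query can be issued from code. The sketch below builds an equivalent request with Python's standard library; the `build_query_request` helper is hypothetical, and the host, port, header and body simply mirror the curl example above. Sending it requires the project to be running locally.

```python
import json
import urllib.request


def build_query_request(question, base_url="http://localhost:4001"):
    """Build a POST request equivalent to the curl query example."""
    body = json.dumps({"query": question}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/query",
        data=body,
        headers={"Content-Type": "text/plain"},
        method="POST",
    )


request = build_query_request("what is nitric?")

# To send it against a running instance:
# with urllib.request.urlopen(request) as response:
#     print(response.read().decode("utf-8"))
```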
Nitric is a framework for rapid development of cloud-native and serverless applications. Define your apps in terms of the resources they need, then write the code for serverless function based APIs, event subscribers and scheduled jobs.
Apps built with Nitric can be deployed to AWS, Azure or Google Cloud all from the same code base so you can focus on your products, not your cloud provider.
Nitric makes it easy to:
- Create smart serverless functions and APIs
- Build reliable distributed apps that use events and/or queues
- Securely store and retrieve secrets
- Read and write files from buckets
The full documentation is available at nitric.io/docs.
We're completely open-source and encourage code contributions.
- Ask questions in GitHub discussions
- Find us on Twitter
- Send us an email
- Jump into our Discord server