
Valquery

The Generation-Assisted Retrieval (GAR) tool for analytics teams.

It currently consumes dbt repositories (GitLab or GitHub) and Notion databases, searching each for natural language references to data resources in Snowflake. In dbt repos, these References can come from:

  • Code comments
  • Commit messages

In Notion, a Reference can be anything in a document.

When it finds one, it saves the Reference to a vector database, with the table name(s) stored in the metadata along with other useful information.

When the end user has a question, they can use the Next.js application to ask it. The question is used to run a vector search against all of those References, which are aggregated down into the most relevant data resources, or Results. We then pull some sample data from Snowflake for each of these and, along with the original question, feed them into an LLM. If the LLM thinks the information provided is enough to answer the question, we return the Results along with an explanation. If not, the LLM formulates a follow-up question that goes back into Valquery, and this continues until we have found enough data to answer the question. The LLM can also explain how to use an individual Result to answer the question.
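
To make that flow concrete, here is a minimal sketch of the loop. All the helper names (vectorSearch, aggregateToResults, sampleFromSnowflake, askLlm) are hypothetical stand-ins for Valquery's internals, not its actual API:

```typescript
// Hypothetical types and stubs -- not Valquery's real internals.
interface Reference {
  text: string;      // the natural language snippet that was indexed
  tables: string[];  // table name(s) stored in the vector DB metadata
}

interface Result {
  table: string;
  references: Reference[];
  sampleRows: unknown[];
}

type Verdict =
  | { done: true; explanation: string }
  | { done: false; followUp: string };

// Stubs standing in for the vector DB, Snowflake, and LLM calls.
async function vectorSearch(query: string): Promise<Reference[]> { return []; }
function aggregateToResults(refs: Reference[]): Result[] { return []; }
async function sampleFromSnowflake(table: string): Promise<unknown[]> { return []; }
async function askLlm(question: string, results: Result[]): Promise<Verdict> {
  return { done: true, explanation: "stub" };
}

async function answer(question: string, maxRounds = 3) {
  const results: Result[] = [];
  let query = question;
  for (let round = 0; round < maxRounds; round++) {
    // 1. Vector-search the stored References with the current query.
    const refs = await vectorSearch(query);
    // 2. Aggregate the hits down into the most relevant Results.
    const fresh = aggregateToResults(refs);
    // 3. Attach sample data from Snowflake to each Result.
    for (const r of fresh) r.sampleRows = await sampleFromSnowflake(r.table);
    results.push(...fresh);
    // 4. Ask the LLM whether the gathered data can answer the question.
    const verdict = await askLlm(question, results);
    if (verdict.done) return { results, explanation: verdict.explanation };
    // 5. If not, feed its follow-up question back into the search.
    query = verdict.followUp;
  }
  return { results, explanation: "Stopped before finding enough data." };
}
```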

GAR?

Valquery is first and foremost a search tool. It provides LLM-generated explanations of the results it returns, but ultimately the LLM works in service of search, rather than the other way around. Thus, not RAG, but GAR.

Getting Started

Prereqs

Reference sources (you'll need at least one of these):

  • GitHub
  • GitLab
  • Notion

Result source:

  • Snowflake

Vector DB:

  • Pinecone

LLM:

  • OpenAI GPT-3.5-Turbo. Moving to OpenRouter is high on the to-do list so people can easily use whatever model they want.

You'll need a Postgres DB and a Redis DB as well, for table name lookups and queueing respectively.

Instructions

  1. Spin up the site somewhere; locally works fine.
  2. Copy .env.template, rename it, and fill it in with your credentials (an illustrative sketch follows this list).
  3. Use the Snowflake endpoints to populate the Postgres database with the relevant information about your warehouse (example calls after the list).
  4. Use the intake endpoints to populate the Pinecone database:
    • getGitHub
    • getGitLab
    • getNotion
  5. Refer to the Postman collection for the API documentation.
  6. Run the app and do some querying! I removed all of the auth from the SaaS version, but I might add it back in to make it easier for people to host. As it is, I would either run it locally or in an environment that's already behind auth.
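
For step 2, the variable names below are illustrative guesses based on the services listed in the prereqs; .env.template has the actual keys:

```
# Hypothetical variable names -- .env.template has the real keys
SNOWFLAKE_ACCOUNT=...
SNOWFLAKE_USER=...
SNOWFLAKE_PASSWORD=...
PINECONE_API_KEY=...
OPENAI_API_KEY=...
POSTGRES_URL=postgres://user:pass@localhost:5432/valquery
REDIS_URL=redis://localhost:6379
GITHUB_TOKEN=...     # only for the Reference sources you actually use
GITLAB_TOKEN=...
NOTION_API_KEY=...
```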
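
For steps 3 and 4, the calls might look something like this; the intake endpoint names come from the list above, but the paths, methods, and payloads are assumptions, so check the Postman collection for the real shapes:

```
# Hypothetical paths -- the Postman collection documents the real ones
curl -X POST http://localhost:3000/api/snowflake/populate   # step 3: warehouse metadata into Postgres
curl -X POST http://localhost:3000/api/intake/getGitHub     # step 4: index References
curl -X POST http://localhost:3000/api/intake/getGitLab
curl -X POST http://localhost:3000/api/intake/getNotion
```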
