ADR: Vector store for RAG #168

Merged (2 commits), Jan 9, 2025
7 changes: 7 additions & 0 deletions .gitignore
@@ -5,3 +5,10 @@ dictionary.dic

# python virtualenv
venv

# Emacs
*~
\#*\#
.\#*
.projectile
.dir-locals.el
1 change: 1 addition & 0 deletions .spellcheck-en-custom.txt
@@ -116,6 +116,7 @@ Kumar
Langchain
Langgraph
leaderboard
lifecycle
lignment
LLM
LLMs
31 changes: 31 additions & 0 deletions docs/rag/adrs/README.md
@@ -0,0 +1,31 @@
# Architecture Decision Records

An Architecture Decision Record (ADR) is a lightweight record format intended to capture a single architecturally important decision. ADRs are meant to be easy to write - 10 minutes or less. They should be stored in the codebase they affect, go through peer review, and have a commit history.

This simple format, which is described below, has a surprising number of functions:

* **Decision-making process**: By going through peer review, an ADR includes the entire team and gives all perspectives a chance to be heard. There is a clear decision-making process with a clear lifecycle - once an ADR meets whatever approval criteria the team chooses, it is merged and the decision is made. If new information comes to light that causes the team to reconsider the decision, that is simply a new ADR.
* **Institutional knowledge and transparency**: Not everyone will comment on every ADR, but the transparency of the mechanism should serve to keep everyone informed and encode tribal knowledge into writing. This also builds resilience - there should ideally never be decision making that is blocked by someone being sick or on vacation. The team should always be able to make significant decisions.
* **Distribute design authority**: As a team becomes familiar and comfortable with the ADR mechanism, every team member has an equal tool to bring design decisions to the team. This encourages autonomy, accountability, and ownership.
* **Onboarding and training material**: Because ADRs are easy to write and writing them becomes a habit, new team members can onboard by simply reading the record of ADRs.
* **Knowledge sharing**: The peer review phase allows sharing of expertise between team members.
* **Fewer meetings**: As decision making becomes asynchronous and as the team forms its social norms around the process, there should be less time required in meetings.

## When to write an ADR

* A decision is being made that required discussion between two or more people.
* A decision is being made that required significant investigation.
* A decision is being proposed for feedback / discussion.
* A decision is being proposed that affects multiple teams.

## Template

See [template.md](template.md).

## Related Reading

* [Suggestions for writing good ADRs](https://github.com/joelparkerhenderson/architecture-decision-record?tab=readme-ov-file#suggestions-for-writing-good-adrs)
* [ADRs at Red Hat](https://www.redhat.com/architect/architecture-decision-records)
* [ADRs at Amazon](https://docs.aws.amazon.com/prescriptive-guidance/latest/architectural-decision-records/adr-process.html)
* [ADR organization on GitHub](https://adr.github.io/)
* [ADRs at Google](https://cloud.google.com/architecture/architecture-decision-records)
26 changes: 26 additions & 0 deletions docs/rag/adrs/adr-vectordb.md
@@ -0,0 +1,26 @@
# Initial InstructLab Vector Store

## Context

One of the first choices to make in implementing RAG is which initial vector store to develop against. Though frameworks like LangChain or Haystack make it easy to swap vector databases, we need a working end-to-end implementation for RAG that is tested against and available to install with InstructLab. There are many options (see [Haystack's guide to choosing a document store](https://docs.haystack.deepset.ai/docs/choosing-a-document-store)).

Our main long-term requirements are that our chosen store support fully developed document updates (and thus some notion of a primary key), that it be scalable to cluster size, and that it have a permissive license (Apache, MIT, or similar). Among the available choices, [Milvus](https://milvus.io/) provides a strategic advantage due to its [investment from watsonx](https://www.ibm.com/new/announcements/ibm-watsonx-data-vector-database-ai-ready-data-management).

Milvus can be used in-process ([Milvus Lite](https://milvus.io/docs/milvus_lite.md)), single-node ([Milvus](https://milvus.io/docs/prerequisite-docker.md)), or cluster-scale ([Milvus Distributed](https://milvus.io/docs/prerequisite-helm.md)).
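
As a rough illustration of the three deployment modes, the sketch below maps each mode to the kind of connection URI it would typically take. It assumes the `pymilvus` package and its `MilvusClient` class, where a local file path selects Milvus Lite and an `http://` address selects a running server. The `DeploymentMode` enum, the file name, and the host names are hypothetical placeholders, not values from this ADR.

```python
from enum import Enum


class DeploymentMode(Enum):
    """Hypothetical names for the three ways Milvus can be run."""
    LITE = "lite"                # in-process, data kept in a local file
    STANDALONE = "standalone"    # single-node server
    DISTRIBUTED = "distributed"  # cluster


# Placeholder URIs for illustration only; real deployments will differ.
_EXAMPLE_URIS = {
    DeploymentMode.LITE: "./instructlab_rag.db",
    DeploymentMode.STANDALONE: "http://localhost:19530",
    DeploymentMode.DISTRIBUTED: "http://milvus.example.internal:19530",
}


def milvus_uri(mode: DeploymentMode) -> str:
    """Return the example connection URI for the given deployment mode."""
    return _EXAMPLE_URIS[mode]


def connect(mode: DeploymentMode):
    """Open a client for the given mode (assumes pymilvus is installed)."""
    # Imported lazily so this module loads even without pymilvus.
    from pymilvus import MilvusClient

    return MilvusClient(uri=milvus_uri(mode))
```

Under these assumptions, moving from the laptop case to a server changes only the URI passed to the client, while the calling code stays the same.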

## Decision

InstructLab will initially integrate with and use Milvus Lite for vector storage and retrieval-augmented generation.

## Status

Accepted

## Consequences

* Users will have a clear [upgrade path](https://milvus.io/docs/upgrade_milvus_cluster-operator.md) from the laptop use case to cluster scale.
* We should have access to Milvus expertise via IBM.
* The laptop use case of InstructLab will have a minimally resource-intensive option for prototyping.
* Since Milvus is used in watsonx, we can have confidence that it can meet expected scaling requirements.
* Document updates can be accommodated using well-established [primary key functionality](https://milvus.io/docs/primary-field.md) and [partition key](https://milvus.io/docs/use-partition-key.md).
* There is a risk that developing against a mature vector store leads us to rely on functionality that is not available in another vector store that a potential customer requires us to use.
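
The document-update consequence above relies on primary-key semantics: writing a record whose key already exists replaces the old record instead of adding a duplicate. The sketch below shows that behavior with a plain in-memory dictionary. It is not Milvus code, and `Chunk`, `InMemoryStore`, `upsert`, and the field names are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Chunk:
    doc_id: str          # plays the role of the primary key
    text: str
    vector: List[float]  # embedding of the text


@dataclass
class InMemoryStore:
    """Toy store illustrating update-by-primary-key, not a real vector DB."""
    _rows: Dict[str, Chunk] = field(default_factory=dict)

    def upsert(self, chunk: Chunk) -> None:
        # Insert a new row, or overwrite the row with the same primary key.
        self._rows[chunk.doc_id] = chunk

    def get(self, doc_id: str) -> Optional[Chunk]:
        # Return the stored chunk, or None if the key is unknown.
        return self._rows.get(doc_id)

    def __len__(self) -> int:
        return len(self._rows)


store = InMemoryStore()
store.upsert(Chunk("doc-1", "first draft", [0.1, 0.2]))
store.upsert(Chunk("doc-1", "revised text", [0.3, 0.4]))  # same key: replaces
store.upsert(Chunk("doc-2", "another document", [0.5, 0.6]))
```

After these calls the store holds two rows, and `doc-1` carries the revised text. A store without a primary key would instead need a delete-then-insert step to avoid stale duplicates.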
17 changes: 17 additions & 0 deletions docs/rag/adrs/template.md
@@ -0,0 +1,17 @@
# Succinct title

## Context

_What is the context of this decision? What are the technical, social, and political factors? For example, the decision to use a particular library might be simply because most of the team is familiar with it; that is a social context. A political factor might be influences from other teams or executive decisions._

## Decision

_A single decision statement, written in active voice, stated in a single sentence._

## Status

[Proposed | Accepted | Rejected]

## Consequences

_A bulleted list; this might be the most important section. What are the consequences of this decision? Does it introduce design constraints into a codebase? Does it require further decisions or investigations to be made? Will it require training/onboarding for team members? Does it impact performance? What about cost? Does it impact development processes? What else? As a rule of thumb, there should usually be 4-6 identified consequences._