This repository has been archived by the owner on Nov 13, 2024. It is now read-only.

Commit

Merge branch 'pinecone-io:main' into qdrant-knowledge-base
Anush008 authored Feb 22, 2024
2 parents 2895775 + 50414f5 commit c10416e
Showing 77 changed files with 2,483 additions and 376 deletions.
6 changes: 6 additions & 0 deletions .github/ISSUE_TEMPLATE/config.yml
@@ -0,0 +1,6 @@
blank_issues_enabled: true
contact_links:
- name: 🤔 Ask a Question
url: 'https://github.com/pinecone-io/canopy/discussions/new?category=q-a'
about: Ask a question about how to use Canopy using GitHub discussions

5 changes: 2 additions & 3 deletions .github/actions/install-deps-and-canopy/action.yml
@@ -9,7 +9,6 @@ inputs:
description: "Whether to install canopy library, or dependencies only"
required: true
default: "true"

runs:
using: "composite"
steps:
@@ -37,8 +36,8 @@ runs:
- name: Install dependencies
shell: bash
if: steps.cached-poetry-dependencies.outputs.cache-hit != 'true'
-run: poetry install --no-interaction --no-root --all-extras --with dev
+run: make install-extras POETRY_INSTALL_ARGS="--no-interaction --no-root --with dev"
- name: Install project
if: ${{ inputs.install-canopy == 'true' }}
shell: bash
-run: poetry install --no-interaction --all-extras --with dev
+run: make install-extras POETRY_INSTALL_ARGS="--with dev --no-interaction"
7 changes: 7 additions & 0 deletions .github/workflows/build-push-image.yml
@@ -48,12 +48,19 @@ jobs:
type=semver,pattern={{version}},enable=${{ github.event_name == 'push' }}
type=raw,value=latest,enable=${{ github.event_name != 'push' }}
type=raw,value=${{inputs.version}},enable=${{ github.event_name != 'push' }}
- name: Create build args
run: |
export POETRY_INSTALL_ARGS="$(make print-var VAR=POETRY_DEFAULT_EXTRAS)"
echo "POETRY_INSTALL_ARGS=$POETRY_INSTALL_ARGS" >> $GITHUB_OUTPUT
id: build-args
- name: Build and push
uses: docker/build-push-action@v5
with:
context: .
platforms: linux/amd64
push: true
build-args: |
POETRY_INSTALL_ARGS=${{steps.build-args.outputs.POETRY_INSTALL_ARGS}}
tags: ${{ steps.meta.outputs.tags }}
labels: ${{ steps.meta.outputs.labels }}
provenance: false
33 changes: 33 additions & 0 deletions CHANGELOG.md
@@ -1,3 +1,36 @@
## [0.8.0] - 2024-02-15
### Breaking changes
* Added support for Pydantic v2 [#288](https://github.com/pinecone-io/canopy/pull/288)

**Full Changelog**: https://github.com/pinecone-io/canopy/compare/v0.7.0...v0.8.0

## [0.7.0] - 2024-02-15
### Breaking changes
* Move config directory to be part of the canopy package [#278](https://github.com/pinecone-io/canopy/pull/278)

### Bug fixes
* Fix building images on release [#252](https://github.com/pinecone-io/canopy/pull/252)
* Exporting the correct module CohereRecordEncoder [#264](https://github.com/pinecone-io/canopy/pull/264) (Thanks @tomaarsen!)
* Fixed GRPC support [#270](https://github.com/pinecone-io/canopy/pull/270)
* Change the minimum version of FastAPI to 0.93.0 [#279](https://github.com/pinecone-io/canopy/pull/279)
* Reduce the docker image size [#277](https://github.com/pinecone-io/canopy/pull/277)

### Added
* Generalize chunk creation [#258](https://github.com/pinecone-io/canopy/pull/258)
* Add SentenceTransformersRecordEncoder [#263](https://github.com/pinecone-io/canopy/pull/263) (Thanks @tomaarsen!)
* Add HybridRecordEncoder [#265](https://github.com/pinecone-io/canopy/pull/265)
* Make transformers optional & allow pinecone-text with dense optional [#266](https://github.com/pinecone-io/canopy/pull/266)
* Add cohere reranker [#269](https://github.com/pinecone-io/canopy/pull/269)
* Add dimension support for OpenAI embeddings [#273](https://github.com/pinecone-io/canopy/pull/273)
* Include config template files inside the package and add a CLI command to dump them [#287](https://github.com/pinecone-io/canopy/pull/287)

### Documentation
* Add contributing guide [#254](https://github.com/pinecone-io/canopy/pull/254)
* Update README [#267](https://github.com/pinecone-io/canopy/pull/267) (Thanks @aulorbe!)
* Fixed typo in dense.py docstring [#280](https://github.com/pinecone-io/canopy/pull/280) (Thanks @ptorru!)

**Full Changelog**: https://github.com/pinecone-io/canopy/compare/v0.6.0...v0.7.0

## [0.6.0] - 2024-01-16
### Breaking changes
* Pinecone serverless support [#246](https://github.com/pinecone-io/canopy/pull/246)
157 changes: 157 additions & 0 deletions CONTRIBUTING.md
@@ -0,0 +1,157 @@
# Contributing to Canopy
Thank you for considering contributing to Canopy! We appreciate the time and effort you put
into making this project better. Following these guidelines will help streamline the process
and make it easier for both contributors and maintainers.


## Issues
If you encounter any [issues](https://github.com/pinecone-io/canopy/issues/new/choose) while using the project, please report them.
Include a detailed description of the problem, steps to reproduce it, and your environment details.

For questions, please use the `Discussions` section rather than opening an issue. This helps keep the issue tracker
focused on bugs and feature requests.

## Feature requests
If you have a feature request, please open an issue and describe the feature you would like to see, using the "Feature request" template.

## Contributing code

It is really simple to get started and create a pull request. Canopy is released regularly, so you should see your
improvements released in a matter of days or weeks 🚀

If this is your first contribution to Canopy, you can start by looking at issues with the
["good first issue"](https://github.com/pinecone-io/canopy/issues?q=is:issue+is:open+label:%22good+first+issue%22)
label on GitHub.
If you find an issue that you'd like to work on, please assign the issue to yourself and leave a comment to let others know that you are working on it. Feel free to use the issue thread to discuss possible designs or approaches.

### Building from source
If you are planning to contribute to Canopy, you will need to create your own fork of the repository.
If you just want to test the code locally, you can clone the repository directly.

1. Fork the repository on GitHub and clone your fork locally.

```bash
# Clone your fork and cd into the repo directory
git clone git@github.com:<your username>/canopy.git
cd canopy
```
2. Install poetry, which is required for dependency management. It is recommended to install poetry in a virtual environment.
You can install poetry using pip
```bash
pip install poetry
```
or using the following command

```bash
# Install poetry
curl -sSL https://install.python-poetry.org | python3 -
```
3. Install the dependencies and dev dependencies
```bash
# Install canopy, dependencies and dev dependencies
poetry install --with dev
```
4. Set up accounts and define environment variables
Please refer to the [README](./README.md#mandatory-environment-variables) for more details.
5. Remember to activate the virtual environment before running any commands
```bash
# Activate the virtual environment
poetry shell
```
or alternatively, you can run the commands directly using `poetry run`
```bash
# Run the command inside the virtual environment
poetry run <command>
```
#### Optional - installing extra dependencies
Canopy has a few optional dependencies, mostly for additional service providers. If you want to use Canopy with any of these providers, please make sure to install the relevant extra. For example, to use Canopy with Cohere, you should install with:
```bash
# Install canopy, with the cohere extra
poetry install --with dev --extras cohere
```

### Running tests
Canopy uses unit tests, system tests and end-to-end tests. Unit tests verify the functionality of each code module, without any external dependencies. System tests verify integration with services like Pinecone DB and OpenAI API. End-to-End tests verify the functionality of the entire Canopy server.
System and end-to-end tests require valid API keys for Pinecone and OpenAI. Some optional providers require additional environment variables; tests for those providers are skipped when the variables are not set.
You can create a single `.env` file in the root directory of the repository and set all the environment variables there.
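
Before running system or end-to-end tests, it can help to confirm that the required variables are actually visible to the process. The snippet below is a minimal sketch, not part of Canopy: the helper `missing_env_vars` is hypothetical, and the variable names `PINECONE_API_KEY` and `OPENAI_API_KEY` are assumptions that should be checked against the README. It only inspects the process environment, so values kept in a `.env` file need to be loaded first.

```python
import os
from typing import Iterable, List, Mapping, Optional

# Assumed variable names; the README holds the authoritative list.
REQUIRED_ENV_VARS = ("PINECONE_API_KEY", "OPENAI_API_KEY")


def missing_env_vars(
    required: Iterable[str] = REQUIRED_ENV_VARS,
    environ: Optional[Mapping[str, str]] = None,
) -> List[str]:
    """Returns the names in `required` that are unset or empty.

    Args:
        required: Names of the environment variables to check.
        environ: Mapping to inspect. Defaults to `os.environ`.

    Returns:
        The missing names, in the order they appear in `required`.
    """
    env = os.environ if environ is None else environ
    return [name for name in required if not env.get(name)]


if __name__ == "__main__":
    missing = missing_env_vars()
    if missing:
        print("Missing environment variables: " + ", ".join(missing))
    else:
        print("All required environment variables are set.")
```

Passing `environ` explicitly keeps the helper easy to unit test without touching the real environment.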

To run all tests, run the following command:
```bash
# Run all tests
poetry run pytest tests/
```
You can also run only one type of test using the following commands:
```bash
# Run unit tests
poetry run pytest tests/unit
# Run system tests
poetry run pytest tests/system
# Run end-to-end tests
poetry run pytest tests/e2e
```

### Check out a new branch and make your changes
Create a new branch for your changes.

```bash
# Checkout a new branch and make your changes
git checkout -b my-new-feature-branch
# Make your changes...
```

### Document your changes
When contributing to Canopy, please make sure that all code is well documented.

The following should be documented using properly formatted docstrings:

- Modules
- Class definitions
- Function definitions
- Module-level variables

Canopy uses [Google-style docstrings](https://google.github.io/styleguide/pyguide.html#38-comments-and-docstrings) formatted
according to [PEP 257](https://www.python.org/dev/peps/pep-0257/) guidelines.
(See [Example Google Style Python Docstrings](https://sphinxcontrib-napoleon.readthedocs.io/en/latest/example_google.html)
for further examples.)
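
As an illustration of the layout described above, here is a small sketch with a module docstring, a documented module-level variable, a class, and two functions. The module, `TextSplitter`, and `DEFAULT_MAX_CHARS` are hypothetical names invented for this example and do not exist in Canopy. Whether `pydoclint` accepts this exact layout depends on the project's lint configuration, so treat it as a starting point rather than a reference.

```python
"""Toy text-splitting helpers, used only to illustrate docstring layout.

Attributes:
    DEFAULT_MAX_CHARS (int): Default upper bound on the length of a piece.
"""
from typing import List

DEFAULT_MAX_CHARS = 20


class TextSplitter:
    """Splits text into consecutive pieces of bounded length.

    Attributes:
        max_chars (int): Maximum number of characters per piece.
    """

    def __init__(self, max_chars: int = DEFAULT_MAX_CHARS) -> None:
        """Initializes the splitter.

        Args:
            max_chars: Maximum number of characters per piece.

        Raises:
            ValueError: If `max_chars` is not positive.
        """
        if max_chars <= 0:
            raise ValueError("max_chars must be positive")
        self.max_chars = max_chars

    def split(self, text: str) -> List[str]:
        """Splits `text` into pieces of at most `max_chars` characters.

        Args:
            text: The text to split.

        Returns:
            The pieces in their original order. Joining them reproduces `text`.
        """
        step = self.max_chars
        return [text[i:i + step] for i in range(0, len(text), step)]
```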

[pydoclint](https://github.com/jsh9/pydoclint) is used for linting docstrings. You can run `make lint` to check your docstrings.

If you are making changes to the public API, please update the documentation in the README.md file.

### Add the relevant tests
All code changes to Canopy need to be covered by tests. After making your changes, make sure to add relevant unit tests.
Tests that require external integration (e.g. communication with an API or service) should be placed under the `tests/system/` directory.

Please make an effort to avoid code duplication. Some unit tests have a common base class that can be extended. Other tests use fixtures to parameterize test cases over several subclasses. Instead of copy-pasting other test cases, try to utilize these mechanisms as much as possible.
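
The shared-base-class idea can be sketched as follows. This is a hypothetical example and not one of Canopy's actual test classes: `BaseTokenizerTest` and the two toy tokenizers are invented for illustration. It relies on pytest's default discovery rule, which collects classes whose names start with `Test`, so the base class is not collected while each subclass inherits and runs every shared test case.

```python
class BaseTokenizerTest:
    """Shared test cases for any tokenizer (hypothetical example)."""

    def make_tokenizer(self):
        """Returns a callable mapping a string to a list of tokens."""
        raise NotImplementedError

    def test_empty_text_has_no_tokens(self):
        assert self.make_tokenizer()("") == []

    def test_tokens_rebuild_text(self):
        text = "hello canopy"
        tokens = self.make_tokenizer()(text)
        # Ignore whitespace so the same check fits both tokenizers.
        assert "".join(tokens).replace(" ", "") == text.replace(" ", "")


class TestWhitespaceTokenizer(BaseTokenizerTest):
    def make_tokenizer(self):
        # Splits on runs of whitespace.
        return str.split


class TestCharTokenizer(BaseTokenizerTest):
    def make_tokenizer(self):
        # Splits into individual characters.
        return list
```

Adding a test case to the base class makes it run for every subclass, which avoids copying the same assertions into each test module.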

### Run linting, static type checking and unit tests
Run the following to make sure everything is working as expected:

```bash
# Run unit tests
make test-unit
# If you don't have make installed, you can run the following command instead
poetry run pytest tests/unit
# Lint the code
make lint
# Or alternatively
poetry run flake8 .
# Run static type checking
make static
# Or
poetry run mypy src
```
(The Makefile has a few more sub-commands that you might find useful. Run `make help` to see all the options.)

### Commit your changes, push to GitHub, and open a Pull Request

Commit your changes, push your branch to GitHub, then use GitHub's website to create a pull request.
Please follow the pull request template and fill in as much information as possible. Link to any relevant issues and include a description of your changes.
16 changes: 9 additions & 7 deletions Dockerfile
@@ -3,6 +3,7 @@

ARG PYTHON_VERSION=3.11.7
ARG PORT=8000
ARG POETRY_INSTALL_ARGS=""
################################
# PYTHON-BASE
# Sets up all our shared environment variables
@@ -63,9 +64,10 @@ WORKDIR /app
COPY pyproject.toml ./
RUN poetry lock

ARG POETRY_INSTALL_ARGS
# install runtime deps to VIRTUAL_ENV
RUN --mount=type=cache,target=/root/.cache \
-poetry install --no-root --all-extras --only main
+poetry install --no-root --only main $POETRY_INSTALL_ARGS


################################
@@ -78,13 +80,13 @@ WORKDIR /app
COPY --from=builder-base /app/pyproject.toml pyproject.toml
COPY --from=builder-base /app/poetry.lock poetry.lock


ARG POETRY_INSTALL_ARGS
# quicker install as runtime deps are already installed
RUN --mount=type=cache,target=/root/.cache \
-poetry install --no-root --all-extras --with dev
+poetry install --no-root --with dev $POETRY_INSTALL_ARGS

COPY . .
-RUN poetry install --all-extras --only-root
+RUN poetry install --only-root $POETRY_INSTALL_ARGS

ARG PORT
EXPOSE $PORT
@@ -101,7 +103,7 @@ FROM python-base as production
ENV WORKER_COUNT=1

LABEL org.opencontainers.image.source="https://github.com/pinecone-io/canopy"
-LABEL org.opencontainers.image.description="Image containing the canopy server."
+LABEL org.opencontainers.image.description="Retrieval Augmented Generation (RAG) framework and context engine powered by Pinecone"
LABEL org.opencontainers.image.licenses="Apache-2.0"

RUN DEBIAN_FRONTEND=noninteractive apt-get update && \
@@ -119,9 +121,9 @@ COPY --from=builder-base /app/pyproject.toml pyproject.toml
COPY --from=builder-base /app/poetry.lock poetry.lock

COPY src/ src/
-COPY config/ config/
RUN touch README.md
-RUN poetry install --all-extras --only-root
+ARG POETRY_INSTALL_ARGS
+RUN poetry install --only-root $POETRY_INSTALL_ARGS

ARG PORT
EXPOSE $PORT