# Replace black with ruff linter / formatter (#392)
## Problem

While researching how to integrate some Rust extensions into our Python code,
I came across Ruff, an alternative tool for code formatting that also does
linting. It is much faster than black and can also do code-style
linting to enforce some best practices for Python.

Ruff is built by [Astral](https://docs.astral.sh/), a
company aiming to create a "cargo for Python" type of experience:
Python tooling implemented in Rust, making it both fast and
reliable. I'm interested in some of their other projects (e.g. uv for
dependency management), but this seemed like a lower-stakes way to test
the waters with their tooling.

## Solution

The main changes are:
- Dev dependency changes: add ruff, remove black
- Remove the black configs from pyproject.toml
- Add ruff configs to pyproject.toml. I mostly stuck with the defaults,
although I disabled a few of the more annoying lint rules for now on a
per-file basis
- Adjust CI to run ruff checks instead of black
- Update CONTRIBUTING.md

The rest of this large diff is due to formatting and lint fixes for
various code-style issues.
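
For reference, the new ruff configuration in `pyproject.toml` takes roughly this shape (a sketch only: the rule codes, line length, and file globs below are illustrative assumptions, not necessarily what this PR uses):

```toml
[tool.ruff]
line-length = 100  # illustrative value; ruff's default is 88

[tool.ruff.lint.per-file-ignores]
# Hypothetical example of disabling a noisier rule on a per-file basis
"tests/*" = ["E501"]  # E501: line-too-long
```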

These checks can be run manually with `poetry run ruff check --fix` and
`poetry run ruff format`, but otherwise they should be triggered automatically
by pre-commit hooks. I documented this in CONTRIBUTING.md.

## Type of Change

- [x] Infrastructure change (CI configs, etc)

## Test Plan

Tests should still be green. No functional impact expected.
jhamon authored Sep 25, 2024
1 parent a41b9f8 commit 587a5d5
Showing 84 changed files with 1,109 additions and 832 deletions.
2 changes: 1 addition & 1 deletion .github/workflows/lint.yaml
@@ -6,4 +6,4 @@ jobs:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v3
- uses: psf/black@stable
- uses: chartboost/ruff-action@v1
12 changes: 8 additions & 4 deletions .pre-commit-config.yaml
@@ -6,8 +6,12 @@ repos:
- id: end-of-file-fixer
- id: check-yaml
- id: check-added-large-files
- repo: https://github.com/psf/black-pre-commit-mirror
rev: 24.4.2
- repo: https://github.com/astral-sh/ruff-pre-commit
# Ruff version.
rev: v0.6.7
hooks:
- id: black
language_version: python3.12
# Run the linter.
- id: ruff
args: [ --fix ]
# Run the formatter.
- id: ruff-format
135 changes: 78 additions & 57 deletions CONTRIBUTING.md
@@ -1,4 +1,4 @@
# Contributing

## Installing development versions

@@ -17,45 +17,45 @@ poetry add git+https://github.com/pinecone-io/pinecone-python-client.git@44fc7ed
```


## Developing locally with Poetry

[Poetry](https://python-poetry.org/) is a tool that combines [virtualenv](https://virtualenv.pypa.io/en/latest/) usage with dependency management, to provide a consistent experience for project maintainers and contributors who need to develop the pinecone-python-client
as a library.

A common need when making changes to the Pinecone client is to test your changes against existing Python code or Jupyter Notebooks that `pip install` the Pinecone Python client as a library.

Developers want to be able to see their changes to the library immediately reflected in their main application code, as well as to track all changes they make in git, so that they can be contributed back in the form of a pull request.

The Pinecone Python client therefore supports Poetry as its primary means of enabling a consistent local development experience. This guide will walk you through the setup process so that you can:
1. Make local changes to the Pinecone Python client that are separated from your system's Python installation
2. Make local changes to the Pinecone Python client that are immediately reflected in other local code that imports the pinecone client
3. Track all your local changes to the Pinecone Python client so that you can contribute your fixes and feature additions back via GitHub pull requests

### Step 1. Fork the Pinecone python client repository

On the [GitHub repository page](https://github.com/pinecone-io/pinecone-python-client), click the fork button at the top of the screen and create a personal fork of the repository:

![Create a GitHub fork of the Pinecone Python client](./docs/pinecone-python-client-fork.png)

It will take a few seconds for your fork to be ready. When it's ready, **clone your fork** of the Pinecone python client repository to your machine.

Change directory into the repository, as we'll be setting up a virtualenv from within the root of the repository.

### Step 2. Install Poetry

Visit [the Poetry site](https://python-poetry.org/) for installation instructions.

### Step 3. Install dependencies

Run `poetry install` from the root of the project.

### Step 4. Activate the Poetry virtual environment and verify success

Run `poetry shell` from the root of the project. At this point, you now have a virtualenv set up in this directory, which you can verify by running:

`poetry env info`

You should see something similar to the following output:

```bash
Virtualenv
@@ -73,17 +73,61 @@ Path: /home/linuxbrew/.linuxbrew/opt/[email protected]
```
If you want to extract only the path to your new virtualenv, you can run `poetry env info --path`.

### Step 5. Enable pre-commit hooks

Run `poetry run pre-commit install` to enable checks that run when you commit, so you don't find out during your CI run that minor lint issues need to be addressed.
## Common tasks

### Running tests

- Unit tests: `make test-unit`
- Integration tests: `PINECONE_API_KEY="YOUR API KEY" make test-integration`
- Run the tests in a single file: `poetry run pytest tests/unit/data/test_bulk_import.py -s -vv`

### Running the ruff linter / formatter

These should trigger automatically if you have enabled pre-commit hooks with `poetry run pre-commit install`, but you can also run them manually:

```sh
poetry run ruff check --fix # lint rules
poetry run ruff format # formatting
```

If you want to adjust the behavior of ruff, configurations are in `pyproject.toml`.


### Consuming API version upgrades

These instructions can only be followed by Pinecone employees with access to our private APIs repository.

Prerequisites:
- You must be an employee with access to private Pinecone repositories
- You must have [Docker Desktop](https://www.docker.com/products/docker-desktop/) installed and running. Our code generation script uses a dockerized version of the OpenAPI CLI.
- You must have initialized the git submodules under codegen

```sh
git submodule
```

To regenerate the generated portions of the client with the latest version of the API specifications, you need to have Docker Desktop running on your local machine.

```sh
./codegen/
```


## Loading your virtualenv in another shell

It's a common need when developing against this client to load it as part of some other application or Jupyter Notebook code, modify
it directly, see your changes reflected immediately and also have your changes tracked in git so you can contribute them back.

It's important to understand that, by default, if you open a new shell or terminal window, or, for example, a new pane in a tmux session,
your new shell will not yet reference the new virtualenv you created in the previous step.

### Step 1. Get the path to your virtualenv

We're going to first get the path to the virtualenv we just created, by running:

```bash
poetry env info --path
@@ -93,75 +137,52 @@ You'll get a path similar to this one: `/home/youruser/.cache/pypoetry/virtuale

### Step 2. Load your existing virtualenv in your new shell

Within this path is a shell script that lives at `<your-virtualenv-path>/bin/activate`. Importantly, you cannot simply run this script; you must instead source it like so:

```bash
source /home/youruser/.cache/pypoetry/virtualenvs/pinecone-fWu70vbC-py3.9/bin/activate
```
In the above example, ensure you're using your own virtualenv path as returned by `poetry env info --path`.

### Step 3. Test out your virtualenv

Now, we can test that our virtualenv is working properly by adding a new test module and function to the `pinecone` client within our virtualenv and running it from the second shell.

#### Create a new test file in pinecone-python-client
In the root of your working directory of the `pinecone-python-client` where you first ran `poetry shell`, add a new file named `hello_virtualenv.py` under the `pinecone` folder.

In that file write the following:

```python
def hello():
    print("Hello, from your virtualenv!")
```
Save the file.

#### Create a new test file in your second shell
This step demonstrates how you can immediately test your latest Pinecone client code from any local Python application or Jupyter Notebook:

In your second shell, where you ran `source` to load your virtualenv, create a python file named `test.py` and write the following:

```python
from pinecone import hello_virtualenv

hello_virtualenv.hello()
```

Save the file. Run it with your Python binary. Depending on your system, this may either be `python` or `python3`:

```bash
python3 test.py
```

You should see the following output:

```bash
❯ python3 test.py
Hello, from your virtualenv!
```

If you experience any issues please [file a new issue](https://github.com/pinecone-io/pinecone-python-client/issues/new).


12 changes: 6 additions & 6 deletions codegen/build-oas.sh
@@ -71,7 +71,7 @@ generate_client() {

oas_file="codegen/apis/_build/${version}/${module_name}_${version}.oas.yaml"
package_name="pinecone.${py_module_name}.openapi.${module_name}"

verify_file_exists $oas_file
verify_directory_exists $template_dir

@@ -106,9 +106,9 @@ extract_shared_classes() {
# Define the list of shared source files
sharedFiles=(
"api_client"
"configuration"
"exceptions"
"model_utils"
"rest"
)

@@ -127,7 +127,7 @@ extract_shared_classes() {
done
done

# Remove the docstring headers that aren't really correct in the
# context of this new shared package structure
find "$target_directory" -name "*.py" -print0 | xargs -0 -I {} sh -c 'sed -i "" "/^\"\"\"/,/^\"\"\"/d" "{}"'

@@ -166,4 +166,4 @@ done
extract_shared_classes

# Format generated files
poetry run black "${destination}"
poetry run ruff format "${destination}"
2 changes: 1 addition & 1 deletion pinecone/config/__init__.py
@@ -4,5 +4,5 @@
from .config import ConfigBuilder, Config
from .pinecone_config import PineconeConfig

if os.getenv("PINECONE_DEBUG") != None:
if os.getenv("PINECONE_DEBUG") is not None:
logging.basicConfig(level=logging.DEBUG)
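
The `!= None` → `is not None` fix above is typical of the lint changes in this diff: ruff's pycodestyle rule E711 flags comparisons to None. A minimal sketch of the idiom, using a hypothetical `debug_enabled` helper:

```python
def debug_enabled(env: dict) -> bool:
    # E711: compare to None with `is` / `is not`, never `==` / `!=`.
    # Note: an empty string still counts as "set", so a plain truthiness
    # check would behave differently here.
    return env.get("PINECONE_DEBUG") is not None

print(debug_enabled({"PINECONE_DEBUG": "1"}))  # True
print(debug_enabled({"PINECONE_DEBUG": ""}))   # True: set, but empty
print(debug_enabled({}))                       # False
```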
14 changes: 4 additions & 10 deletions pinecone/config/config.py
@@ -3,9 +3,7 @@

from pinecone.exceptions.exceptions import PineconeConfigurationError
from pinecone.config.openapi import OpenApiConfigFactory
from pinecone.core.openapi.shared.configuration import (
Configuration as OpenApiConfiguration,
)
from pinecone.core.openapi.shared.configuration import Configuration as OpenApiConfiguration
from pinecone.utils import normalize_host
from pinecone.utils.constants import SOURCE_TAG

@@ -72,15 +70,11 @@ def build(

@staticmethod
def build_openapi_config(
config: Config,
openapi_config: Optional[OpenApiConfiguration] = None,
**kwargs,
config: Config, openapi_config: Optional[OpenApiConfiguration] = None, **kwargs
) -> OpenApiConfiguration:
if openapi_config:
openapi_config = OpenApiConfigFactory.copy(
openapi_config=openapi_config,
api_key=config.api_key,
host=config.host,
openapi_config=openapi_config, api_key=config.api_key, host=config.host
)
elif openapi_config is None:
openapi_config = OpenApiConfigFactory.build(api_key=config.api_key, host=config.host)
@@ -95,7 +89,7 @@ def build_openapi_config(
openapi_config.proxy_headers = config.proxy_headers
if config.ssl_ca_certs:
openapi_config.ssl_ca_cert = config.ssl_ca_certs
if config.ssl_verify != None:
if config.ssl_verify is not None:
openapi_config.verify_ssl = config.ssl_verify

return openapi_config
16 changes: 5 additions & 11 deletions pinecone/config/openapi.py
@@ -7,9 +7,7 @@

from urllib3.connection import HTTPConnection

from pinecone.core.openapi.shared.configuration import (
Configuration as OpenApiConfiguration,
)
from pinecone.core.openapi.shared.configuration import Configuration as OpenApiConfiguration

TCP_KEEPINTVL = 60 # Sec
TCP_KEEPIDLE = 300 # Sec
@@ -29,7 +27,9 @@ def build(cls, api_key: str, host: Optional[str] = None, **kwargs):
return openapi_config

@classmethod
def copy(cls, openapi_config: OpenApiConfiguration, api_key: str, host: str) -> OpenApiConfiguration:
def copy(
cls, openapi_config: OpenApiConfiguration, api_key: str, host: str
) -> OpenApiConfiguration:
"""
Copy a user-supplied openapi configuration and update it with the user's api key and host.
If they have not specified other socket configuration, we will use the default values.
@@ -88,13 +88,7 @@ def _get_socket_options(
and hasattr(socket, "TCP_KEEPCNT")
):
socket_params += [(socket.IPPROTO_TCP, socket.TCP_KEEPIDLE, keep_alive_idle_sec)]
socket_params += [
(
socket.IPPROTO_TCP,
socket.TCP_KEEPINTVL,
keep_alive_interval_sec,
)
]
socket_params += [(socket.IPPROTO_TCP, socket.TCP_KEEPINTVL, keep_alive_interval_sec)]
socket_params += [(socket.IPPROTO_TCP, socket.TCP_KEEPCNT, keep_alive_tries)]

# TCP Keep Alive Probes for Windows OS
12 changes: 7 additions & 5 deletions pinecone/config/pinecone_config.py
@@ -17,7 +17,12 @@ def build(
additional_headers: Optional[Dict[str, str]] = {},
**kwargs,
) -> Config:
host = host or kwargs.get("host") or os.getenv("PINECONE_CONTROLLER_HOST") or DEFAULT_CONTROLLER_HOST
host = (
host
or kwargs.get("host")
or os.getenv("PINECONE_CONTROLLER_HOST")
or DEFAULT_CONTROLLER_HOST
)
headers_json = os.getenv("PINECONE_ADDITIONAL_HEADERS")
if headers_json:
try:
@@ -27,8 +32,5 @@
logger.warn(f"Ignoring PINECONE_ADDITIONAL_HEADERS: {e}")

return ConfigBuilder.build(
api_key=api_key,
host=host,
additional_headers=additional_headers,
**kwargs,
api_key=api_key, host=host, additional_headers=additional_headers, **kwargs
)
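
The reflowed `host = (...)` expression above is a standard fall-through chain: the first truthy value wins. A standalone sketch (the default host URL below is an assumption for illustration, not taken from this diff):

```python
import os

DEFAULT_CONTROLLER_HOST = "https://api.pinecone.io"  # assumed value for illustration

def resolve_host(host=None, **kwargs):
    # Precedence: explicit argument, then kwargs, then env var, then default.
    return (
        host
        or kwargs.get("host")
        or os.environ.get("PINECONE_CONTROLLER_HOST")
        or DEFAULT_CONTROLLER_HOST
    )

print(resolve_host("https://example.test"))  # explicit argument wins
```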
