Skip to content

Commit

Permalink
Config / init refactor for state encapsulation (#220)
Browse files Browse the repository at this point in the history
## Problem

Singleton config limits how people can use this SDK and makes testing
more difficult. Also, we no longer should fetch projectId and construct
data url out of configuration parts, to keep the client agnostic to the
format of the data plane url.

## Solution

- **State encapsulation using classes**
- We want to encapsulate all config-related state into a new class
`Config` and a subclass `PineconeConfig`.
- `Config` is a vessel for api_key, host, and openapi configuration. It
is host-agnostic because it is used by both the control plane
(`Pinecone`) and the data plane (`Index`)
- `PineconeConfig` is a convenience class to set the controller host for
`Pinecone`.
- New singleton store `IndexHostStore` means we won't make unnecessary
and repetitious calls to describe_index if a user creates multiple
`Index` instances referring to the same index. In the old paradigm, this
was accomplished with a piece of global state storing off projectId.
- **Modified class init for Index** 
- `Index` class constructor now requires host url instead of an index
name. If you use the helper method on a `Pinecone` class instance to
instantiate the object, it will manage fetching the host url for you.
E.g. `p = Pinecone(api_key='foo'); index = p.Index('my-index')`.

```python
# Old way
import pinecone
pinecone.init(api_key='foo')
index = pinecone.Index('my-index')

# New way
from pinecone import Pinecone
client = Pinecone(api_key='foo')
index = client.Index('my-index')

# New way (alternate, for data-plane only use case)
from pinecone import Index
index = Index(api_key='foo', host='https://data-url')
```

- **Moving things around**
- `manage.py` got moved and renamed to `pinecone/control/pinecone.py`.
The functions such as `create_index`, `describe_index`, etc that
previously were top-level package exports relying on global state to
access config information have been turned into class methods on a new
`Pinecone` class defined in this file.
- `pinecone/index.py` got moved to `pinecone/data/index.py`. The
`pinecone/data` package is a home for all classes used to access the
data plane.
    - Everything `grpc` related got shoved into `pinecone/data/grpc`.
- GRPC mess in one place only. Utils that were grpc-related got pulled
out of `pinecone/utils.py` and moved to `pinecone/data/grpc/utils.py`.
- Leftover util junk moved into a `deprecated` package. Probably will be
deleted soon.
- **Breaking down big files into small files**
- The file previously at `pinecone/index_grpc.py` to split up into many
smaller files inside `pinecone/data/grpc/` package.
- Remaining util functions in `pinecone/utils.py` got broken up into
many smaller files in `pinecone/utils/`
- **Fix existing tests**
- Needed to adjust a lot of import statements and sometimes setup
functions to get pre-existing tests running.

## Using it

```python
# We're going to use a new Pinecone class instead of interacting directly with the package 
# namespace as in the past
from pinecone import Pinecone

# No more init() method
p = Pinecone(api_key='foo')

# Target an index
index = p.Index('my-index')

# Use data plane methods
index.describe_index_stats()
```

**Mostly working**
- `describe_index` (call succeeds but need to follow up on
source_collection field, would ideally return different object for
different capacity_modes)

**Somewhat working**:
- `create_index` (API call succeeds, but client blows up when response
is not the expected shape)
- `delete_index` (only when timeout=-1, since otherwise it has
dependency on `list_indexes`)

**Broken**:
- `list_indexes` (response not expected shape)

## Type of Change

- [x] New feature (non-breaking change which adds functionality)
- [x] Breaking change (fix or feature that would cause existing
functionality to not work as expected)
- [x] This change requires a documentation update

## Test Plan

Automated tests will be expanded in a future PR. A bunch of code moved
around to different files or got split from big files into small files
and I want to merge this part of the change to unlock parallel work by
Austin various follow-ups.
  • Loading branch information
jhamon authored Oct 20, 2023
1 parent 17e0566 commit 2981ab4
Show file tree
Hide file tree
Showing 43 changed files with 1,162 additions and 1,316 deletions.
18 changes: 3 additions & 15 deletions pinecone/__init__.py
Original file line number Diff line number Diff line change
Expand Up @@ -3,22 +3,10 @@
"""
from .config import *
from .exceptions import *
from .manage import *
from .index import *
from .control.pinecone import Pinecone
from .data.index import *

try:
from .grpc.index_grpc import *
from .data.grpc.index_grpc import *
except ImportError:
pass # ignore for non-[grpc] installations

# Kept for backwards-compatibility
UpsertResult = None
"""@private"""
DeleteResult = None
"""@private"""
QueryResult = None
"""@private"""
FetchResult = None
"""@private"""
InfoResult = None
"""@private"""
260 changes: 0 additions & 260 deletions pinecone/config.py

This file was deleted.

2 changes: 2 additions & 0 deletions pinecone/config/__init__.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,2 @@
from .config import Config
from .pinecone_config import PineconeConfig
69 changes: 69 additions & 0 deletions pinecone/config/config.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,69 @@
from typing import NamedTuple
import os

from pinecone.utils import check_kwargs
from pinecone.exceptions import PineconeConfigurationError
from pinecone.core.client.exceptions import ApiKeyError
from pinecone.config.openapi import OpenApiConfigFactory
from pinecone.core.client.configuration import Configuration as OpenApiConfiguration


class ConfigBase(NamedTuple):
api_key: str = ""
host: str = ""
openapi_config: OpenApiConfiguration = None


class Config:
"""
Configurations are resolved in the following order:
- configs passed as keyword parameters
- configs specified in environment variables
- default values (if applicable)
"""

"""Initializes the Pinecone client.
:param api_key: Required if not set in config file or by environment variable ``PINECONE_API_KEY``.
:param host: Optional. Controller host.
:param openapi_config: Optional. Set OpenAPI client configuration.
"""

def __init__(
self,
api_key: str = None,
host: str = None,
openapi_config: OpenApiConfiguration = None,
**kwargs,
):
api_key = api_key or kwargs.pop("api_key", None) or os.getenv("PINECONE_API_KEY")
host = host or kwargs.pop("host", None)
openapi_config = (
openapi_config
or kwargs.pop("openapi_config", None)
or OpenApiConfigFactory.build(api_key=api_key, host=host)
)

check_kwargs(self.__init__, kwargs)
self._config = ConfigBase(api_key, host, openapi_config)
self.validate()

def validate(self):
if not self._config.api_key:
raise PineconeConfigurationError("You haven't specified an Api-Key.")
if not self._config.host:
raise PineconeConfigurationError("You haven't specified a host.")

@property
def API_KEY(self):
return self._config.api_key

@property
def HOST(self):
return self._config.host

@property
def OPENAPI_CONFIG(self):
return self._config.openapi_config
8 changes: 8 additions & 0 deletions pinecone/config/logging.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,8 @@
import logging

PARENT_LOGGER_NAME = "pinecone"
DEFAULT_PARENT_LOGGER_LEVEL = "ERROR"

_logger = logging.getLogger(__name__)
_parent_logger = logging.getLogger(PARENT_LOGGER_NAME)
_parent_logger.setLevel(DEFAULT_PARENT_LOGGER_LEVEL)
Loading

0 comments on commit 2981ab4

Please sign in to comment.