
Merge release-candidate/2024-10 branch to main #85

Merged 7 commits into main from release-candidate/2024-10 on Oct 23, 2024
5 changes: 1 addition & 4 deletions .github/workflows/ci.yaml
@@ -1,8 +1,6 @@
name: ci
on:
pull_request: {}

jobs:
build:
@@ -36,7 +34,6 @@ jobs:
run: |
go get ./pinecone
- name: Run tests
run: go test -count=1 -v ./pinecone
env:
PINECONE_API_KEY: ${{ secrets.API_KEY }}
132 changes: 128 additions & 4 deletions README.md
@@ -579,6 +579,70 @@ func main() {
}
```

### Import vectors from object storage

You can now [import vectors en masse](https://docs.pinecone.io/guides/data/understanding-imports) from object
storage. `Import` is a long-running, asynchronous operation that imports large numbers of records into a Pinecone
serverless index.

To import vectors from object storage, your records must be stored in Parquet files that follow the required
[file format](https://docs.pinecone.io/guides/data/understanding-imports#parquet-file-format), and your object storage
must follow the required [directory structure](https://docs.pinecone.io/guides/data/understanding-imports#directory-structure).
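For orientation, a hypothetical layout is sketched below (bucket, directory, and file names are illustrative, and the linked docs are authoritative), in which each subdirectory under the import URI corresponds to a namespace:

```
s3://my-bucket/my-directory/        <- URI passed to the import operation
├── example-namespace/              <- one subdirectory per namespace
│   ├── 0.parquet
│   └── 1.parquet
└── example-namespace-2/
    └── 0.parquet
```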

The following example imports vectors from an Amazon S3 bucket into a Pinecone serverless index:

```go
ctx := context.Background()

clientParams := pinecone.NewClientParams{
ApiKey: os.Getenv("PINECONE_API_KEY"),
}

pc, err := pinecone.NewClient(clientParams)

if err != nil {
log.Fatalf("Failed to create Client: %v", err)
}

indexName := "sample-index"

idx, err := pc.CreateServerlessIndex(ctx, &pinecone.CreateServerlessIndexRequest{
Name: indexName,
Dimension: 3,
Metric: pinecone.Cosine,
Cloud: pinecone.Aws,
Region: "us-east-1",
})

if err != nil {
log.Fatalf("Failed to create serverless index: %v", err)
}

idx, err = pc.DescribeIndex(ctx, indexName)

if err != nil {
log.Fatalf("Failed to describe index \"%v\": %v", idx.Name, err)
}

idxConnection, err := pc.Index(pinecone.NewIndexConnParams{Host: idx.Host})
if err != nil {
log.Fatalf("Failed to create IndexConnection for Host: %v: %v", idx.Host, err)
}

storageURI := "s3://my-bucket/my-directory/"

errorMode := "abort" // Will abort if error encountered; other option: "continue"

importRes, err := idxConnection.StartImport(ctx, storageURI, nil, (*pinecone.ImportErrorMode)(&errorMode))

if err != nil {
log.Fatalf("Failed to start import: %v", err)
}

fmt.Printf("import started with ID: %s", importRes.Id)
```

You can [start, cancel, and check the status](https://docs.pinecone.io/guides/data/import-data) of import operations.
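
Continuing the example above, the sketch below checks, lists, and cancels imports. The method and field names (`DescribeImport`, `ListImports`, `CancelImport`, `Status`, `Imports`) reflect our reading of the SDK's import surface; confirm them against the package documentation before relying on this.

```go
// Sketch: assumes ctx, idxConnection, and importRes from the example above.

// Check on a single import by ID.
importDesc, err := idxConnection.DescribeImport(ctx, importRes.Id)
if err != nil {
    log.Fatalf("Failed to describe import %v: %v", importRes.Id, err)
}
fmt.Printf("import status: %v\n", importDesc.Status)

// List imports for the index.
limit := int32(10)
imports, err := idxConnection.ListImports(ctx, &limit, nil)
if err != nil {
    log.Fatalf("Failed to list imports: %v", err)
}
for _, imp := range imports.Imports {
    fmt.Printf("id: %v, status: %v\n", imp.Id, imp.Status)
}

// Cancel an import that is still in progress.
if err := idxConnection.CancelImport(ctx, importRes.Id); err != nil {
    log.Fatalf("Failed to cancel import %v: %v", importRes.Id, err)
}
```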

### Query an index

#### Query by vector values
@@ -1307,13 +1371,17 @@ func main() {

## Inference

The `Client` object has an `Inference` namespace which allows interacting with
Pinecone's [Inference API](https://docs.pinecone.io/reference/api/2024-07/inference/generate-embeddings). The Inference
API is a service that gives you access to embedding models hosted on Pinecone's infrastructure. Read more
at [Understanding Pinecone Inference](https://docs.pinecone.io/guides/inference/understanding-inference).

**Notes:**

Models currently supported:

- Embedding: [multilingual-e5-large](https://docs.pinecone.io/guides/inference/understanding-inference#embedding-models)
- Reranking: [bge-reranker-v2-m3](https://docs.pinecone.io/models/bge-reranker-v2-m3)

### Create Embeddings

@@ -1368,11 +1436,67 @@ Send text to Pinecone's inference API to generate embeddings for documents and queries
}
fmt.Printf("query embedding response: %+v", queryEmbeddingsResponse)

// << Send query to Pinecone to retrieve similar documents >>
```

### Rerank documents

Rerank documents against a query in descending order of relevance.

**Note:** The `score` measures how relevant a given query and passage pair are. It is normalized to the range [0, 1],
with scores closer to 1 indicating higher relevance.

```go
ctx := context.Background()

pc, err := pinecone.NewClient(pinecone.NewClientParams{
    ApiKey: "YOUR-API-KEY",
})

if err != nil {
log.Fatalf("Failed to create Client: %v", err)
}

rerankModel := "bge-reranker-v2-m3"
query := "What are some good Turkey dishes for Thanksgiving?"

documents := []pinecone.Document{
{"title": "Turkey Sandwiches", "body": "Turkey is a classic meat to eat at American Thanksgiving."},
{"title": "Lemon Turkey", "body": "A lemon brined Turkey with apple sausage stuffing is a classic Thanksgiving main course."},
{"title": "Thanksgiving", "body": "My favorite Thanksgiving dish is pumpkin pie"},
{"title": "Protein Sources", "body": "Turkey is a great source of protein."},
}

// Optional arguments
topN := 3
returnDocuments := false
rankFields := []string{"body"}
modelParams := map[string]string{
"truncate": "END",
}

rerankRequest := pinecone.RerankRequest{
Model: rerankModel,
Query: query,
Documents: documents,
TopN: &topN,
ReturnDocuments: &returnDocuments,
RankFields: &rankFields,
Parameters: &modelParams,
}

rerankResponse, err := pc.Inference.Rerank(ctx, &rerankRequest)

if err != nil {
log.Fatalf("Failed to rerank documents: %v", err)
}

fmt.Printf("rerank response: %+v", rerankResponse)
```
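
The response can then be inspected. A sketch follows; the field names `Data`, `Index`, and `Score` are assumptions about the SDK's rerank result types, so verify them against the package documentation.

```go
// Sketch: iterate over the reranked results from rerankResponse above.
// Index refers back to the position in the original documents slice.
for _, item := range rerankResponse.Data {
    fmt.Printf("document index: %d, relevance score: %f\n", item.Index, item.Score)
}
```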

## Support

To get help using go-pinecone you can file an issue on [GitHub](https://github.com/pinecone-io/go-pinecone/issues),
visit the [community forum](https://community.pinecone.io/),
or reach out to [email protected].
2 changes: 1 addition & 1 deletion codegen/apis
Submodule apis updated from 062b11 to 39e90e
78 changes: 59 additions & 19 deletions codegen/build-clients.sh
@@ -1,15 +1,30 @@
#!/bin/bash

set -eux -o pipefail

version=$1 # e.g. 2024-07

# modules
db_control_module="db_control"
db_data_module="db_data"
inference_module="inference"

# generated grpc output destination paths
# db_data_destination must align with the option go_package in the proto file:
# https://github.com/pinecone-io/apis/blob/d1d005e75cc9fe9a5c486ef9218fe87b57765961/src/release/db/data/data.proto#L3
db_data_destination="internal/gen/${db_data_module}"
db_control_destination="internal/gen/${db_control_module}"
inference_destination="internal/gen/${inference_module}"

# version file
version_file="internal/gen/api_version.go"

# generated oas file destination paths
db_data_rest_destination="${db_data_destination}/rest"
db_data_oas_file="${db_data_rest_destination}/${db_data_module}_${version}.oas.go"
db_control_oas_file="${db_control_destination}/${db_control_module}_${version}.oas.go"
inference_oas_file="${inference_destination}/${inference_module}_${version}.oas.go"

update_apis_repo() {
echo "Updating apis repo"
pushd codegen/apis
@@ -27,18 +42,35 @@ verify_spec_version() {
echo "Version is required"
exit 1
fi

verify_directory_exists "codegen/apis/_build/${version}"
}

verify_directory_exists() {
local directory=$1
if [ ! -d "$directory" ]; then
echo "Directory does not exist at $directory"
exit 1
fi
}

generate_oas_client() {
local module=$1
local destination=$2

# source oas file for module and version
oas_file="codegen/apis/_build/${version}/${module}_${version}.oas.yaml"

oapi-codegen --package=${module} \
--generate types,client \
"${oas_file}" > "${control_destination}/control_plane.oas.go"
"${oas_file}" > "${destination}"
}

generate_proto_client() {
local module=$1

# source proto file for module and version
proto_file="codegen/apis/_build/${version}/${module}_${version}.proto"

protoc --experimental_allow_proto3_optional \
--proto_path=codegen/apis/vendor/protos \
@@ -63,19 +95,27 @@ EOL
update_apis_repo
verify_spec_version $version

# Clear internal/gen/* contents
rm -rf internal/gen/*

# Generate db_control oas client
rm -rf "${db_control_destination}"
mkdir -p "${db_control_destination}"
generate_oas_client $db_control_module $db_control_oas_file

# Generate inference oas client
rm -rf "${inference_destination}"
mkdir -p "${inference_destination}"
generate_oas_client $inference_module $inference_oas_file

# Generate db_data oas and proto clients
rm -rf "${db_data_destination}"
mkdir -p "${db_data_destination}"
mkdir -p "${db_data_rest_destination}"

generate_oas_client $db_data_module $db_data_oas_file
generate_proto_client $db_data_module

# Generate version file
rm -rf "${version_file}"

generate_version_file
2 changes: 1 addition & 1 deletion internal/gen/api_version.go
