diff --git a/vector-inference/aws-installation.mdx b/vector-inference/aws-installation.mdx
index 11dd9e2..c67c319 100644
--- a/vector-inference/aws-installation.mdx
+++ b/vector-inference/aws-installation.mdx
@@ -13,6 +13,7 @@ mode: wide
- `aws` >= 2.15 ([aws installation guide](https://docs.aws.amazon.com/cli/latest/userguide/getting-started-install.html))
- `kubectl` >= 1.28 ([kubectl installation guide](https://kubernetes.io/docs/tasks/tools/#kubectl))
- `helm` >= 3.14 ([helm installation guide](https://helm.sh/docs/intro/install/#helm))
+- A Trieve Vector Inference License
You need to have an IAM policy that allows you to use the `eksctl` CLI.
@@ -22,8 +23,6 @@ mode: wide
You can use the root account, but AWS does not recommend doing so.
-You'll also need a license to run TVI.
-
### Getting your license
Contact us:
@@ -52,8 +51,6 @@ Check quota [here](https://us-east-2.console.aws.amazon.com/servicequotas/home/s
### Setting up environment variables
-Create EKS cluster and install needed plugins
-
Your AWS Account ID:
```sh
export AWS_ACCOUNT_ID="$(aws sts get-caller-identity --query "Account" --output text)"
@@ -61,8 +58,6 @@ export AWS_ACCOUNT_ID="$(aws sts get-caller-identity --query "Account" --output
Your AWS Region:
- TVI supports all regions that have the `GPU_INSTANCE` that are chosen
-
```sh
export AWS_REGION=us-east-2
```
@@ -99,8 +94,15 @@ export GPU_COUNT=1
export AWS_PAGER=""
```
+ TVI supports any region that offers the chosen `GPU_INSTANCE` type.
+
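Before creating the cluster, it can help to confirm that all the required variables are exported. This check is a convenience sketch, not part of the official setup; it assumes bash and the variable names used in this guide (`GPU_INSTANCE` exported alongside `GPU_COUNT`):

```sh
# Sanity check: report any required variable that is unset or empty
# (uses bash indirect expansion, so run this under bash)
for v in AWS_ACCOUNT_ID AWS_REGION GPU_INSTANCE GPU_COUNT; do
  if [ -z "${!v}" ]; then
    echo "error: $v is not set" >&2
  fi
done
```

If anything is reported missing, re-export it before running the bootstrap script.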
### Create your cluster
+Create the EKS cluster and install the needed plugins.
+
+The `bootstrap-eks.sh` script creates the EKS cluster, installs the AWS Load Balancer Controller, and installs the NVIDIA Device Plugin. It also manages any IAM permissions needed for the plugins to work.
+
Download the `bootstrap-eks.sh` script:
```sh
wget https://cdn.trieve.ai/bootstrap-eks.sh
@@ -165,8 +167,11 @@ models:
### Install the helm chart
- This helm chart will only work if you subscribe to the AWS Marketplace Listing
+ This helm chart will only work if you subscribe to the AWS Marketplace Listing.
+
+ Contact us at humans@trieve.ai if you do not have access to, or cannot use, the AWS Marketplace.
+
@@ -205,6 +210,27 @@ vector-inference-embedding-spladequery-ingress alb * k8s-default-ve
The `Address` field is the endpoint where you can make [dense embeddings](/vector-inference/embed), [sparse embeddings](/vector-inference/embed_sparse), or [reranker calls](/vector-inference/reranker), depending on the models you chose.
+## Verify the deployment
+
+To ensure everything is working, make a request to the model endpoint provided.
+
+```sh
+# Replace the endpoint with the one you got from the previous step
+export ENDPOINT=k8s-default-vectorin-18b7ade77a-2040086997.us-east-2.elb.amazonaws.com
+
+curl -X POST \
+ -H "Content-Type: application/json" \
+ -d '{"inputs": "test input"}' \
+ --url "http://$ENDPOINT/embed" \
+ -w '\n\nInference took %{time_total} seconds\n'
+```
+
+The output should look something like this:
+
+```sh
+# The embedding vectors
+[[ 0.038483415, -0.00076982786, -0.020039458 ... ], [ 0.04496114, -0.039057795, -0.022400795, ... ]]
+Inference took 0.067066 seconds
+```
+
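To consume the response programmatically, you can parse the returned JSON array of vectors. As a hypothetical example (assuming `jq` is installed and `$ENDPOINT` is set as above), this prints the dimensionality of the first returned embedding:

```sh
# Request an embedding and count the components of the first vector
# (assumes $ENDPOINT is set as in the example above and jq is installed)
curl -s -X POST \
  -H "Content-Type: application/json" \
  -d '{"inputs": "test input"}' \
  "http://$ENDPOINT/embed" | jq '.[0] | length'
```

The printed number should match the output dimension of the dense model you deployed.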
## Using Trieve Vector Inference
Each `ingress` point uses its own Application Load Balancer within AWS. The `Address` provided is the model's endpoint, where you can make [dense embeddings](/vector-inference/embed), [sparse embeddings](/vector-inference/embed_sparse), or [reranker calls](/vector-inference/reranker), depending on the models you chose.