From 2eede8992cb1fecdc151e6250439ddfa57f25e41 Mon Sep 17 00:00:00 2001
From: Future-Outlier <eric901201@gmail.com>
Date: Sat, 6 Apr 2024 07:19:51 +0800
Subject: [PATCH] [Docs] Testing agents in the development environment (#5106)

* Testing agents in the development environment

Signed-off-by: Future-Outlier <eric901201@gmail.com>

* nit

Signed-off-by: Future-Outlier <eric901201@gmail.com>

* nit

Signed-off-by: Future-Outlier <eric901201@gmail.com>

* update

Signed-off-by: Future-Outlier <eric901201@gmail.com>

* rename

Signed-off-by: Future-Outlier <eric901201@gmail.com>

* blank

Signed-off-by: Future-Outlier <eric901201@gmail.com>

* rerun build docs ci

Signed-off-by: Future-Outlier <eric901201@gmail.com>

* update pingsu's advice

Signed-off-by: Future-Outlier <eric901201@gmail.com>
Co-authored-by: Kevin Su <pingsutw@gmail.com>

* Update pingsu's advice

Signed-off-by: Future-Outlier <eric901201@gmail.com>
Co-authored-by: Kevin Su <pingsutw@gmail.com>

* deploying agents in the sandbox

Signed-off-by: Future-Outlier <eric901201@gmail.com>

* rename

Signed-off-by: Future-Outlier <eric901201@gmail.com>

* nit

Signed-off-by: Future-Outlier <eric901201@gmail.com>

* Implementing Agent Metadata Service

Signed-off-by: Future-Outlier <eric901201@gmail.com>

* reorganize and copyedit new content

Signed-off-by: nikki everett <nikki@union.ai>

---------

Signed-off-by: Future-Outlier <eric901201@gmail.com>
Signed-off-by: nikki everett <nikki@union.ai>
Co-authored-by: Kevin Su <pingsutw@gmail.com>
Co-authored-by: nikki everett <nikki@union.ai>
---
 .../deploying_agents_to_the_flyte_sandbox.md  | 98 +++++++++++++++++++
 docs/flyte_agents/developing_agents.md        | 13 ++-
 ...nabling_agents_in_your_flyte_deployment.md |  2 +-
 ...implementing_the_agent_metadata_service.md | 48 +++++++++
 docs/flyte_agents/index.md                    | 30 +++---
 ...g_agents_in_a_local_development_cluster.md | 94 ++++++++++++++++++
 ...g_agents_in_a_local_python_environment.md} |  4 +-
 7 files changed, 267 insertions(+), 22 deletions(-)
 create mode 100644 docs/flyte_agents/deploying_agents_to_the_flyte_sandbox.md
 create mode 100644 docs/flyte_agents/implementing_the_agent_metadata_service.md
 create mode 100644 docs/flyte_agents/testing_agents_in_a_local_development_cluster.md
 rename docs/flyte_agents/{testing_agents_locally.md => testing_agents_in_a_local_python_environment.md} (96%)
diff --git a/docs/flyte_agents/deploying_agents_to_the_flyte_sandbox.md b/docs/flyte_agents/deploying_agents_to_the_flyte_sandbox.md
new file mode 100644
index 0000000000..c4f1a2881e
--- /dev/null
+++ b/docs/flyte_agents/deploying_agents_to_the_flyte_sandbox.md
@@ -0,0 +1,98 @@
+---
+jupytext:
+  formats: md:myst
+  text_representation:
+    extension: .md
+    format_name: myst
+---
+
+(deploying_agents_to_the_flyte_sandbox)=
+# Deploying agents to the Flyte sandbox
+
+After you have finished {ref}`testing an agent locally <testing_agents_locally>`, you can deploy your agent to the Flyte sandbox.
+
+Here's a step-by-step guide to deploying your agent image to the Flyte sandbox.
+
+1. Start the Flyte sandbox:
+```bash
+flytectl demo start
+```
+
+2. Build an agent image:
+For reference, see the [Dockerfile.agent](https://github.com/flyteorg/flytekit/blob/master/Dockerfile.agent) that flytekit uses.
+Using the Databricks agent as an example:
+```Dockerfile
+FROM python:3.9-slim-bookworm
+
+RUN apt-get update && apt-get install build-essential git -y
+RUN pip install prometheus-client grpcio-health-checking
+RUN pip install --no-cache-dir -U flytekit \
+    git+https://github.com/flyteorg/flytekit.git@<gitsha>#subdirectory=plugins/flytekit-spark \
+    && apt-get clean autoclean \
+    && apt-get autoremove --yes \
+    && rm -rf /var/lib/{apt,dpkg,cache,log}/ \
+    && :
+
+CMD pyflyte serve agent --port 8000
+```
+```bash
+docker buildx build -t localhost:30000/flyteagent:example -f Dockerfile.agent . --load
+docker push localhost:30000/flyteagent:example
+```
+
+3. Deploy your agent image to the Kubernetes cluster:
+```bash
+kubectl edit deployment flyteagent -n flyte
+```
+Search for the `image` key and change its value to your agent image:
+```yaml
+image: localhost:30000/flyteagent:example
+```
+
+4. Set up your secrets:
+Using the Databricks agent as an example, edit the agent secret:
+```bash
+kubectl edit secret flyteagent -n flyte
+```
+Base64-encode your Databricks token to get `BASE64_ENCODED_DATABRICKS_TOKEN`:
+```bash
+echo -n "<DATABRICKS_TOKEN>" | base64
+```
+Add the encoded token to the `data` field:
+```yaml
+apiVersion: v1
+data:
+  flyte_databricks_access_token: <BASE64_ENCODED_DATABRICKS_TOKEN>
+kind: Secret
+metadata:
+  annotations:
+    meta.helm.sh/release-name: flyteagent
+    meta.helm.sh/release-namespace: flyte
+  creationTimestamp: "2023-10-04T04:09:03Z"
+  labels:
+    app.kubernetes.io/managed-by: Helm
+  name: flyteagent
+  namespace: flyte
+  resourceVersion: "753"
+  uid: 5ac1e1b6-2a4c-4e26-9001-d4ba72c39e54
+type: Opaque
+```
+:::{note}
+Make sure that:
+1. The secret name consists only of lowercase English letters.
+2. The secret value is encoded in Base64.
+:::
+
+5. Restart the Flyte sandbox deployment:
+```bash
+kubectl rollout restart deployment flyte-sandbox -n flyte
+```
+
+6. Test your agent remotely in the Flyte sandbox:
+```bash
+pyflyte run --remote agent_workflow.py agent_task
+```
+
+:::{note}
+You must build an image that includes the plugin for the task and specify that image with the [`--image` flag](https://docs.flyte.org/en/latest/api/flytekit/pyflyte.html#cmdoption-pyflyte-run-i) when running `pyflyte run`, or in an {ref}`ImageSpec <imagespec>` definition in your workflow file.
+:::
\ No newline at end of file
diff --git a/docs/flyte_agents/developing_agents.md b/docs/flyte_agents/developing_agents.md
index f4fb1378b4..9df49c2f8b 100644
--- a/docs/flyte_agents/developing_agents.md
+++ b/docs/flyte_agents/developing_agents.md
@@ -131,9 +131,9 @@ class FileSensor(BaseSensor):
 ```
 
 
-### 2. Test the agent locally
+### 2. Test the agent
 
-See {doc}`"Testing agents locally" <testing_agents_locally>` to test your agent locally.
+You can test your agent in a {ref}`local Python environment <testing_agents_locally>` or in a {ref}`local development cluster <testing_agents_in_a_local_development_cluster>`.
 
 ### 3. Build a new Docker image
 
@@ -166,7 +166,7 @@ For flytekit versions `>v1.10.2`, use `pyflyte serve agent`.
 kubectl set image deployment/flyteagent flyteagent=ghcr.io/flyteorg/flyteagent:latest
 ```
 
-2. Update the FlytePropeller configmap.
+2. Update the FlytePropeller configmap:
 
 ```YAML
  tasks:
@@ -178,14 +178,17 @@ kubectl set image deployment/flyteagent flyteagent=ghcr.io/flyteorg/flyteagent:l
        - custom_task: agent-service
 ```
 
-3. Restart FlytePropeller.
+3. Restart FlytePropeller:
 
 ```
 kubectl rollout restart deployment flytepropeller -n flyte
 ```
 
+
+
 
 ### Canary deployment
+
 Agents can be deployed independently in separate environments. Decoupling agents from the
 production environment ensures that if any specific agent encounters an error or issue, it will not impact the overall production system.
 
@@ -193,7 +196,7 @@ By running agents independently, you can thoroughly test and validate your agent
 controlled environment before deploying them to the production cluster.
 
 By default, all agent requests will be sent to the default agent service. However,
-you can route particular task requests to designated agent services by adjusting the flytepropeller configuration. 
+you can route particular task requests to designated agent services by adjusting the FlytePropeller configuration. 
 
 ```yaml
  plugins:
diff --git a/docs/flyte_agents/enabling_agents_in_your_flyte_deployment.md b/docs/flyte_agents/enabling_agents_in_your_flyte_deployment.md
index f50b740a21..add7c2598c 100644
--- a/docs/flyte_agents/enabling_agents_in_your_flyte_deployment.md
+++ b/docs/flyte_agents/enabling_agents_in_your_flyte_deployment.md
@@ -6,7 +6,7 @@ jupytext:
     format_name: myst
 ---
 
-(enabling_agents_in_your_flyte_deploymen)=
+(enabling_agents_in_your_flyte_deployment)=
 # Enabling agents in your Flyte deployment
 
 After you have finished {ref}`testing an agent locally <testing_agents_locally>`, you can enable the agent in your Flyte deployment to use it in production. To enable a particular agent in your Flyte deployment, see the [Agent setup guide](https://docs.flyte.org/en/latest/deployment/agents/index.html) for the agent.
diff --git a/docs/flyte_agents/implementing_the_agent_metadata_service.md b/docs/flyte_agents/implementing_the_agent_metadata_service.md
new file mode 100644
index 0000000000..6a1956b774
--- /dev/null
+++ b/docs/flyte_agents/implementing_the_agent_metadata_service.md
@@ -0,0 +1,48 @@
+---
+jupytext:
+  formats: md:myst
+  text_representation:
+    extension: .md
+    format_name: myst
+---
+
+(implementing_the_agent_metadata_service)=
+# Implementing the agent metadata service
+
+## About the agent metadata service
+
+Before FlytePropeller sends a request to the agent server, it needs to know four things:
+
+- The name of the agent
+- Which task category the agent supports
+- The version of the task category
+- Whether the agent executes tasks synchronously or asynchronously
+
+After FlytePropeller obtains this metadata, it can send a request to the agent deployment using the correct gRPC method.
+
+:::{note}
+- An agent can support multiple task categories.
+- FlytePropeller uses the combination of task category and version (`[task category][version]`) to identify the agent deployment for a task and to determine whether the task is synchronous or asynchronous.
+- The task category is `task_type` in flytekit.
+:::
+
+Using the BigQuery Agent as an example:
+- The agent's name is `BigQuery Agent`.
+- The agent supports `bigquery_query_job_task`.
+- The version of the task category is `0`.
+- By default, the agent executes tasks asynchronously.
+
+## Implement the agent metadata service
+
+To implement the agent metadata service, you must do two things:
+
+1. Implement the gRPC methods of the agent metadata service.
+2. Add the agent metadata service to the agent server.
+
+You can refer to [base_agent.py](https://github.com/flyteorg/flytekit/blob/master/flytekit/extend/backend/base_agent.py), [agent_service.py](https://github.com/flyteorg/flytekit/blob/master/flytekit/extend/backend/agent_service.py), and [serve.py](https://github.com/flyteorg/flytekit/blob/master/flytekit/clis/sdk_in_container/serve.py) to see how the agent metadata service is implemented in flytekit's agent server.
+
+The gRPC methods of the agent metadata service are defined in [flyteidl](https://github.com/flyteorg/flyte/blob/master/flyteidl/protos/flyteidl/service/agent.proto), and you can import the generated code from the [`flyteidl/gen` directory](https://github.com/flyteorg/flyte/tree/master/flyteidl/gen).
+
+:::{note}
+Search for the keyword `metadata` in those files to find the implementations.
+:::
diff --git a/docs/flyte_agents/index.md b/docs/flyte_agents/index.md
index d5813650cc..e7d627a670 100644
--- a/docs/flyte_agents/index.md
+++ b/docs/flyte_agents/index.md
@@ -18,32 +18,34 @@ You can create different agent services that host different agents, e.g., a prod
 :class: with-shadow
 :::
 
-(using_agents_in_tasks)=
-## Using agents in tasks
-
-If you need to connect to an external service in your workflow, we recommend using the corresponding agent rather than a web API plugin. Agents are designed to be scalable and can handle large workloads efficiently, and decrease load on FlytePropeller, since they run outside of it. You can also test agents locally without having to change the Flyte backend configuration, streamlining development.
-
-For a list of agents you can use in your tasks and example usage for each, see the [Integrations](https://docs.flyte.org/en/latest/flytesnacks/integrations.html#flyte-agents) documentation.
-
 ## Table of contents
 
 ```{list-table}
 :header-rows: 0
 :widths: 20 30
 
-* - {doc}`Developing agents <developing_agents>`
-  - If the agent you need doesn't exist, follow these steps to create it.
-* - {doc}`Testing agents locally <testing_agents_locally>`
-  - Whether using an existing agent or developing a new one, you can test the agent locally without needing to configure your Flyte deployment.
+* - {doc}`Testing agents locally <testing_agents_in_a_local_python_environment>`
+  - Whether using an {ref}`existing agent <flyte_agents>` or developing a new one, you can quickly test the agent in a local Python environment without needing to configure your Flyte deployment.
 * - {doc}`Enabling agents in your Flyte deployment <enabling_agents_in_your_flyte_deployment>`
-  - Once you have tested an agent locally and want to use it in production, you must configure your Flyte deployment for the agent.
+  - After you have tested an {ref}`existing agent <flyte_agents>` in a local Python environment, you must configure your Flyte deployment for the agent to use it in production.
+* - {doc}`Developing agents <developing_agents>`
+  - If the agent you need doesn't exist, follow these steps to create a new agent.
+* - {doc}`Testing agents in a local development cluster <testing_agents_in_a_local_development_cluster>`
+  - After developing your new agent and testing it in a local Python environment, you can test it in a local development cluster to ensure it works well remotely.
+* - {doc}`Deploying agents to the Flyte sandbox <deploying_agents_to_the_flyte_sandbox>`
+  - Once you have tested your new agent in a local development cluster and want to use it in production, you should test it in the Flyte sandbox.
+* - {doc}`Implementing the agent metadata service <implementing_the_agent_metadata_service>`
+  - If you want to develop an agent server in a language other than Python (e.g., Rust or Java), you must implement the agent metadata service in your agent server.
 ```
 
 ```{toctree}
 :maxdepth: -1
 :hidden:
 
-developing_agents
-testing_agents_locally
+testing_agents_in_a_local_python_environment
 enabling_agents_in_your_flyte_deployment
+developing_agents
+testing_agents_in_a_local_development_cluster
+deploying_agents_to_the_flyte_sandbox
+implementing_the_agent_metadata_service
 ```
diff --git a/docs/flyte_agents/testing_agents_in_a_local_development_cluster.md b/docs/flyte_agents/testing_agents_in_a_local_development_cluster.md
new file mode 100644
index 0000000000..be2c14210d
--- /dev/null
+++ b/docs/flyte_agents/testing_agents_in_a_local_development_cluster.md
@@ -0,0 +1,94 @@
+---
+jupytext:
+  formats: md:myst
+  text_representation:
+    extension: .md
+    format_name: myst
+---
+
+(testing_agents_in_a_local_development_cluster)=
+# Testing agents in a local development cluster
+
+The Flyte agent service runs in a separate deployment instead of inside FlytePropeller. To test an agent server in a local development cluster, you must run both the single binary and agent server at the same time, allowing FlytePropeller to send requests to your local agent server.
+
+## Backend plugin vs agent service architecture
+
+To understand why you must run both the single binary and agent server at the same time, it is helpful to compare the backend plugin architecture to the agent service architecture.
+
+### Backend plugin architecture
+
+In this architecture, FlytePropeller sends requests through the SDK:
+
+![image.png](https://raw.githubusercontent.com/flyteorg/static-resources/main/flyte/concepts/agents/plugin_life_cycle.png)
+
+### Agent service architecture
+
+With the agent service framework:
+1. Flyteplugins send gRPC requests to the agent server.
+2. The agent server sends requests through the SDK and returns the query data.
+
+![image.png](https://raw.githubusercontent.com/flyteorg/static-resources/main/flyte/concepts/agents/async_agent_life_cycle.png)
+
+## Configuring the agent service in development mode
+
+1. Start the demo cluster in dev mode:
+```bash
+flytectl demo start --dev
+```
+
+2. Start the agent gRPC server:
+```bash
+pyflyte serve agent
+```
+
+3. Update the config for the task handled by the agent in the single binary YAML file:
+```bash
+cd flyte
+vim ./flyte-single-binary-local.yaml
+```
+
+```{code-block} yaml
+:emphasize-lines: 9
+tasks:
+  task-plugins:
+    enabled-plugins:
+      - agent-service
+      - container
+      - sidecar
+      - K8S-ARRAY
+    default-for-task-types:
+      - bigquery_query_job_task: agent-service
+      - container: container
+      - container_array: K8S-ARRAY
+```
+```yaml
+plugins:
+  # Registered Task Types
+  agent-service:
+    defaultAgent:
+      endpoint: "localhost:8000" # address of your gRPC agent server
+      insecure: true
+      timeouts:
+        GetTask: 10s
+      defaultTimeout: 10s
+```
+
+4. Start the Flyte server with the single binary config file:
+```bash
+make compile
+./flyte start --config ./flyte-single-binary-local.yaml
+```
+
+5. Set up your secrets:
+In the development environment, you can set up secrets on your local machine by adding them to `/etc/secrets/SECRET_NAME`.
+
+Since your agent server is running locally rather than within Kubernetes, it can retrieve the secret from your local file system.
+
+6. Test your agent task:
+```bash
+pyflyte run --remote agent_workflow.py agent_task
+```
+
+:::{note}
+You must build an image that includes the plugin for the task and specify that image with the [`--image` flag](https://docs.flyte.org/en/latest/api/flytekit/pyflyte.html#cmdoption-pyflyte-run-i) when running `pyflyte run`, or in an {ref}`ImageSpec <imagespec>` definition in your workflow file.
+:::
diff --git a/docs/flyte_agents/testing_agents_locally.md b/docs/flyte_agents/testing_agents_in_a_local_python_environment.md
similarity index 96%
rename from docs/flyte_agents/testing_agents_locally.md
rename to docs/flyte_agents/testing_agents_in_a_local_python_environment.md
index dd4294dbea..ef26642d6b 100644
--- a/docs/flyte_agents/testing_agents_locally.md
+++ b/docs/flyte_agents/testing_agents_in_a_local_python_environment.md
@@ -7,9 +7,9 @@ jupytext:
 ---
 
 (testing_agents_locally)=
-# Testing agents locally
+# Testing agents in a local Python environment
 
-You can test agents locally without running the backend server, making agent development easier.
+You can test agents locally without running the backend server.
 
 To test an agent locally, create a class for the agent task that inherits from `SyncAgentExecutorMixin` or `AsyncAgentExecutorMixin`.
 These mixins can handle synchronous and asynchronous tasks, respectively, and allow flytekit to mimic FlytePropeller's behavior in calling the agent.