Minor fixes to the documentation (#104)
shchur authored Jan 17, 2024
1 parent 8f2dc63 commit 2a8f757
Showing 2 changed files with 78 additions and 25 deletions.
16 changes: 8 additions & 8 deletions docs/index.md
@@ -54,11 +54,11 @@ test_data = pd.read_csv("https://autogluon.s3.amazonaws.com/datasets/Inc/test.cs
test_data.drop(columns=["class"], inplace=True)
predictor_init_args = {
    "label": "class"
-} # init args you would pass to AG TabularPredictor
+} # args used when creating TabularPredictor()
predictor_fit_args = {
    "train_data": train_data,
    "time_limit": 120
-} # fit args you would pass to AG TabularPredictor
+} # args passed to TabularPredictor.fit()
cloud_predictor = TabularCloudPredictor(cloud_output_path="YOUR_S3_BUCKET_PATH")
cloud_predictor.fit(
    predictor_init_args=predictor_init_args, predictor_fit_args=predictor_fit_args
@@ -85,10 +85,10 @@ test_data = pd.read_parquet("https://autogluon-text.s3-accelerate.amazonaws.com/
test_data.drop(columns=["label"], inplace=True)
predictor_init_args = {
    "label": "label"
-} # init args you would pass to AG MultiModalPredictor
+} # args used when creating MultiModalPredictor()
predictor_fit_args = {
    "train_data": train_data
-} # fit args you would pass to AG MultiModalPredictor
+} # args passed to MultiModalPredictor.fit()
cloud_predictor = MultiModalCloudPredictor(cloud_output_path="YOUR_S3_BUCKET_PATH")
cloud_predictor.fit(
    predictor_init_args=predictor_init_args, predictor_fit_args=predictor_fit_args
@@ -117,11 +117,11 @@ target="target"

predictor_init_args = {
    "target": target
-} # init args you would pass to AG TimeSeriesCloudPredictor
+} # args used when creating TimeSeriesPredictor()
predictor_fit_args = {
    "train_data": data,
    "time_limit": 120
-} # fit args you would pass to AG TimeSeriesCloudPredictor
+} # args passed to TimeSeriesPredictor.fit()
cloud_predictor = TimeSeriesCloudPredictor(cloud_output_path="YOUR_S3_BUCKET_PATH")
cloud_predictor.fit(
    predictor_init_args=predictor_init_args,
@@ -158,7 +158,7 @@ result = cloud_predictor.predict(
pip install -U pip
pip install -U setuptools wheel
pip install --pre autogluon.cloud # You don't need to install autogluon itself locally
-pip install --upgrade sagemaker # This is required to ensure the information about newly released containers is available.
+pip install -U sagemaker # This is required to ensure the information about newly released containers is available.
```

```{toctree}
@@ -191,4 +191,4 @@ hidden:
TabularCloudPredictor <api/autogluon.cloud.TabularCloudPredictor>
MultiModalCloudPredictor <api/autogluon.cloud.MultiModalCloudPredictor>
TimeSeriesCloudPredictor <api/autogluon.cloud.TimeSeriesCloudPredictor>
-```
+```
87 changes: 70 additions & 17 deletions docs/tutorials/autogluon-cloud.md
@@ -5,7 +5,10 @@ The containers can be used to train models with CPU and GPU instances and deploy

We offer the [autogluon.cloud](https://github.com/autogluon/autogluon-cloud) module, which uses those containers and [AWS SageMaker](https://aws.amazon.com/sagemaker/) under the hood to train and deploy AutoGluon-backed models with simple APIs.

-**Costs for running cloud compute are managed by AWS SageMaker, and storage costs are managed by AWS S3. AutoGluon-Cloud is a wrapper to these services at no additional charge. While AutoGluon-Cloud makes an effort to simplify the usage of these services, it is ultimately the user's responsibility to monitor compute usage within their account to ensure no unexpected charges.**
+```{attention}
+Costs for running cloud compute are managed by AWS SageMaker, and storage costs are managed by AWS S3. AutoGluon-Cloud is a wrapper around these services at no additional charge. While AutoGluon-Cloud makes an effort to simplify the usage of these services, it is ultimately the user's responsibility to monitor compute usage within their account to avoid unexpected charges.
+```


## Installation
`autogluon.cloud` does not come with the default `autogluon` installation. You can install it via:
@@ -17,34 +20,84 @@ pip3 install autogluon.cloud
Also ensure that the latest version of the SageMaker Python SDK is installed via:

```bash
-pip3 install --upgrade sagemaker
+pip3 install -U sagemaker
```

This is required to ensure the information about newly released containers is available.
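To double-check that the upgrade took effect, you can print the installed versions. A minimal sketch, assuming both packages expose the usual `__version__` attribute:

```python
# Sanity check (optional): confirm both packages are importable and recent
# enough to know about the newest AutoGluon container images.
import autogluon.cloud
import sagemaker

print("autogluon.cloud:", autogluon.cloud.__version__)
print("sagemaker:", sagemaker.__version__)
```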

## Prepare an AWS Role with Necessary Permissions
`autogluon.cloud` utilizes various AWS resources to operate.
-To help you to setup the necessary permissions, you can generate trust relationship and iam policy with our utils through
+To help you set up the necessary permissions, you can generate the trust relationship and IAM policy with our utils through

```python
from autogluon.cloud import TabularCloudPredictor # Can be other CloudPredictor as well

TabularCloudPredictor.generate_default_permission(
backend="BACKNED_YOU_WANT" # We currently support sagemaker and ray_aws
backend="BACKNED_YOU_WANT", # We currently support "sagemaker" and "ray_aws"
account_id="YOUR_ACCOUNT_ID", # The AWS account ID you plan to use for CloudPredictor.
cloud_output_bucket="S3_BUCKET" # S3 bucket name where intermediate artifacts will be uploaded and trained models should be saved. You need to create this bucket beforehand.
)
```

The util function above will generate two JSON files describing the trust relationship and the IAM policy.
-**Make sure you review those files and make necessary changes according to your use case before applying them.**
+```{note}
+Make sure you review the trust relationship and IAM policy files, and make necessary changes according to your use case before applying them.
+```
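
For orientation, a SageMaker trust relationship typically grants the service permission to assume the role. The following is only an illustrative sketch; the actual generated file may contain additional principals or conditions, so always review the real output:

```python
import json

# Hypothetical example of a trust relationship; the file generated by
# generate_default_permission may differ from this sketch.
trust_relationship = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {"Service": "sagemaker.amazonaws.com"},
            "Action": "sts:AssumeRole",
        }
    ],
}
print(json.dumps(trust_relationship, indent=2))
```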

We recommend creating an IAM role for your IAM user to assume, as an IAM role doesn't have permanent long-term credentials and is used to interact directly with AWS services. Here is how this can be done using the AWS CLI (a boto3 equivalent of steps 1-3 is sketched after the list).

```{note}
Make sure to replace `AUTOGLUON-ROLE-NAME` with your desired role name, `AUTOGLUON-POLICY-NAME` with your desired policy name, and `222222222222` with your AWS account number.
```

1. Create the IAM role.
```bash
aws iam create-role --role-name AUTOGLUON-ROLE-NAME --assume-role-policy-document file://ag_cloud_sagemaker_trust_relationship.json
```
This command returns the **role ARN**, which looks similar to `arn:aws:iam::222222222222:role/AUTOGLUON-ROLE-NAME`. Keep it for future reference.

2. Create the IAM policy.
```bash
aws iam create-policy --policy-name AUTOGLUON-POLICY-NAME --policy-document file://ag_cloud_sagemaker_iam_policy.json
```
This command returns the **policy ARN**, which looks similar to `arn:aws:iam::222222222222:policy/AUTOGLUON-POLICY-NAME`. Keep it for future reference.

3. Attach the IAM policy to the role.
```bash
aws iam attach-role-policy --role-name AUTOGLUON-ROLE-NAME --policy-arn "arn:aws:iam::222222222222:policy/AUTOGLUON-POLICY-NAME"
```

4. Assume the IAM role using the AWS CLI or boto3.

<details><summary>AWS CLI</summary>

-We recommend you to create an IAM Role for your IAM User to delegate as IAM Role doesn't have permanent long-term credentials and is used to directly interact with AWS services.
-Refer to this [tutorial](https://aws.amazon.com/premiumsupport/knowledge-center/iam-assume-role-cli/) to
+See section "Assume the IAM role" in this [tutorial](https://repost.aws/knowledge-center/iam-assume-role-cli).

</details>

<details><summary>Python/boto3</summary>

```python
import boto3

session = boto3.Session()
response = session.client("sts").assume_role(
    RoleArn="arn:aws:iam::222222222222:role/AUTOGLUON-ROLE-NAME",
    RoleSessionName="AutoGluonCloudSession",
)
credentials = response["Credentials"]
boto3.setup_default_session(
    aws_access_key_id=credentials["AccessKeyId"],
    aws_secret_access_key=credentials["SecretAccessKey"],
    aws_session_token=credentials["SessionToken"],
)
```
Now when you use `autogluon.cloud` in the same Python script / Jupyter notebook, the correct IAM role will be used.

</details>
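
If you prefer to stay in Python for steps 1-3 as well, the same role setup can be sketched with boto3; the `iam` client calls below mirror the CLI commands and reuse the placeholder names from above:

```python
import boto3

iam = boto3.client("iam")

# Step 1: create the role from the generated trust relationship file.
with open("ag_cloud_sagemaker_trust_relationship.json") as f:
    role_arn = iam.create_role(
        RoleName="AUTOGLUON-ROLE-NAME",
        AssumeRolePolicyDocument=f.read(),
    )["Role"]["Arn"]

# Step 2: create the policy from the generated IAM policy file.
with open("ag_cloud_sagemaker_iam_policy.json") as f:
    policy_arn = iam.create_policy(
        PolicyName="AUTOGLUON-POLICY-NAME",
        PolicyDocument=f.read(),
    )["Policy"]["Arn"]

# Step 3: attach the policy to the role.
iam.attach_role_policy(RoleName="AUTOGLUON-ROLE-NAME", PolicyArn=policy_arn)
print(role_arn, policy_arn)
```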



For more details on setting up IAM roles and policies, refer to this [tutorial](https://repost.aws/knowledge-center/iam-assume-role-cli).

-1. create the IAM Role with the trust relationship and iam policy you generated above
-2. setup the credential
-3. assume the role

## Training
Using `autogluon.cloud` to train AutoGluon-backed models is simple and not much different from training an AutoGluon predictor directly.
@@ -61,8 +114,8 @@ cloud_predictor = TabularCloudPredictor(
).fit(
    predictor_init_args=predictor_init_args,
    predictor_fit_args=predictor_fit_args,
-    instance_type="ml.m5.2xlarge" # Checkout supported instance and pricing here: https://aws.amazon.com/sagemaker/pricing/
-    wait=True # Set this to False to make it an unblocking call and immediately return
+    instance_type="ml.m5.2xlarge", # Check out supported instances and pricing here: https://aws.amazon.com/sagemaker/pricing/
+    wait=True, # Set this to False to make it a non-blocking call that returns immediately
)
```
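
If you would rather not block on training, pass `wait=False` and check on the job later. A minimal sketch, assuming `CloudPredictor.info()` reports general information about the predictor, including its most recent fit job:

```python
# Launch training without blocking; fit() returns immediately.
cloud_predictor.fit(
    predictor_init_args=predictor_init_args,
    predictor_fit_args=predictor_fit_args,
    instance_type="ml.m5.2xlarge",
    wait=False,
)

# Later: inspect the predictor's general information (assumed to include
# the status of the most recent fit job).
print(cloud_predictor.info())
```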

@@ -86,7 +139,7 @@ If you want to deploy a predictor as a SageMaker endpoint, which can be used to
```python
cloud_predictor.deploy(
instance_type="ml.m5.2xlarge", # Checkout supported instance and pricing here: https://aws.amazon.com/sagemaker/pricing/
wait=True # Set this to False to make it an unblocking call and immediately return
wait=True, # Set this to False to make it an unblocking call and immediately return
)
```
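
Once the endpoint is in service, it can answer real-time requests. A hedged sketch, assuming the CloudPredictor methods `predict_real_time` (real-time inference against the deployed endpoint) and `cleanup_deployment` (endpoint teardown); verify both against the API reference of your installed version:

```python
# Real-time inference against the deployed SageMaker endpoint.
result = cloud_predictor.predict_real_time("test.csv")  # a DataFrame works as well
print(result)

# Delete the endpoint when you are done, so it stops accruing hourly charges.
cloud_predictor.cleanup_deployment()
```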
@@ -182,7 +235,7 @@ result = cloud_predictor.predict(
    # If False, returns nothing. You will have to download results separately via cloud_predictor.download_predict_results
    download=True,
    persist=True, # If True and download=True, the results file will also be saved to local disk.
-    save_path=None # Path to save the downloaded results. If None, CloudPredictor will create one with the batch inference job name.
+    save_path=None, # Path to save the downloaded results. If None, CloudPredictor will create one with the batch inference job name.
)
```
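
If the job was launched with `download=False`, the results can be fetched afterwards via `cloud_predictor.download_predict_results`, mentioned in the comments above. The `save_path` argument in this sketch is an assumption; consult the API reference for the exact signature:

```python
# Fetch the results of the batch inference job after the fact.
# NOTE: save_path is an assumed parameter name; check the API reference.
cloud_predictor.download_predict_results(save_path="batch_results")
```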
@@ -200,14 +253,14 @@ To perform batch inference and get prediction probabilities:
```python
result = cloud_predictor.predict_proba(
    'test.csv', # can be a DataFrame as well, and the results will be stored in the S3 bucket
-    include_predict=True # Will return a tuple (prediction, prediction probability). Set this to False to get prediction probability only.
+    include_predict=True, # Will return a tuple (prediction, prediction probability). Set this to False to get prediction probability only.
    instance_type="ml.m5.2xlarge", # Check out supported instances and pricing here: https://aws.amazon.com/sagemaker/pricing/
    wait=True, # Set this to False to make it a non-blocking call that returns immediately
    # If True, returns a Pandas Series object of predictions.
    # If False, returns nothing. You will have to download results separately via cloud_predictor.download_predict_results
    download=True,
    persist=True, # If True and download=True, the results file will also be saved to local disk.
-    save_path=None # Path to save the downloaded results. If None, CloudPredictor will create one with the batch inference job name.
+    save_path=None, # Path to save the downloaded results. If None, CloudPredictor will create one with the batch inference job name.
)
```
