diff --git a/docs/operation.rst b/docs/operation.rst
index d2f9961..0d027ac 100644
--- a/docs/operation.rst
+++ b/docs/operation.rst
@@ -8,3 +8,5 @@ Operation
    ./operation/mobile.rst
    ./operation/stationary.rst
    ./operation/data_collection.rst
+   ./operation/training.rst
+   ./operation/hugging_face.rst
diff --git a/docs/operation/hugging_face.rst b/docs/operation/hugging_face.rst
new file mode 100644
index 0000000..1f444ec
--- /dev/null
+++ b/docs/operation/hugging_face.rst
@@ -0,0 +1,127 @@
+==================
+Hugging Face Guide
+==================
+
+Uploading and Downloading Datasets on Hugging Face
+==================================================
+
+Creating an Account
+-------------------
+
+If you don't already have an account, sign up for a new account on the `Hugging Face sign up page <https://huggingface.co/join>`_.
+
+Creating a New Dataset Repository
+---------------------------------
+
+Web Interface
+^^^^^^^^^^^^^
+
+#. Navigate to the `Hugging Face website <https://huggingface.co/>`_.
+#. Log in to your account.
+#. Click on your profile picture in the top-right corner and select "New dataset."
+#. Follow the on-screen instructions to create a new dataset repository.
+
+Command Line Interface (CLI)
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+#. Ensure you have the `huggingface_hub <https://huggingface.co/docs/huggingface_hub>`_ library installed.
+#. Use the following Python script to create a new repository:
+
+   .. code-block:: python
+
+      from huggingface_hub import HfApi
+      api = HfApi()
+
+      api.create_repo(repo_id="username/repository_name", repo_type="dataset")
+
+For more information on creating repositories, refer to the `Hugging Face repositories documentation <https://huggingface.co/docs/hub/repositories>`_.
+
+Uploading Your Dataset
+----------------------
+
+You have two primary methods to upload datasets: through the web interface or using the Python API.
+
+Web Interface
+^^^^^^^^^^^^^
+
+#. Navigate to your dataset repository on the Hugging Face website.
+#. Click on the "Files and versions" tab.
+#. Drag and drop your dataset files into the files section.
+#. Click "Commit changes" to save the files in the repository.
+
+Python API
+^^^^^^^^^^
+
+You can use the following Python script to upload your dataset:
+
+.. code-block:: python
+
+   from huggingface_hub import HfApi
+   api = HfApi()
+
+   api.upload_folder(
+       folder_path="path/to/dataset",
+       repo_id="username/repository_name",
+       repo_type="dataset",
+   )
+
+**Example**:
+
+.. code-block:: python
+
+   from huggingface_hub import HfApi
+   api = HfApi()
+
+   api.upload_folder(
+       folder_path="~/aloha_data/aloha_stationary_block_pickup",
+       repo_id="TrossenRoboticsCommunity/aloha_static_datasets",
+       repo_type="dataset",
+   )
+
+For more information on uploading datasets, refer to the `Hugging Face upload guide <https://huggingface.co/docs/huggingface_hub/guides/upload>`_.
+
+Downloading Datasets
+--------------------
+
+You can download datasets either by cloning the repository or using the ``huggingface_hub`` Python API.
+
+Cloning the Repository
+^^^^^^^^^^^^^^^^^^^^^^
+
+To clone the repository, use the following command:
+
+.. code-block:: bash
+
+   $ git clone https://huggingface.co/datasets/username/repository_name
+
+Using the Python API
+^^^^^^^^^^^^^^^^^^^^
+
+You can also download datasets programmatically with the following Python script:
+
+.. code-block:: python
+
+   from huggingface_hub import snapshot_download
+
+   # Download the dataset
+   snapshot_download(
+       repo_id="username/repository_name",
+       repo_type="dataset",
+       local_dir="path/to/local/directory",
+       allow_patterns="*.hdf5"
+   )
+
+.. note::
+
+   - The dataset episodes are stored in ``.hdf5`` format, so use ``allow_patterns`` to restrict the download to these files.
+
+For more information on downloading datasets, refer to the `Hugging Face download guide <https://huggingface.co/docs/huggingface_hub/guides/download>`_.
+
+Additional Information
+----------------------
+
+- **Repository Management**: See the `Hugging Face Hub documentation <https://huggingface.co/docs/hub>`_ for detailed instructions on managing repositories, handling versions, and setting permissions.
+- **Dataset Formats**: Hugging Face supports various dataset formats.
+  For this guide, we specifically use ALOHA's native ``.hdf5`` format.
+- **Community Support**: If you encounter any issues, refer to the `Hugging Face community forums <https://discuss.huggingface.co/>`_ for additional support.
+
+By following this guide, you should be able to seamlessly upload and download datasets using the Hugging Face platform. For more detailed guides and examples, refer to the `Hugging Face documentation <https://huggingface.co/docs>`_.
diff --git a/docs/operation/training.rst b/docs/operation/training.rst
new file mode 100644
index 0000000..7e00245
--- /dev/null
+++ b/docs/operation/training.rst
@@ -0,0 +1,237 @@
+=======================
+Training and Evaluation
+=======================
+
+Virtual Environment Setup
+=========================
+
+An isolated Python environment is important when running machine learning models, as different projects can have conflicting dependencies.
+You can use either a virtual environment or Conda.
+
+Virtual Environment Installation and Setup
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+#. Install the virtual environment package:
+
+   .. code-block:: bash
+
+      $ sudo apt-get install python3-venv
+
+#. Create a virtual environment:
+
+   .. code-block:: bash
+
+      $ python3 -m venv ~/act  # Creates a venv "act" in the home directory; it can be created anywhere
+
+#. Activate the virtual environment:
+
+   .. code-block:: bash
+
+      $ source ~/act/bin/activate
+
+Conda Setup
+^^^^^^^^^^^
+
+#. Create a virtual environment:
+
+   .. code-block:: bash
+
+      $ conda create -n aloha python=3.8.10
+
+#. Activate the virtual environment:
+
+   .. code-block:: bash
+
+      $ conda activate aloha
+
+Install Dependencies
+^^^^^^^^^^^^^^^^^^^^
+
+Install the necessary dependencies inside your activated environment:
+
+.. code-block:: bash
+
+   $ pip install dm_control==1.0.14
+   $ pip install einops
+   $ pip install h5py
+   $ pip install ipython
+   $ pip install matplotlib
+   $ pip install mujoco==2.3.7
+   $ pip install opencv-python
+   $ pip install packaging
+   $ pip install pexpect
+   $ pip install pyquaternion
+   $ pip install pyyaml
+   $ pip install rospkg
+   $ pip install torch
+   $ pip install torchvision
+
+Clone Repository
+================
+
+Clone ACT if using the ALOHA Stationary:
+
+.. code-block:: bash
+
+   $ cd ~
+   $ git clone https://github.com/Interbotix/act.git act_training_evaluation
+
+Clone ACT++ if using the ALOHA Mobile:
+
+.. code-block:: bash
+
+   $ cd ~
+   $ git clone https://github.com/Interbotix/act_plus_plus.git act_training_evaluation
+
+Build and Install ACT Models
+============================
+
+.. code-block:: bash
+   :emphasize-lines: 4
+
+   ├── act
+   │   ├── assets
+   │   ├── constants.py
+   │   ├── detr
+   │   ├── ee_sim_env.py
+   │   ├── imitate_episodes.py
+   │   ├── __init__.py
+   │   ├── policy.py
+   │   ├── record_sim_episodes.py
+   │   ├── scripted_policy.py
+   │   ├── sim_env.py
+   │   ├── utils.py
+   │   └── visualize_episodes.py
+   ├── COLCON_IGNORE
+   ├── conda_env.yaml
+   ├── LICENSE
+   └── README.md
+
+Navigate to the ``detr`` directory inside the repository and install the ``detr`` module, which contains the model definitions, using the command below:
+
+.. code-block:: bash
+
+   $ cd /path/to/act/detr && pip install -e .
+
+Training
+========
+
+To start the training, follow the steps below:
+
+#. Sanity Check:
+
+   Ensure that all the ``.hdf5`` episodes are located in the correct folder after following the data collection steps in :ref:`operation/data_collection:Task Creation`.
+
+#. Source the ROS Environment:
+
+   .. code-block:: bash
+
+      $ source /opt/ros/humble/setup.bash
+      $ source interbotix_ws/install/setup.bash
+
+#. Activate the Virtual Environment:
+
+   .. code-block:: bash
+
+      $ source ~/act/bin/activate
+
+#. Start Training:
+
+   .. code-block:: bash
+
+      $ cd /path/to/act/repository/
+      $ python3 imitate_episodes.py \
+          --task_name aloha_stationary_dummy \
+          --ckpt_dir <ckpt_dir> \
+          --policy_class ACT \
+          --kl_weight 10 \
+          --chunk_size 100 \
+          --hidden_dim 512 \
+          --batch_size 8 \
+          --dim_feedforward 3200 \
+          --num_epochs 2000 \
+          --lr 1e-5 \
+          --seed 0
+
+.. note::
+
+   - The ``task_name`` argument should match one of the task names in the ``TASK_CONFIGS``, as configured in the :ref:`operation/data_collection:Task Creation` section.
+   - ``ckpt_dir``: The relative location where the checkpoints and best policy will be stored.
+   - ``policy_class``: Determines the choice of policy, ``ACT`` or ``CNNMLP``.
+   - ``kl_weight``: Weights the KL-divergence regularization term in the training objective.
+   - ``chunk_size``: Determines the length of the action sequence; K=1 means no action chunking, and K=episode length means full open-loop control.
+   - ``batch_size``: A low batch size tends to generalize better; a high batch size trains faster per epoch but can converge more slowly.
+   - ``num_epochs``: Too many epochs lead to overfitting; too few epochs may not allow the model to learn.
+   - ``lr``: A higher learning rate can converge faster but may overshoot the optimum; a lower learning rate optimizes more slowly but more stably.
+
+.. tip::
+
+   We recommend the following parameters:
+
+   .. list-table::
+      :align: center
+      :widths: 25 75
+      :header-rows: 1
+
+      * - Parameter
+        - Value
+      * - Policy Class
+        - ACT
+      * - KL Weight
+        - 10
+      * - Chunk Size
+        - 100
+      * - Batch Size
+        - 2
+      * - Num of Epochs
+        - 3000
+      * - Learning Rate
+        - 1e-5
+
+Evaluation
+==========
+
+To evaluate a trained model, follow the steps below:
+
+#. Bring up the ALOHA:
+
+   - Stationary: :ref:`operation/stationary:Running ALOHA Bringup`
+   - Mobile: :ref:`operation/mobile:Running ALOHA Bringup`
+
+#. Configure the environment:
+
+   .. code-block:: bash
+
+      $ source /opt/ros/humble/setup.bash        # Configure ROS system install environment
+      $ source interbotix_ws/install/setup.bash  # Configure ROS workspace environment
+      $ source /<path_to_venv>/bin/activate      # Configure ALOHA Python environment
+      $ cd ~/<act_repository>/act/
+
+#. Run the evaluation script:
+
+   .. code-block:: bash
+      :emphasize-lines: 13-14
+
+      python3 imitate_episodes.py \
+          --task_name aloha_stationary_dummy \
+          --ckpt_dir <ckpt_dir> \
+          --policy_class ACT \
+          --kl_weight 10 \
+          --chunk_size 100 \
+          --hidden_dim 512 \
+          --batch_size 8 \
+          --dim_feedforward 3200 \
+          --num_epochs 2000 \
+          --lr 1e-5 \
+          --seed 0 \
+          --eval \
+          --temporal_agg
+
+.. note::
+
+   - The ``task_name`` argument should match one of the task names in the ``TASK_CONFIGS``, as configured in the :ref:`operation/data_collection:Task Creation` section.
+   - The ``ckpt_dir`` argument should match the correct relative directory location of the trained policy.
+   - The ``eval`` flag will set the script into evaluation mode.
+   - The ``temporal_agg`` flag is not required, but helps smooth the trajectory of the robots.
\ No newline at end of file
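
The sanity-check step in the Training section (confirming the ``.hdf5`` episodes are in the expected folder) can be automated with a short, standard-library-only helper. This is a minimal sketch: the ``episode_*.hdf5`` naming pattern and the function names here are illustrative assumptions, not part of the ACT repository.

```python
from pathlib import Path

def find_episodes(dataset_dir, pattern="episode_*.hdf5"):
    # Collect episode files under dataset_dir, sorted by name.
    # The naming pattern is an assumed convention; adjust it to
    # match your data collection setup.
    return sorted(Path(dataset_dir).expanduser().glob(pattern))

def sanity_check(dataset_dir, expected_episodes):
    # Compare the number of episode files found against the expected count.
    episodes = find_episodes(dataset_dir)
    print(f"Found {len(episodes)} episode file(s) in {dataset_dir}")
    return len(episodes) == expected_episodes
```

For example, ``sanity_check("~/aloha_data/aloha_stationary_dummy", 50)`` returns ``True`` only when exactly 50 matching episode files are present, making it easy to catch a misplaced or incomplete dataset before launching ``imitate_episodes.py``.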