XAeroNet (#692)
* adding xaeronet-s model

* add validation plots

* xaeronet-v model

* formatting

* update changelog

* remove json file

* address review comments

* multi-scale support, minor fixes
mnabian authored Nov 5, 2024
1 parent 297297e commit 3f7a8a4
Showing 23 changed files with 3,128 additions and 8 deletions.
1 change: 1 addition & 0 deletions CHANGELOG.md
@@ -18,6 +18,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
- Bistride Multiscale MeshGraphNet example.
- FIGConvUNet model and example.
- The Transolver model.
- The XAeroNet model.
- Incorporated CorrDiff-GEFS-HRRR model into CorrDiff, with lead-time aware SongUNet and
cross entropy loss.

Binary file added docs/img/xaeronet_s_results.png
Binary file added docs/img/xaeronet_v_results.png
164 changes: 164 additions & 0 deletions examples/cfd/xaeronet/README.md
@@ -0,0 +1,164 @@
# XAeroNet: Scalable Neural Models for External Aerodynamics

XAeroNet is a collection of scalable models for large-scale external
aerodynamic evaluations. It consists of two models, XAeroNet-S and XAeroNet-V, for
surface and volume predictions, respectively.

## Problem overview

External aerodynamics plays a crucial role in the design and optimization of vehicles,
aircraft, and other transportation systems. Accurate predictions of aerodynamic
properties such as drag, pressure distribution, and airflow characteristics are
essential for improving fuel efficiency, vehicle stability, and performance.
Traditional approaches, such as computational fluid dynamics (CFD) simulations,
are computationally expensive and time-consuming, especially when evaluating multiple
design iterations or large datasets.

XAeroNet addresses these challenges by leveraging neural network-based surrogate
models to provide fast, scalable, and accurate predictions for both surface-level
and volume-level aerodynamic properties. By using the DrivAerML dataset, which
contains high-fidelity CFD data for a variety of vehicle geometries, XAeroNet aims
to significantly reduce the computational cost while maintaining high prediction
accuracy. The two models in XAeroNet—XAeroNet-S for surface predictions and XAeroNet-V
for volume predictions—enable rapid aerodynamic evaluations across different design
configurations, making it easier to incorporate aerodynamic considerations early in
the design process.

## Model Overview and Architecture

### XAeroNet-S

XAeroNet-S is a scalable MeshGraphNet model that partitions large input graphs into
smaller subgraphs to reduce training memory overhead. Halo regions are added to these
subgraphs to prevent message-passing truncations at the boundaries. Gradient aggregation
is employed to accumulate gradients from each partition before updating the model parameters.
This approach ensures that training on partitions is equivalent to training on the entire
graph in terms of model updates and accuracy. Additionally, XAeroNet-S does not rely on
simulation meshes for training and inference, overcoming a significant limitation of
GNN models in simulation tasks.
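
A minimal sketch of the gradient-aggregation idea, assuming a generic PyTorch
model and a `partitions` iterable of `(inputs, targets)` pairs; the names here
are illustrative, not the repository's actual API:

```python
import torch


def train_step(model, optimizer, loss_fn, partitions):
    """Accumulate gradients over all partitions, then apply one update."""
    optimizer.zero_grad()
    num_parts = len(partitions)
    for inputs, targets in partitions:
        pred = model(inputs)
        # Scale each partition's loss so that the summed gradients match a
        # full-graph backward pass (up to halo-node bookkeeping).
        loss = loss_fn(pred, targets) / num_parts
        loss.backward()  # gradients accumulate across partitions
    optimizer.step()
```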

The inputs to the training pipeline are STL files, from which the pipeline samples a point
cloud on the surface. It then constructs a connectivity graph by linking each point to its
N nearest neighbors.
This method also supports multi-mesh setups, where point clouds with different resolutions
are generated, their connectivity graphs are created, and all are superimposed. The Metis
library is used to partition the graph for efficient training.
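
A hedged sketch of this construction, using `trimesh` (pinned in
`requirements.txt`), SciPy, and `pymetis`; the sampling density, neighbor
count, and halo handling here are assumptions, not the exact preprocessing
code:

```python
import numpy as np
import pymetis
import trimesh
from scipy.spatial import cKDTree


def build_knn_graph(stl_path, num_points=100_000, k=6):
    """Sample a surface point cloud and connect each point to its k nearest neighbors."""
    mesh = trimesh.load_mesh(stl_path)
    points, _ = trimesh.sample.sample_surface(mesh, num_points)
    # Query k+1 neighbors because the nearest neighbor of a point is itself.
    _, neighbors = cKDTree(points).query(points, k=k + 1)
    adjacency = [nbrs[1:].tolist() for nbrs in neighbors]
    return points, adjacency


def partition_graph(adjacency, num_parts=8):
    """Assign each node to a METIS partition; halo regions would be added afterwards."""
    _, membership = pymetis.part_graph(num_parts, adjacency=adjacency)
    return np.asarray(membership)
```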

For the XAeroNet-S model, STL files are used to generate point clouds and establish graph
connectivity. Additionally, the .vtp files are used to interpolate the solution fields onto
the point clouds.

### XAeroNet-V

XAeroNet-V is a scalable 3D UNet model with attention gates, designed to partition large
voxel grids into smaller sub-grids to reduce memory overhead during training. Halo regions
are added to these partitions to avoid convolution truncations at the boundaries.
Gradient aggregation is used to accumulate gradients from each partition before updating
the model parameters, ensuring that training on partitions is equivalent to training on
the entire voxel grid in terms of model updates and accuracy. Additionally, XAeroNet-V
incorporates a continuity constraint as an additional loss term during training to
enhance model interpretability.
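
As a rough illustration, the continuity constraint can be expressed as a
divergence penalty on the predicted velocity field. The sketch below assumes
the network outputs a `(B, 3, D, H, W)` velocity tensor on a uniform voxel
grid with spacing `h`; it shows the idea, not the exact loss used here:

```python
import torch


def continuity_loss(vel: torch.Tensor, h: float = 1.0) -> torch.Tensor:
    """Mean squared divergence of a velocity field via central differences."""
    dudx = (vel[:, 0, 2:, 1:-1, 1:-1] - vel[:, 0, :-2, 1:-1, 1:-1]) / (2 * h)
    dvdy = (vel[:, 1, 1:-1, 2:, 1:-1] - vel[:, 1, 1:-1, :-2, 1:-1]) / (2 * h)
    dwdz = (vel[:, 2, 1:-1, 1:-1, 2:] - vel[:, 2, 1:-1, 1:-1, :-2]) / (2 * h)
    divergence = dudx + dvdy + dwdz
    # For incompressible flow, div(u) should vanish, so penalize its square.
    return divergence.pow(2).mean()
```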

For the XAeroNet-V model, the .vtu files are used to interpolate the volumetric
solution fields onto a voxel grid, while the .stl files are utilized to compute
the signed distance field (SDF) and its derivatives on the voxel grid.
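
A hedged sketch of this SDF step using `trimesh.proximity.signed_distance`
(note that trimesh reports positive distances inside the mesh) and NumPy
gradients; the grid bounds, resolution, and sign convention are assumptions:

```python
import numpy as np
import trimesh


def sdf_on_grid(stl_path, bounds_min, bounds_max, resolution=(64, 64, 64)):
    """Evaluate the SDF and its spatial derivatives on a uniform voxel grid."""
    mesh = trimesh.load_mesh(stl_path)
    axes = [
        np.linspace(lo, hi, n)
        for lo, hi, n in zip(bounds_min, bounds_max, resolution)
    ]
    grid = np.stack(np.meshgrid(*axes, indexing="ij"), axis=-1)
    sdf = trimesh.proximity.signed_distance(mesh, grid.reshape(-1, 3))
    sdf = sdf.reshape(resolution)
    # Central-difference derivatives along x, y, z with the grid spacings.
    spacings = [ax[1] - ax[0] for ax in axes]
    grads = np.gradient(sdf, *spacings)
    return sdf, grads
```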

## Dataset

We trained our models using the DrivAerML dataset from the [CAE ML Dataset collection](https://caemldatasets.org/drivaerml/).
This high-fidelity, open-source (CC-BY-SA) public dataset is specifically designed
for automotive aerodynamics research. It comprises 500 parametrically morphed variants
of the widely utilized DrivAer notchback generic vehicle. Mesh generation and scale-resolving
computational fluid dynamics (CFD) simulations were executed using consistent and validated
automatic workflows that represent the industrial state-of-the-art. Geometries and comprehensive
aerodynamic data are published in open-source formats. For more technical details about this
dataset, please refer to their [paper](https://arxiv.org/pdf/2408.11969).

## Training the XAeroNet-S model

To train the XAeroNet-S model, follow these steps:

1. Download the DrivAerML dataset using the provided `download_aws_dataset.sh` script.

2. Navigate to the `surface` folder.

3. Specify the configurations in `conf/config.yaml`. Make sure the path to the dataset
   is specified correctly.

4. Run `combine_stl_solids.py`. The STL files in the DrivAerML dataset consist of multiple
   solids, which must be combined into a single solid to properly generate a surface point
   cloud using the Modulus Tessellated geometry module.

5. Run `preprocessing.py`. This will prepare and save the partitioned graphs.

6. Create a `partitions_validation` folder, and move the samples you wish to use for
validation to that folder.

7. Run `compute_stats.py` to compute the global mean and standard deviation from the
   training samples (a minimal sketch of this pass appears after this list).

8. Run `train.py` to start the training.

9. Download the validation results (saved in the form of point clouds in `.vtp` format),
   and visualize them in ParaView.

![XAeroNet-S validation results for sample #500.](../../../docs/img/xaeronet_s_results.png)
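
For reference, a minimal sketch of the kind of single-pass statistics
computation step 7 performs, assuming per-sample node-feature arrays of shape
`(N_i, C)`; the actual `compute_stats.py` may use a different file format and
field layout:

```python
import numpy as np


def global_mean_std(arrays):
    """Global per-channel mean/std from an iterable of (N_i, C) arrays."""
    count, total, total_sq = 0, 0.0, 0.0
    for x in arrays:
        count += x.shape[0]
        total += x.sum(axis=0)
        total_sq += (x**2).sum(axis=0)
    mean = total / count
    std = np.sqrt(total_sq / count - mean**2)
    return mean, std
```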

## Training the XAeroNet-V model

To train the XAeroNet-V model, follow these steps:

1. Download the DrivAerML dataset using the provided `download_aws_dataset.sh` script.

2. Navigate to the `volume` folder.

3. Specify the configurations in `conf/config.yaml`. Make sure the path to the dataset
   is specified correctly.

4. Run `preprocessing.py`. This will prepare and save the voxel grids.

5. Create a `drivaer_aws_h5_validation` folder, and move the samples you wish to
use for validation to that folder.

6. Run `compute_stats.py` to compute the global mean and standard deviation from
the training samples.

7. Run `train.py` to start the training. Partitioning is performed prior to training
   (see the sketch after this list).

8. Download the validation results (saved in the form of voxel grids in `.vti` format),
   and visualize them in ParaView.

![XAeroNet-V Validation results.](../../../docs/img/xaeronet_v_results.png)
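
For reference, a hedged sketch of the halo-padded partitioning that step 7
refers to, for a `(C, D, H, W)` voxel grid; the partition counts and halo
width are illustrative:

```python
import numpy as np


def partition_with_halo(grid, parts_per_axis=2, halo=2):
    """Split a (C, D, H, W) voxel grid into sub-grids padded with halo voxels."""
    _, d, h, w = grid.shape
    padded = np.pad(grid, ((0, 0), (halo, halo), (halo, halo), (halo, halo)))
    sd, sh, sw = d // parts_per_axis, h // parts_per_axis, w // parts_per_axis
    subgrids = []
    for i in range(parts_per_axis):
        for j in range(parts_per_axis):
            for k in range(parts_per_axis):
                # Each block keeps `halo` extra voxels on every side so that
                # convolutions at the block boundary see valid context.
                subgrids.append(
                    padded[
                        :,
                        i * sd : (i + 1) * sd + 2 * halo,
                        j * sh : (j + 1) * sh + 2 * halo,
                        k * sw : (k + 1) * sw + 2 * halo,
                    ]
                )
    return subgrids
```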

## Logging

We mainly use TensorBoard for logging training and validation losses, as well as
the learning rate during training. You can also optionally use Weights & Biases to
log training metrics. To visualize TensorBoard running in a
Docker container on a remote server from your local desktop, follow these steps:

1. **Expose the Port in Docker:**
Expose port 6006 in the Docker container by including
`-p 6006:6006` in your docker run command.

2. **Launch TensorBoard:**
Start TensorBoard within the Docker container:

```bash
tensorboard --logdir=/path/to/logdir --port=6006
```

3. **Set Up SSH Tunneling:**
Create an SSH tunnel to forward port 6006 from the remote server to your local machine:

```bash
ssh -L 6006:localhost:6006 <user>@<remote-server-ip>
```

Replace `<user>` with your SSH username and `<remote-server-ip>` with the IP address
of your remote server. You can use a different port if necessary.

4. **Access TensorBoard:**
Open your web browser and navigate to `http://localhost:6006` to view TensorBoard.

**Note:** Ensure the remote server’s firewall allows connections on port `6006`
and that your local machine’s firewall allows outgoing connections.
48 changes: 48 additions & 0 deletions examples/cfd/xaeronet/cleanup_corrupted_downloads.sh
@@ -0,0 +1,48 @@
#!/bin/bash

# This is a Bash script designed to identify and remove corrupted files after downloading the AWS DrivAer dataset.
# The script defines two functions: check_and_remove_corrupted_extension and check_all_runs.
# The check_and_remove_corrupted_extension function checks for files in a given directory that have extra characters after their extension.
# If such a file is found, it is considered corrupted, and the function removes it.
# The check_all_runs function iterates over all directories in a specified local directory (LOCAL_DIR), checking for corrupted files with the extensions ".vtu", ".stl", and ".vtp".
# The script begins the cleanup process by calling the check_all_runs function. The target directory for this operation is set as "./drivaer_data_full".

# Set the local directory to check the files
LOCAL_DIR="./drivaer_data_full" # <--- This is the directory where the files are downloaded.

# Function to check if a file has extra characters after the extension and remove it
check_and_remove_corrupted_extension() {
local dir=$1
local base_filename=$2
local extension=$3

# Find any files with extra characters after the extension
for file in "$dir/$base_filename"$extension*; do
if [[ -f "$file" && "$file" != "$dir/$base_filename$extension" ]]; then
echo "Corrupted file detected: $file (extra characters after extension), removing it."
rm "$file"
fi
done
}

# Function to go over all the run directories and check files
check_all_runs() {
for RUN_DIR in "$LOCAL_DIR"/run_*; do
echo "Checking folder: $RUN_DIR"

# Check for corrupted .vtu files
base_vtu="volume_${RUN_DIR##*_}"
check_and_remove_corrupted_extension "$RUN_DIR" "$base_vtu" ".vtu"

# Check for corrupted .stl files
base_stl="drivaer_${RUN_DIR##*_}"
check_and_remove_corrupted_extension "$RUN_DIR" "$base_stl" ".stl"

        # Check for corrupted .vtp files
        base_vtp="boundary_${RUN_DIR##*_}"
        check_and_remove_corrupted_extension "$RUN_DIR" "$base_vtp" ".vtp"
done
}

# Start checking
check_all_runs
64 changes: 64 additions & 0 deletions examples/cfd/xaeronet/download_aws_dataset.sh
@@ -0,0 +1,64 @@
#!/bin/bash

# This Bash script downloads the AWS DrivAer files from the Amazon S3 bucket to a local directory.
# Only the volume files (.vtu), STL files (.stl), and VTP files (.vtp) are downloaded.
# It uses a function, download_run_files, to check for the existence of three specific files (".vtu", ".stl", ".vtp") in a run directory.
# If a file doesn't exist, it's downloaded from the S3 bucket. If it does exist, the download is skipped.
# The script runs multiple downloads in parallel, both within a single run and across multiple runs.
# It also includes checks to prevent overloading the system by limiting the number of parallel downloads.

# Set the local directory to download the files
LOCAL_DIR="./drivaer_data_full" # <--- This is the directory where the files will be downloaded.

# Set the S3 bucket and prefix
S3_BUCKET="caemldatasets"
S3_PREFIX="drivaer/dataset"

# Create the local directory if it doesn't exist
mkdir -p "$LOCAL_DIR"

# Function to download files for a specific run
download_run_files() {
local i=$1
RUN_DIR="run_$i"
RUN_LOCAL_DIR="$LOCAL_DIR/$RUN_DIR"

# Create the run directory if it doesn't exist
mkdir -p "$RUN_LOCAL_DIR"

# Check if the .vtu file exists before downloading
if [ ! -f "$RUN_LOCAL_DIR/volume_$i.vtu" ]; then
aws s3 cp --no-sign-request "s3://$S3_BUCKET/$S3_PREFIX/$RUN_DIR/volume_$i.vtu" "$RUN_LOCAL_DIR/" &
else
echo "File volume_$i.vtu already exists, skipping download."
fi

# Check if the .stl file exists before downloading
if [ ! -f "$RUN_LOCAL_DIR/drivaer_$i.stl" ]; then
aws s3 cp --no-sign-request "s3://$S3_BUCKET/$S3_PREFIX/$RUN_DIR/drivaer_$i.stl" "$RUN_LOCAL_DIR/" &
else
echo "File drivaer_$i.stl already exists, skipping download."
fi

# Check if the .vtp file exists before downloading
if [ ! -f "$RUN_LOCAL_DIR/boundary_$i.vtp" ]; then
aws s3 cp --no-sign-request "s3://$S3_BUCKET/$S3_PREFIX/$RUN_DIR/boundary_$i.vtp" "$RUN_LOCAL_DIR/" &
else
echo "File boundary_$i.vtp already exists, skipping download."
fi

    wait # Ensure that all three files for this run are downloaded before moving to the next run
}

# Loop through the run folders and download the files
for i in $(seq 1 500); do
download_run_files "$i" &

# Limit the number of parallel jobs to avoid overloading the system
if (( $(jobs -r | wc -l) >= 8 )); then
wait -n # Wait for the next background job to finish before starting a new one
fi
done

# Wait for all remaining background jobs to finish
wait
1 change: 1 addition & 0 deletions examples/cfd/xaeronet/requirements.txt
@@ -0,0 +1 @@
trimesh==4.5.0
91 changes: 91 additions & 0 deletions examples/cfd/xaeronet/surface/combine_stl_solids.py
@@ -0,0 +1,91 @@
# SPDX-FileCopyrightText: Copyright (c) 2023 - 2024 NVIDIA CORPORATION & AFFILIATES.
# SPDX-FileCopyrightText: All rights reserved.
# SPDX-License-Identifier: Apache-2.0
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

"""
This module provides functionality to convert STL files with multiple solids
to another STL file with a single combined solid. It includes support for
processing multiple files in parallel with progress tracking.
"""

import os
import trimesh
import hydra

from multiprocessing import Pool
from tqdm import tqdm
from hydra.utils import to_absolute_path
from omegaconf import DictConfig


def process_stl_file(task):
    """Combine all solids from one STL file into a single mesh and save it."""
    stl_path = task

# Load the STL file using trimesh
mesh = trimesh.load_mesh(stl_path)

# If the STL file contains multiple solids (as a Scene object)
if isinstance(mesh, trimesh.Scene):
# Extract all geometries (solids) from the scene
meshes = list(mesh.geometry.values())

# Combine all the solids into a single mesh
combined_mesh = trimesh.util.concatenate(meshes)
else:
# If it's a single solid, no need to combine
combined_mesh = mesh

# Prepare the output file path (next to the original file)
base_name, ext = os.path.splitext(stl_path)
output_file_path = to_absolute_path(f"{base_name}_single_solid{ext}")

# Save the new combined mesh as an STL file
combined_mesh.export(output_file_path)

return f"Processed: {stl_path} -> {output_file_path}"


def process_directory(data_path, num_workers=16):
"""Process all STL files in the given directory using multiprocessing with progress tracking."""
tasks = []
for root, _, files in os.walk(data_path):
stl_files = [f for f in files if f.endswith(".stl")]
for stl_file in stl_files:
stl_path = os.path.join(root, stl_file)

# Add the STL file to the tasks list (no need for output dir, saving next to the original)
tasks.append(stl_path)

# Use multiprocessing to process the tasks with progress tracking
with Pool(num_workers) as pool:
for _ in tqdm(
pool.imap_unordered(process_stl_file, tasks),
total=len(tasks),
desc="Processing STL Files",
unit="file",
):
pass


@hydra.main(version_base="1.3", config_path="conf", config_name="config")
def main(cfg: DictConfig) -> None:
# Process the directory with multiple STL files
process_directory(
to_absolute_path(cfg.data_path), num_workers=cfg.num_preprocess_workers
)


if __name__ == "__main__":
main()