
Neuron SDK Release - May 1, 2023

@awsjoshir awsjoshir released this 02 May 14:07
· 149 commits to master since this release
f5d2d79

What’s New

This release introduces new features, performance optimizations, minor enhancements, and bug fixes, as summarized in the following table:

| What’s New | Details | Instances |
| --- | --- | --- |
| Initial support for computer vision model inference | Added a Stable Diffusion 2.1 model script for text-to-image generation, a VGG model script for image classification, and a UNet model script for image segmentation. See the aws-neuron-samples repository. | Inf2, Trn1/Trn1n |
| Profiling support in PyTorch Neuron (torch-neuronx) for inference with TensorBoard | See Profiling PyTorch Neuron (torch-neuronx) with TensorBoard. | Inf2, Trn1/Trn1n |
| New features and performance enhancements in transformers-neuronx | Support for the HuggingFace generate function; model serialization support, including model saving, loading, and weight swapping; improved prompt context encoding performance. See transformers_neuronx_readme for examples and usage, and the Transformers Neuron (transformers-neuronx) release notes for more. | Inf2, Trn1/Trn1n |
| Support for models larger than 2GB in TensorFlow 2.x Neuron (tensorflow-neuronx) | See Special Flags (tensorflow-neuronx) for details. | Trn1/Trn1n, Inf2 |
| Support for models larger than 2GB in TensorFlow 2.x Neuron (tensorflow-neuron) | See Special Flags (tensorflow-neuron) for details. | Inf1 |
| Performance enhancements in PyTorch C++ Custom Operators [Experimental] | Support for using multiple GPSIMD cores in custom C++ operators. See Custom Operators (Experimental). | Trn1/Trn1n |
| Weight Deduplication Feature (Inf1) | Support for sharing weights when loading multiple instances of the same model on different NeuronCores. See Neuron Runtime Configuration. | Inf1 |
| nccom-test - Collective Communication Benchmarking Tool | Supports benchmarking sweeps on various Neuron Collective Communication operations. See NCCOM-TEST (Beta) for more details. | Trn1/Trn1n, Inf2 |
| Minor enhancements and bug fixes | See Neuron Components Release Notes. | Trn1/Trn1n, Inf2, Inf1 |
| Release Artifacts | See Release Artifacts. | Trn1/Trn1n, Inf2, Inf1 |
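
The idea behind the weight deduplication entry above can be pictured with a short, purely conceptual sketch. It does not use any Neuron API and is not how the Neuron Runtime implements the feature; `WeightStore`, `ModelInstance`, and `load_on_cores` are hypothetical names introduced only for illustration. The sketch assumes weights arrive as raw bytes and keeps a single shared copy per distinct content hash, so that several instances of the same model reference one buffer.

```python
import hashlib


class WeightStore:
    """Hypothetical content-addressed cache (illustration only, not a Neuron API)."""

    def __init__(self):
        self._by_digest = {}

    def intern(self, blob: bytes) -> bytes:
        # Return the first buffer seen with this content; later duplicates are dropped.
        digest = hashlib.sha256(blob).hexdigest()
        return self._by_digest.setdefault(digest, blob)

    def unique_bytes(self) -> int:
        # Total size of the distinct weight buffers actually kept.
        return sum(len(b) for b in self._by_digest.values())


class ModelInstance:
    """Hypothetical stand-in for one loaded copy of a model on one core."""

    def __init__(self, core_id: int, weights: bytes):
        self.core_id = core_id
        self.weights = weights


def load_on_cores(store, read_weights, core_ids):
    # read_weights() is assumed to return a fresh copy of the weights on each call,
    # as a naive loader would; the store collapses identical copies into one.
    return [ModelInstance(core, store.intern(read_weights())) for core in core_ids]


store = WeightStore()
instances = load_on_cores(store, lambda: bytes(bytearray(1024)), core_ids=[0, 1, 2, 3])
```

In this toy setup all four instances end up holding the same buffer object, so the store retains 1024 bytes instead of 4096. For the actual runtime settings, refer to Neuron Runtime Configuration.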

For more detailed release notes of the new features and resolved issues, see Neuron Components Release Notes.

To learn about the model architectures currently supported on Inf1, Inf2, Trn1 and Trn1n instances, please see Model Architecture Fit Guidelines.