Neuron SDK Release - May 1, 2023
What’s New
This release introduces new features, performance optimizations, minor enhancements, and bug fixes, including the following:
What’s New | Details | Instances |
---|---|---|
Initial support for computer vision model inference | Added a Stable Diffusion 2.1 model script for text-to-image generation, a VGG model script for image classification, and a UNet model script for image segmentation. See the aws-neuron-samples repository. | Inf2, Trn1/Trn1n |
Profiling support in PyTorch Neuron (torch-neuronx) for inference with TensorBoard | See Profiling PyTorch Neuron (torch-neuronx) with TensorBoard. | Inf2, Trn1/Trn1n |
New features and performance enhancements in transformers-neuronx | Support for the HuggingFace generate function; model serialization support, including model saving, loading, and weight swapping; improved prompt context encoding performance. See transformers_neuronx_readme for examples and usage, and the Transformers Neuron (transformers-neuronx) release notes for more details. | Inf2, Trn1/Trn1n |
Support for models larger than 2 GB in TensorFlow 2.x Neuron (tensorflow-neuronx) | See Special Flags (tensorflow-neuronx) for details. | Trn1/Trn1n, Inf2 |
Support for models larger than 2 GB in TensorFlow 2.x Neuron (tensorflow-neuron) | See Special Flags (tensorflow-neuron) for details. | Inf1 |
Performance enhancements in PyTorch C++ Custom Operators [Experimental] | Support for using multiple GPSIMD cores in custom C++ operators. See Custom Operators (Experimental). | Trn1/Trn1n |
Weight Deduplication Feature (Inf1) | Support for sharing weights when loading multiple instances of the same model on different NeuronCores. See Neuron Runtime Configuration for more details. | Inf1 |
nccom-test - Collective Communication Benchmarking Tool | Supports benchmarking sweeps on various Neuron Collective Communication operations. See NCCOM-TEST (Beta) for more details. | Trn1/Trn1n, Inf2 |
Minor enhancements and bug fixes | See Neuron Components Release Notes. | Trn1/Trn1n, Inf2, Inf1 |
Release Artifacts | See Release Artifacts. | Trn1/Trn1n, Inf2, Inf1 |
For more detailed release notes of the new features and resolved issues, see Neuron Components Release Notes.
To learn about the model architectures currently supported on Inf1, Inf2, Trn1, and Trn1n instances, see Model Architecture Fit Guidelines.