Neuron SDK Release - June 14, 2023
What’s New
This release introduces Neuron Distributed, a new Python library that simplifies training and inference of large models. It also improves usability with features such as S3 model caching, a standalone profiler tool, and support for Ubuntu 22, along with other new features, performance optimizations, minor enhancements, and bug fixes. This release introduces the following:
What’s New | Details | Instances |
---|---|---|
New Features and Performance Enhancements in transformers-neuronx | Support for int8 inference; see the example at int8 weight storage support. Improved prompt context encoding performance; see the Transformers Neuron (transformers-neuronx) Developer Guide. Improved collective communications performance for tensor-parallel inference on Inf2 and Trn1; see the Transformers Neuron (transformers-neuronx) release notes. | Inf2, Trn1/Trn1n |
Neuron Profiler Tool | Support for a standalone tool to profile and get visualized insights into the execution of models on Trainium and Inferentia devices. See the Neuron Profile User Guide. | Inf1, Inf2, Trn1/Trn1n |
Neuron Compilation Cache through S3 | Support for sharing compiled models across Inf2 and Trn1 nodes through S3. See PyTorch Neuron neuron_parallel_compile CLI (torch-neuronx). | Inf2, Trn1/Trn1n |
New script to scan a model for supported/unsupported operators | Script to scan a model for supported and unsupported operators before training; the scan output lists supported and unsupported operators at both the XLA operator and PyTorch operator level. See a sample tutorial at Analyze for Training Tutorial. | Inf2, Trn1/Trn1n |
Neuron Distributed Library [Experimental] | New Python library based on PyTorch that enables distributed training and inference of large models, with initial support for tensor parallelism. See Neuron Distributed [Experimental]. | Inf2, Trn1/Trn1n |
Neuron Calculator and Documentation Updates | New Neuron Calculator documentation section to help determine the number of NeuronCores needed for LLM inference. Added the app note Generative LLM inference with Neuron. See Neuron Documentation Release Notes. | Inf1, Inf2, Trn1/Trn1n |
Enhancements to Neuron SysFS | Support for a detailed breakdown of memory usage across the NeuronCores. See the Neuron Sysfs User Guide. | Inf1, Inf2, Trn1/Trn1n |
Support for Ubuntu 22 | See the Setup Guide for setup instructions on Ubuntu 22. | Inf1, Inf2, Trn1/Trn1n |
Minor enhancements and bug fixes | See Neuron Components Release Notes. | Trn1/Trn1n, Inf2, Inf1 |
Release Artifacts | See Release Artifacts. | Trn1/Trn1n, Inf2, Inf1 |
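The Neuron Distributed library's initial feature, tensor parallelism, splits a layer's weights across workers so that each computes only a slice of the output. The sketch below is a conceptual illustration of a column-parallel linear layer, not the neuronx-distributed API: NumPy stands in for NeuronCores, and all names in it are hypothetical.

```python
# Conceptual sketch of tensor parallelism (column-parallel linear layer).
# NumPy stands in for the accelerator; this is NOT the neuronx-distributed API.
import numpy as np

rng = np.random.default_rng(0)

batch, d_in, d_out = 4, 8, 6
x = rng.normal(size=(batch, d_in))     # activations, replicated on every worker
W = rng.normal(size=(d_in, d_out))     # full weight matrix of one linear layer

# Unsharded reference result: y = x @ W
y_full = x @ W

# Tensor parallelism: split W column-wise across two "workers".
# Each worker stores only its shard and computes its slice of the output.
W_shards = np.split(W, 2, axis=1)      # two (d_in, d_out // 2) shards
y_shards = [x @ w for w in W_shards]   # independent local matmuls

# A collective (all-gather along the feature axis) reassembles the output.
y_tp = np.concatenate(y_shards, axis=1)

assert np.allclose(y_tp, y_full)
```

In a real deployment the shards live on separate NeuronCores and the concatenation is a collective communication step, which is why the improved collective performance noted above matters for tensor-parallel inference.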
For more detailed release notes of the new features and resolved issues, see Neuron Components Release Notes.
To learn about the model architectures currently supported on Inf1, Inf2, Trn1 and Trn1n instances, please see Model Architecture Fit Guidelines.