Neuron SDK Release - September 15, 2023
What’s New
This release introduces support for Llama-2-7B model training and T5-3B model inference using neuronx-distributed. It also adds support for Llama-2-13B model training using neuronx-nemo-megatron. Neuron 2.14 also adds support for Stable Diffusion XL (Refiner and Base) model inference using torch-neuronx. This release also introduces other new features, performance optimizations, minor enhancements, and bug fixes.
This release introduces the following:
Note
This release deprecates the --model-type=transformer-inference compiler flag. Users are highly encouraged to migrate to the --model-type=transformer compiler flag; a minimal migration sketch is shown below.
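As a hedged sketch of the migration, assuming (as in the transformers-neuronx samples) that compiler flags are passed through the NEURON_CC_FLAGS environment variable before compilation:

```python
import os

# Deprecated as of this release:
# os.environ["NEURON_CC_FLAGS"] = "--model-type=transformer-inference"

# Migrate to the new flag. Set it before the model is compiled, e.g. before
# calling to_neuron() on a transformers-neuronx model.
os.environ["NEURON_CC_FLAGS"] = "--model-type=transformer"
```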
What’s New | Details | Instances |
---|---|---|
AWS Neuron Reference for Nemo Megatron library (neuronx-nemo-megatron) | Llama-2-13B model training support (tutorial). ZeRO-1 optimizer support that works with tensor parallelism and pipeline parallelism. See more at AWS Neuron Reference for Nemo Megatron (neuronx-nemo-megatron) Release Notes and the neuronx-nemo-megatron GitHub repo. | Trn1/Trn1n |
Neuron Distributed (neuronx-distributed) for Training | Llama-2-7B model training support (sample script) (tutorial). pad_model API to pad attention heads whose count does not divide evenly by the number of NeuronCores, allowing users to use any supported tensor-parallel degree; see the API Reference Guide (neuronx-distributed) and the sketch after this table. See more at Neuron Distributed Release Notes (neuronx-distributed). | Trn1/Trn1n |
Neuron Distributed (neuronx-distributed) for Inference | T5-3B model inference support (tutorial). pad_model API to pad attention heads whose count does not divide evenly by the number of NeuronCores, allowing users to use any supported tensor-parallel degree; see the API Reference Guide (neuronx-distributed) and the sketch after this table. See more at Neuron Distributed Release Notes (neuronx-distributed). | Inf2, Trn1/Trn1n |
Transformers Neuron (transformers-neuronx) for Inference | Introduces the --model-type=transformer compiler flag, which deprecates the --model-type=transformer-inference compiler flag. See more at Transformers Neuron (transformers-neuronx) release notes. | Inf2, Trn1/Trn1n |
PyTorch Neuron (torch-neuronx) | Performance optimizations in the torch_neuronx.analyze API; see PyTorch Neuron (torch-neuronx) Analyze API for Inference and the sketch after this table. Stable Diffusion XL (Refiner and Base) model inference support (sample script). | Trn1/Trn1n, Inf2 |
Neuron Compiler (neuronx-cc) | New --O compiler option that enables different optimizations, trading off faster model compile time against faster model execution (see the sketch after this table). See more at the Neuron Compiler CLI Reference Guide (neuronx-cc) and Neuron Compiler (neuronx-cc) release notes. | Inf2, Trn1/Trn1n |
Neuron Tools | Neuron SysFS support for showing connected devices on trn1.32xl, inf2.24xl, and inf2.48xl instances; see the Neuron Sysfs User Guide and the sketch after this table. See more at Neuron System Tools. | Inf1, Inf2, Trn1/Trn1n |
Documentation Updates | Neuron Calculator now supports multiple model configurations for tensor-parallel-degree computation; see Neuron Calculator. Announcement of the deprecation of the --model-type=transformer-inference compiler flag; see Announcing deprecation for --model-type=transformer-inference compiler flag. See more at Neuron Documentation Release Notes. | Inf1, Inf2, Trn1/Trn1n |
Minor enhancements and bug fixes | See Neuron Components Release Notes. | Trn1/Trn1n, Inf2, Inf1 |
Release Artifacts | See Release Artifacts. | Trn1/Trn1n, Inf2, Inf1 |
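For the pad_model API referenced in the neuronx-distributed rows above, a minimal sketch follows. The import path and signature shown are assumptions based on the API Reference Guide (neuronx-distributed), and build_model is a hypothetical helper; consult the guide for the authoritative definition.

```python
# Minimal sketch of attention-head padding for tensor parallelism. The import
# path and signature below are assumptions; see the neuronx-distributed
# API Reference Guide for the exact definition.
from neuronx_distributed.parallel_layers.pad import pad_model

TP_DEGREE = 8   # desired tensor-parallel degree
N_HEADS = 12    # attention heads in the model; 12 is not divisible by 8

model = build_model()  # hypothetical helper returning a transformer model
model = pad_model(model, tp_degree=TP_DEGREE, n_heads=N_HEADS)
# After padding, the model can be sharded at tensor-parallel degree 8 even
# though its original head count is not a multiple of that degree.
```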
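The torch_neuronx.analyze API named in the PyTorch Neuron row reports which operators in a model are supported on Neuron; this release optimizes its performance. A minimal sketch, using a toy model purely for illustration:

```python
import torch
import torch_neuronx

# Toy model purely for illustration.
model = torch.nn.Sequential(torch.nn.Linear(4, 4), torch.nn.GELU()).eval()
example_input = torch.rand(1, 4)

# analyze() traces the model and reports operator support on Neuron.
report = torch_neuronx.analyze(model, example_input)
print(report)
```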
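The new --O compiler option can be forwarded from torch-neuronx through the compiler_args parameter of torch_neuronx.trace. The level spelling below ("--O1") is an assumption; see the Neuron Compiler CLI Reference Guide (neuronx-cc) for the accepted values and their compile-time versus execution-speed tradeoffs.

```python
import torch
import torch_neuronx

model = torch.nn.Linear(4, 4).eval()
example_input = torch.rand(1, 4)

# Forward the new optimization-level option to neuronx-cc. "--O1" is an
# assumed spelling for a level that favors compile time; check the Neuron
# Compiler CLI Reference Guide for the actual accepted values.
trace = torch_neuronx.trace(model, example_input, compiler_args=["--O1"])
```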
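For the new SysFS connected-devices listing in the Neuron Tools row, the node can be read like any other sysfs file. The path below is an assumption based on the general layout in the Neuron Sysfs User Guide and may differ by driver version.

```python
from pathlib import Path

# Assumed sysfs location for the connected-devices node of device 0; the
# authoritative path is documented in the Neuron Sysfs User Guide.
node = Path("/sys/devices/virtual/neuron_device/neuron0/connected_devices")
if node.exists():
    print(node.read_text().strip())  # IDs of devices connected to neuron0
```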
For more detailed release notes of the new features and resolved issues, see Neuron Components Release Notes.
To learn about the model architectures currently supported on Inf1, Inf2, Trn1 and Trn1n instances, please see Model Architecture Fit Guidelines.