Neuron SDK Release - February 13, 2024
What's New
The Neuron 2.17 release improves the performance of small collective communication operators (smaller than 16 MB) by up to 30%, which in turn improves large language model (LLM) inference performance by up to 10%. This release also includes improvements to :ref:`Neuron Profiler <neuron-profile-ug>`, along with other minor enhancements and bug fixes.
For detailed release notes covering the new features and resolved issues, see :ref:`components-rn`.
To learn about the model architectures currently supported on Inf1, Inf2, Trn1, and Trn1n instances, see :ref:`model_architecture_fit`.
Neuron Components Release Notes
Inf1, Trn1/Trn1n and Inf2 common packages
Component | Instances | Packages | Details |
---|---|---|---|
Neuron Runtime | Trn1/Trn1n, Inf1, Inf2 | Trn1/Trn1n: aws-neuronx-runtime-lib (.deb, .rpm); Inf1: runtime is linked into the ML framework packages | :ref:`neuron-runtime-rn` |
Neuron Runtime Driver | Trn1/Trn1n, Inf1, Inf2 | aws-neuronx-dkms (.deb, .rpm) | :ref:`neuron-driver-release-notes` |
Neuron System Tools | Trn1/Trn1n, Inf1, Inf2 | aws-neuronx-tools (.deb, .rpm) | :ref:`neuron-tools-rn` |
Containers | Trn1/Trn1n, Inf1, Inf2 | aws-neuronx-k8-plugin (.deb, .rpm), aws-neuronx-k8-scheduler (.deb, .rpm), aws-neuronx-oci-hooks (.deb, .rpm) | :ref:`neuron-k8-rn`, :ref:`neuron-containers-release-notes` |
NeuronPerf (Inference only) | Trn1/Trn1n, Inf1, Inf2 | neuronperf (.whl) | :ref:`neuronperf_rn` |
TensorFlow Model Server Neuron | Trn1/Trn1n, Inf1, Inf2 | tensorflow-model-server-neuronx (.deb, .rpm) | :ref:`tensorflow-modeslserver-neuronx-rn` |
Neuron Documentation | Trn1/Trn1n, Inf1, Inf2 | | :ref:`neuron-documentation-rn` |