Commit deede6e — "Improved md formatting" (causten, Dec 11, 2024)
Showing 1 changed file with 66 additions and 69 deletions: CHANGELOG.md
Full documentation for MIGraphX is available at

## MIGraphX 2.11 for ROCm 6.3.0

### Added

* Initial code to run on Windows
* Support for gfx120x GPU
* Support for FP8 and INT4
* Support for the Log2 internal operator
* Support for the GCC 14 compiler
* The BitwiseAnd, Scan, SoftmaxCrossEntropyLoss, GridSample, and NegativeLogLikelihoodLoss ONNX operators
* The MatMulNBits, QuantizeLinear/DequantizeLinear, GroupQueryAttention, SkipSimplifiedLayerNormalization, and SimplifiedLayerNormalization Microsoft Contrib operators
* Dynamic batch parameter support for the OneHot operator
* Split-K as an optional performance improvement
* Scripts to validate ONNX models from the ONNX Model Zoo
* GPU Pooling Kernel
* `--mlir` flag for the migraphx-driver program to offload an entire module to MLIR
* Fusing split-reduce with MLIR
* Multiple outputs for the MLIR + Pointwise fusions
* Pointwise fusions with MLIR across reshape operations
* MIGRAPHX_MLIR_DUMP environment variable to dump MLIR modules to MXRs
* The MIGRAPHX_TRACE_BENCHMARKING=3 option to print the MLIR program for improved debug output
* MIGRAPHX_ENABLE_HIPBLASLT_GEMM environment variable to call hipBlasLt libraries
* MIGRAPHX_VERIFY_DUMP_DIFF to improve the debugging of accuracy issues
* reduce_any and reduce_all options to the Reduce operation via Torch MIGraphX
* Examples for RNNT and ControlNet
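One item in the list above, Split-K, is an algorithmic technique: the shared K (reduction) dimension of a GEMM is partitioned so partial products can be computed concurrently and summed afterwards. The following pure-Python sketch illustrates only the idea; it is not MIGraphX's GPU implementation, and the helper names are made up for this example.

```python
def matmul(a, b):
    """Plain row-by-column matrix multiply: C[i][j] = sum_k A[i][k] * B[k][j]."""
    rows, inner, cols = len(a), len(b), len(b[0])
    return [[sum(a[i][k] * b[k][j] for k in range(inner)) for j in range(cols)]
            for i in range(rows)]

def matmul_split_k(a, b, splits):
    """Split the K dimension into chunks, compute one partial GEMM per chunk
    (on a GPU these could run on separate workgroups), then sum the partials."""
    k = len(b)
    chunk = (k + splits - 1) // splits
    partials = []
    for s in range(splits):
        lo, hi = s * chunk, min((s + 1) * chunk, k)
        if lo >= hi:  # more splits requested than K elements; skip empty chunks
            break
        a_part = [row[lo:hi] for row in a]
        b_part = b[lo:hi]
        partials.append(matmul(a_part, b_part))
    # Final element-wise reduction of the partial products.
    rows, cols = len(a), len(b[0])
    return [[sum(p[i][j] for p in partials) for j in range(cols)]
            for i in range(rows)]
```

The extra element-wise reduction is the cost Split-K pays for exposing more parallelism across the K dimension, which is why it is offered as an optional tuning choice rather than a default.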


### Changed

* Switched to MLIR's 3D Convolution operator.
* MLIR is now used for Attention operations by default on gfx942 and newer ASICs.
* Names and locations for VRM specific libraries have changed.
* Use random mode for benchmarking GEMMs and convolutions.
* Python version is now printed with an actual version number.


### Removed

* Disabled requirements for MIOpen and rocBlas when running on Windows.
* Removed inaccurate warning messages when using exhaustive-tune.
* Removed the hard-coded path in MIGRAPHX_CXX_COMPILER, allowing the compiler to be installed in different locations.

### Optimized

* Improved:
  * Infrastructure code to enable better kernel fusions with all supported data types
  * Subsequent model compile time by creating a cache for already performant kernels
  * Use of Attention fusion with models
  * Performance of the Softmax JIT kernel and of the Pooling operator
  * Tuning operations through a new 50 ms delay before running the next kernel
  * Performance of several convolution-based models through an optimized NHWC layout
  * Performance for the FP8 datatype
  * GPU utilization
  * Verification tools
  * Debug prints
  * Documentation, including gpu-driver utility documentation
  * Summary section of the `migraphx-driver perf` command
* Reduced model compilation time
* Reordered some compiler passes to allow for more fusions
* Preloaded tiles into LDS to improve performance of pointwise transposes
* Exposed the external_data_path property in onnx_options to set the path from onnxruntime
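The LDS preloading item above uses a standard GPU transpose trick: stage a small tile in fast local storage, then write it back transposed, so both the reads and the writes touch contiguous memory. A pure-Python sketch of the tiling pattern follows; the staging dict only stands in conceptually for workgroup LDS, and the function name is invented for this example.

```python
def transpose_tiled(matrix, tile=32):
    """Transpose a 2-D list-of-lists one tile at a time."""
    rows, cols = len(matrix), len(matrix[0])
    out = [[0] * rows for _ in range(cols)]
    for r0 in range(0, rows, tile):
        for c0 in range(0, cols, tile):
            # Stage one tile (on a GPU, this copy would land in LDS).
            staged = {(r, c): matrix[r][c]
                      for r in range(r0, min(r0 + tile, rows))
                      for c in range(c0, min(c0 + tile, cols))}
            # Drain the tile transposed: writes stay within one output tile.
            for (r, c), v in staged.items():
                out[c][r] = v
    return out
```

On hardware the payoff is memory coalescing; in this scalar sketch the two loops are equivalent to a direct transpose, which makes the pattern easy to verify.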


### Resolved Issues


* Fixed a bug with gfx1030 that overwrote dpp_reduce.
* Fixed a bug in 1arg dynamic reshape that created a failure.
* Fixed a bug with dot_broadcast and inner_broadcast that caused compile failures.
* Fixed a bug where some configs were failing when using exhaustive-tune.
* Fixed the ROCM Install Guide URL.
* Fixed an issue while building a whl package due to an apostrophe.
* Fixed the BERT Squad example requirements file to support different versions of Python.
* Fixed a bug that stopped the Vicuna model from compiling.
* Fixed failures with the verify option of migraphx-driver that would cause the application to exit early.
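For context on the first fix: dpp_reduce is the cross-lane wavefront reduction built on AMD's data-parallel primitives (DPP), which exchange values between lanes in log2(width) butterfly steps. A scalar sketch of that pattern, assuming a power-of-two lane count (illustrative only; the function name is invented and the real kernel operates on hardware lanes, not a list):

```python
def butterfly_reduce(lanes):
    """Sum per-lane values the way a cross-lane XOR butterfly does:
    at each step, every lane adds the value held by its partner `offset` away,
    so after log2(n) steps every lane holds the full sum."""
    vals = list(lanes)
    n = len(vals)
    assert n and n & (n - 1) == 0, "lane count must be a power of two"
    offset = 1
    while offset < n:
        vals = [vals[i] + vals[i ^ offset] for i in range(n)]
        offset *= 2
    return vals[0]
```

Because every lane ends up with the result, a bug that overwrites a lane's staged value mid-exchange, as described for gfx1030, corrupts the whole reduction, which is why such fixes matter for correctness rather than just performance.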


## MIGraphX 2.10 for ROCm 6.2.0