v2.7.0
Intel® Optimizations for TensorFlow 2.7.0
This release of Intel® Optimized TensorFlow is based on the TensorFlow v2.7.0 tag and is built with support for oneDNN (oneAPI Deep Neural Network Library). For features and fixes introduced in TensorFlow 2.7.0, please also see the TensorFlow 2.7.0 release notes.
This release note covers both Intel® Optimizations for TensorFlow* and official TensorFlow v2.7.0 with oneDNN enabled (via setting the environment variable TF_ENABLE_ONEDNN_OPTS to 1).
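For official TensorFlow v2.7.0, the environment variable can also be set from within a Python script. The following is a minimal sketch, not part of the release itself: the helper name enable_onednn_opts is hypothetical, and setting the variable before TensorFlow is imported is assumed here to be the safe ordering.

```python
import os


def enable_onednn_opts(environ=os.environ):
    """Set TF_ENABLE_ONEDNN_OPTS to 1 in the given environment mapping.

    Hypothetical helper for illustration; only the variable name and the
    value come from the release note.
    """
    environ["TF_ENABLE_ONEDNN_OPTS"] = "1"
    return environ["TF_ENABLE_ONEDNN_OPTS"]


# Set the variable first, then import TensorFlow (assumed ordering).
enable_onednn_opts()
# import tensorflow as tf
```

On Linux, the equivalent shell form is to export TF_ENABLE_ONEDNN_OPTS=1 before launching the Python process.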
Major features:
• Please see the TensorFlow 2.7.0 release notes
• Supported platforms: Linux and Windows 10.
Improvements:
• Updated oneDNN to version 2.4.1
• Improved Bfloat16 performance for element-wise Eigen operations
• Improved the performance for TensorFlow Saved Models
• Added additional fusions, e.g. matmul-biasadd-gelu
• Marked nodeDef with '_kernel' attribute as NameChange label before inferring the device and inside WrapInCallOp
• Enabled simple heuristic-based tuning for innerproduct primitive
• Disabled rewriting of conv_grad ops to MKL when explicit padding is used
• Added sanity check for the corner case of "zero element of filter" in mkl_conv_ops.cc
• Enhanced PluggableDevice support
- Added DEVICE_DEFAULT for python ops
- Enabled the TensorList with DEVICE_DEFAULT including memzero, memset and memset32
- Fixed a segmentation fault in While on PluggableDevice
- Added DEVICE_DEFAULT for collective/bcast ops
- Added DEVICE_DEFAULT for session/transpose ops
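As a reference for the matmul-biasadd-gelu fusion mentioned in the list above, the unfused computation is a matrix multiply, followed by a bias add, followed by a GELU activation. The sketch below is plain Python written for illustration only. It does not use or reproduce the oneDNN kernel, the function names are hypothetical, and it assumes the exact (erf-based) GELU; which GELU variants the fusion matches is not stated in these notes.

```python
import math


def gelu(x):
    """Exact GELU: 0.5 * x * (1 + erf(x / sqrt(2)))."""
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))


def matmul_biasadd_gelu(x, w, b):
    """Reference for gelu(x @ w + b) using nested lists.

    x is [m][k], w is [k][n], b is [n]. Hypothetical helper that only
    spells out what a fused matmul-biasadd-gelu op would compute.
    """
    n = len(b)
    out = []
    for row in x:
        out_row = []
        for j in range(n):
            acc = sum(row[i] * w[i][j] for i in range(len(row)))  # matmul
            out_row.append(gelu(acc + b[j]))  # bias add, then GELU
        out.append(out_row)
    return out
```

A fusion of this kind lets the three steps run as a single primitive rather than as three separate ops with intermediate tensors.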
Bug fixes:
• Issues resolved in TensorFlow 2.7
• Issues resolved in oneDNN 2.4.1
• Updated curl to 7.79.1 to address CVE-2021-22947, CVE-2021-22946, and CVE-2021-22945
• Fixed all static analysis scan findings
• Fixed a bug in the Grappler pattern matcher, which did not consider nodes_to_preserve in the remapper use case
• Fixed tensorflow/python/framework/node_file_writer_test failure caused by op rewrite with different op name
• Fixed Xbyak-induced crashes on non-Intel systems
• Fixed missing-device unit test failures
Versions and components:
• Intel optimized TensorFlow based on TensorFlow v2.7.0: https://github.com/Intel-tensorflow/tensorflow/tree/v2.7.0
• TensorFlow v2.7.0: https://github.com/tensorflow/tensorflow/tree/v2.7.0
• oneDNN: https://github.com/oneapi-src/oneDNN/releases/tag/v2.4.1
• Model Zoo: https://github.com/IntelAI/models
Known issues:
• Open issues: open issues for oneDNN optimizations
• Bfloat16 is not guaranteed to work on AVX or AVX2
• Transformer-LT model can see performance degradation of up to 18% as compared to Intel TensorFlow v2.6.0
• Wide-and-Deep model can have performance degradation of up to 14% as compared to Intel TensorFlow v2.6.0
• On Windows, to use oneDNN-enabled TensorFlow, users need to run "set TF_ENABLE_ONEDNN_OPTS=1". Also, if the PC has hyper-threading enabled, users need to bind the ML application to one logical core per CPU to get the best runtime performance.
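The core binding described in the last item can be prepared by computing an affinity mask. The sketch below is an illustration under stated assumptions, not a procedure from these notes: it reads "one logical core per CPU" as one logical core per physical core, it assumes sibling hyper-threads are numbered adjacently (0 and 1 share a core, 2 and 3 share the next, and so on), and both function names are hypothetical. Check the actual topology of the machine before relying on it.

```python
def one_logical_core_per_physical(num_logical, threads_per_core=2):
    """Pick the first logical core of each physical core.

    Assumes sibling hyper-threads have adjacent IDs, which may not hold
    on every system.
    """
    return list(range(0, num_logical, threads_per_core))


def affinity_mask_hex(cores):
    """Build a hexadecimal affinity mask with one bit set per chosen core."""
    mask = 0
    for core in cores:
        mask |= 1 << core
    return format(mask, "X")


cores = one_logical_core_per_physical(8)
print(cores, affinity_mask_hex(cores))
```

With 8 logical cores and two threads per core, the selected cores are 0, 2, 4, and 6, giving the mask 55. On Windows, a mask like this can be passed to the cmd built-in start, for example: start /affinity 55 python train.py (the script name is a placeholder).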