Update XNNPACK to latest version (#18038)
### Description

Update XNNPACK to latest version
- adds fp16 kernels and various other improvements
- requires pthreadpool update as well

Most code updates in the XNNPACK EP are to adjust to the new XNNPACK API
- 'setup' is split into 'reshape' and 'setup'
- some ops use a workspace buffer
  - copied workspace allocation from XNNPACK unit test code
- some suffixes changed

Added wrapper for XNNPACK caches to base XNNPACK EP kernel
- simplifies usage
- XNNPACK split out the code and weights caches, but the code cache isn't currently usable via the public API
  - we could use the internal types if we think it's required for performance reasons. non-trivial though as we'd need to propagate ifdef values from the XNNPACK build up to the ORT build.
  - using XNNPACK internals would also mean we would not be able to support using a pre-built XNNPACK package
    - not an issue currently

Fixed opset registration for internal NHWC domain
- was not being tied to the ONNX version, so nodes inserted by layout transformation had the incorrect opset
- a number of other places needed updating once this issue was fixed

Remove support for NCHW Resize from XNNPACK EP so it's NHWC only
- we only supported NCHW for fp32
- doing so adds complexity in multiple places (XNNPACK EP kernel implementation, layout transformation and transpose optimization)
- unclear if that complexity provides any benefit. can add back if required by production scenario

### Motivation and Context

We're looking at enabling fp16 support for CoreML and NNAPI. If we do that we need a good fallback story if the CPU EP will be used. The XNNPACK fp16 kernels will hopefully provide that.

NOTE: This PR doesn't add fp16 support to the XNNPACK EP kernels. That can be done as required in separate EPs and should be relatively simple to do.
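The 'reshape' then 'setup' split and the workspace buffer mentioned in the description can be pictured with a small sketch. Everything below is hypothetical: `SketchOp`, `Reshape`, `Setup`, `Run` and `AlignedWorkspace` are illustrative stand-ins, not XNNPACK or ORT functions, and the alignment value is arbitrary. It only shows the general shape of the pattern, where the shape-dependent phase reports how much scratch memory is needed and the caller allocates it before binding buffers.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for an operator object. Not a real XNNPACK type.
struct SketchOp {
  size_t channels = 0;
  size_t batch = 0;
  bool reshaped = false;
  float* workspace = nullptr;
  const float* input = nullptr;
  float* output = nullptr;
};

// Phase 1 ("reshape"): shape-dependent work only. Reports how much scratch
// memory the caller must provide, and with what alignment.
bool Reshape(SketchOp& op, size_t batch, size_t* workspace_size,
             size_t* workspace_alignment) {
  op.batch = batch;
  op.reshaped = true;
  *workspace_size = batch * op.channels * sizeof(float);
  *workspace_alignment = 64;  // arbitrary value chosen for this sketch
  return true;
}

// Phase 2 ("setup"): binds the buffers. Fails if reshape has not run yet.
bool Setup(SketchOp& op, void* workspace, const float* input, float* output) {
  if (!op.reshaped) return false;
  op.workspace = static_cast<float*>(workspace);
  op.input = input;
  op.output = output;
  return true;
}

// Runs the op: stages doubled inputs in the workspace, then copies them out.
void Run(const SketchOp& op) {
  assert(op.workspace != nullptr && op.input != nullptr && op.output != nullptr);
  const size_t n = op.batch * op.channels;
  for (size_t i = 0; i < n; ++i) op.workspace[i] = 2.0f * op.input[i];
  for (size_t i = 0; i < n; ++i) op.output[i] = op.workspace[i];
}

// Over-allocates by `alignment` bytes and returns an aligned pointer into
// the caller-owned storage.
void* AlignedWorkspace(std::vector<uint8_t>& storage, size_t size,
                       size_t alignment) {
  storage.resize(size + alignment);
  const auto addr = reinterpret_cast<uintptr_t>(storage.data());
  const uintptr_t aligned = (addr + alignment - 1) / alignment * alignment;
  return reinterpret_cast<void*>(aligned);
}
```

A caller would invoke `Reshape` whenever the input shape changes, size the workspace from the reported values, and then call `Setup` and `Run`. Keeping the storage in the caller lets it be reused across runs when the required size does not grow.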
1 parent e36d003, commit 4f2096b. Showing 45 changed files with 794 additions and 517 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,66 +1,27 @@ | ||
diff --git a/CMakeLists.txt b/CMakeLists.txt | ||
index d53c48aa1..77c3cf983 100755 | ||
index dba9b4687..bcaa18ad7 100755 | ||
--- a/CMakeLists.txt | ||
+++ b/CMakeLists.txt | ||
@@ -105,22 +105,12 @@ ENDIF() | ||
 | ||
@@ -122,7 +122,7 @@ ENDIF() | ||
# ---[ Build flags | ||
IF(NOT CMAKE_SYSTEM_NAME) | ||
MESSAGE(FATAL_ERROR "CMAKE_SYSTEM_NAME not defined") | ||
-ELSEIF(NOT CMAKE_SYSTEM_NAME MATCHES "^(Darwin|Linux|Android|Windows|CYGWIN|MSYS)$") | ||
+ELSEIF(NOT CMAKE_SYSTEM_NAME MATCHES "^(Darwin|Linux|Android|Windows|CYGWIN|MSYS|Emscripten|iOS)$") | ||
MESSAGE(FATAL_ERROR "Unrecognized CMAKE_SYSTEM_NAME = ${CMAKE_SYSTEM_NAME}") | ||
-ELSEIF(NOT CMAKE_SYSTEM_NAME MATCHES "^(Android|Darwin|iOS|Linux|Windows|CYGWIN|MSYS|QURT)$") | ||
+ELSEIF(NOT CMAKE_SYSTEM_NAME MATCHES "^(Android|Darwin|iOS|Linux|Windows|CYGWIN|MSYS|QURT|Emscripten|iOS)$") | ||
MESSAGE(FATAL_ERROR "Unrecognized CMAKE_SYSTEM_NAME value \"${CMAKE_SYSTEM_NAME}\"") | ||
ENDIF() | ||
 | ||
# ---[ Download deps | ||
IF(NOT XNNPACK_USE_SYSTEM_LIBS) | ||
- IF(NOT DEFINED CLOG_SOURCE_DIR) | ||
- MESSAGE(STATUS "Downloading clog to ${CMAKE_BINARY_DIR}/clog-source (define CLOG_SOURCE_DIR to avoid it)") | ||
- CONFIGURE_FILE(cmake/DownloadCLog.cmake "${CMAKE_BINARY_DIR}/clog-download/CMakeLists.txt") | ||
- EXECUTE_PROCESS(COMMAND "${CMAKE_COMMAND}" -G "${CMAKE_GENERATOR}" . | ||
- WORKING_DIRECTORY "${CMAKE_BINARY_DIR}/clog-download") | ||
- EXECUTE_PROCESS(COMMAND "${CMAKE_COMMAND}" --build . | ||
- WORKING_DIRECTORY "${CMAKE_BINARY_DIR}/clog-download") | ||
- SET(CLOG_SOURCE_DIR "${CMAKE_BINARY_DIR}/clog-source" CACHE STRING "clog source directory") | ||
- ENDIF() | ||
- | ||
IF(NOT DEFINED CPUINFO_SOURCE_DIR) | ||
MESSAGE(STATUS "Downloading cpuinfo to ${CMAKE_BINARY_DIR}/cpuinfo-source (define CPUINFO_SOURCE_DIR to avoid it)") | ||
CONFIGURE_FILE(cmake/DownloadCpuinfo.cmake "${CMAKE_BINARY_DIR}/cpuinfo-download/CMakeLists.txt") | ||
@@ -7108,6 +7098,10 @@ IF(MSVC) | ||
SET_PROPERTY(SOURCE ${ALL_MICROKERNEL_SRCS} APPEND_STRING PROPERTY COMPILE_FLAGS "$<$<NOT:$<CONFIG:Debug>>: /O2 >") | ||
SET_PROPERTY(SOURCE ${HOT_SRCS} APPEND_STRING PROPERTY COMPILE_FLAGS "$<$<NOT:$<CONFIG:Debug>>: /O2 >") | ||
SET_PROPERTY(SOURCE ${COLD_SRCS} APPEND_STRING PROPERTY COMPILE_FLAGS "$<$<NOT:$<CONFIG:Debug>>: /O1 >") | ||
+ELSEIF(CMAKE_GENERATOR STREQUAL Xcode) | ||
+ TARGET_COMPILE_OPTIONS(all_microkernels PRIVATE $<$<NOT:$<CONFIG:Debug>>: -O2 >) | ||
+ TARGET_COMPILE_OPTIONS(XNNPACK PRIVATE $<$<NOT:$<CONFIG:Debug>>: -O2 >) | ||
+ TARGET_COMPILE_OPTIONS(XNNPACK PRIVATE $<$<NOT:$<CONFIG:Debug>>: -Os >) | ||
ELSE() | ||
SET_PROPERTY(SOURCE ${ALL_MICROKERNEL_SRCS} APPEND_STRING PROPERTY COMPILE_FLAGS "$<$<NOT:$<CONFIG:Debug>>: -O2 >") | ||
SET_PROPERTY(SOURCE ${HOT_SRCS} APPEND_STRING PROPERTY COMPILE_FLAGS "$<$<NOT:$<CONFIG:Debug>>: -O2 >") | ||
@@ -7142,26 +7136,6 @@ IF(LIBM) | ||
TARGET_LINK_LIBRARIES(indirection PRIVATE ${LIBM}) | ||
IF(CMAKE_SYSTEM_NAME MATCHES "Windows") | ||
@@ -534,7 +534,12 @@ IF(XNNPACK_BUILD_LIBRARY) | ||
TARGET_LINK_LIBRARIES(operator-utils PRIVATE logging) | ||
TARGET_LINK_LIBRARIES(post-operation PRIVATE logging) | ||
TARGET_LINK_LIBRARIES(subgraph PRIVATE allocator logging memory mutex operators operator-run) | ||
- TARGET_LINK_LIBRARIES(XNNPACK PRIVATE allocator cache hardware-config indirection jit logging memory microkernel-utils microparams-init mutex normalization operators operator-run operator-utils packing post-operation microkernels-prod subgraph) | ||
+ IF(CMAKE_SYSTEM_NAME STREQUAL "Emscripten") | ||
+ # omit microkernels-prod as the list is manually created by ORT in cmake/external/xnnpack.cmake | ||
+ TARGET_LINK_LIBRARIES(XNNPACK PRIVATE allocator cache hardware-config indirection jit logging memory microkernel-utils microparams-init mutex normalization operators operator-run operator-utils packing post-operation subgraph) | ||
+ ELSE() | ||
+ TARGET_LINK_LIBRARIES(XNNPACK PRIVATE allocator cache hardware-config indirection jit logging memory microkernel-utils microparams-init mutex normalization operators operator-run operator-utils packing post-operation microkernels-prod subgraph) | ||
+ ENDIF() | ||
SET_TARGET_PROPERTIES(XNNPACK PROPERTIES C_EXTENSIONS YES) | ||
ENDIF() | ||
 | ||
-# ---[ Configure clog | ||
-IF(NOT TARGET clog) | ||
- IF(NOT XNNPACK_USE_SYSTEM_LIBS) | ||
- SET(CLOG_BUILD_TESTS OFF CACHE BOOL "") | ||
- SET(CLOG_RUNTIME_TYPE "${CPUINFO_RUNTIME_TYPE}" CACHE STRING "") | ||
- ADD_SUBDIRECTORY( | ||
- "${CLOG_SOURCE_DIR}/deps/clog" | ||
- "${CMAKE_BINARY_DIR}/clog") | ||
- # We build static version of clog but a dynamic library may indirectly depend on it | ||
- SET_PROPERTY(TARGET clog PROPERTY POSITION_INDEPENDENT_CODE ON) | ||
- ELSE() | ||
- ADD_LIBRARY(clog STATIC IMPORTED) | ||
- FIND_LIBRARY(CLOG_LIBRARY clog) | ||
- IF(NOT CLOG_LIBRARY) | ||
- MESSAGE(FATAL_ERROR "Cannot find clog") | ||
- ENDIF() | ||
- SET_PROPERTY(TARGET clog PROPERTY IMPORTED_LOCATION "${CLOG_LIBRARY}") | ||
- ENDIF() | ||
-ENDIF() | ||
- | ||
# ---[ Configure cpuinfo | ||
IF(NOT TARGET cpuinfo) | ||
IF(NOT XNNPACK_USE_SYSTEM_LIBS) | ||
IF(NOT MSVC) |
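
Two decisions in the patch above are easy to check in isolation: which `CMAKE_SYSTEM_NAME` values the anchored pattern accepts, and which libraries `XNNPACK` links on Emscripten. The sketch below mirrors both in C++ as an approximation only. `std::regex` (ECMAScript grammar) stands in for CMake's `MATCHES`, and the function names are hypothetical. The pattern string and the library list are copied from the `+` lines of the diff. The pattern lists `iOS` twice, which looks redundant but does not change what it matches.

```cpp
#include <algorithm>
#include <regex>
#include <string>
#include <vector>

// Approximates the CMake check `CMAKE_SYSTEM_NAME MATCHES "^(...)$"`.
// The anchors mean the whole name must equal one of the alternatives.
bool IsRecognizedSystemName(const std::string& name) {
  static const std::regex kPattern(
      "^(Android|Darwin|iOS|Linux|Windows|CYGWIN|MSYS|QURT|Emscripten|iOS)$");
  return std::regex_search(name, kPattern);
}

// Mirrors the IF/ELSE around TARGET_LINK_LIBRARIES(XNNPACK ...): on
// Emscripten, microkernels-prod is omitted from the link list.
std::vector<std::string> XnnpackLinkLibraries(const std::string& system_name) {
  std::vector<std::string> libs = {
      "allocator", "cache", "hardware-config", "indirection", "jit",
      "logging", "memory", "microkernel-utils", "microparams-init", "mutex",
      "normalization", "operators", "operator-run", "operator-utils",
      "packing", "post-operation", "microkernels-prod", "subgraph"};
  if (system_name == "Emscripten") {
    libs.erase(std::remove(libs.begin(), libs.end(),
                           std::string("microkernels-prod")),
               libs.end());
  }
  return libs;
}

// Small helper for inspecting the resulting list.
bool Contains(const std::vector<std::string>& libs, const std::string& name) {
  return std::find(libs.begin(), libs.end(), name) != libs.end();
}
```

With these helpers, `IsRecognizedSystemName("Emscripten")` is true while a name such as `"FreeBSD"` is rejected, and `XnnpackLinkLibraries("Emscripten")` contains every entry except `microkernels-prod`.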