Update XNNPACK to latest version (#18038)
### Description
Update XNNPACK to latest version
- adds fp16 kernels and various other improvements
- requires pthreadpool update as well

Most code changes in the XNNPACK EP adapt to the new XNNPACK API
- 'setup' is split into 'reshape' and 'setup'
- some ops use a workspace buffer
  - the workspace allocation is copied from the XNNPACK unit test code
- some suffixes changed
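
As a rough illustration of the two-phase pattern described above, here is a minimal C++ sketch. It assumes a hypothetical `MockOperator` with hypothetical `Reshape`, `Setup`, `Run` and `Compute` functions; none of these are the real XNNPACK API or the code in this PR. The point is only the ordering: reshape reports the workspace size, the caller allocates the workspace, and setup binds the pointers.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for an operator handle. NOT the real XNNPACK API.
struct MockOperator {
  std::size_t num_elements = 0;  // fixed by Reshape
  const float* input = nullptr;  // bound by Setup
  float* output = nullptr;       // bound by Setup
  float* workspace = nullptr;    // bound by Setup
};

// Phase 1: shape-dependent work. Reports how much scratch space the op needs.
bool Reshape(MockOperator& op, std::size_t batch, std::size_t channels,
             std::size_t& workspace_elements) {
  if (batch == 0 || channels == 0) return false;
  op.num_elements = batch * channels;
  workspace_elements = op.num_elements;  // assumption: one scratch float per element
  return true;
}

// Phase 2: bind the data pointers. Requires a successful Reshape first.
bool Setup(MockOperator& op, float* workspace, const float* input, float* output) {
  if (op.num_elements == 0 || !workspace || !input || !output) return false;
  op.workspace = workspace;
  op.input = input;
  op.output = output;
  return true;
}

// Run: doubles each input value, staging the result through the workspace.
void Run(MockOperator& op) {
  for (std::size_t i = 0; i < op.num_elements; ++i) op.workspace[i] = op.input[i] * 2.0f;
  for (std::size_t i = 0; i < op.num_elements; ++i) op.output[i] = op.workspace[i];
}

// Caller-side flow: reshape, allocate the workspace, setup, run.
std::vector<float> Compute(const std::vector<float>& input, std::size_t batch,
                           std::size_t channels) {
  assert(input.size() == batch * channels);
  MockOperator op;
  std::size_t workspace_elements = 0;
  if (!Reshape(op, batch, channels, workspace_elements)) return {};
  std::vector<float> workspace(workspace_elements);  // caller owns the scratch buffer
  std::vector<float> output(input.size());
  if (!Setup(op, workspace.data(), input.data(), output.data())) return {};
  Run(op);
  return output;
}
```

In this sketch, calling `Compute({1, 2, 3, 4}, 2, 2)` goes through all three phases; a zero-sized shape is rejected in `Reshape` before any workspace is allocated.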

Added a wrapper for the XNNPACK caches to the base XNNPACK EP kernel
- simplifies usage
- XNNPACK split the code and weights caches apart, but the code cache
isn't currently usable via the public API
- we could use the internal types if we think that's required for
performance reasons. That's non-trivial though, as we'd need to
propagate ifdef values from the XNNPACK build up to the ORT build.
- using XNNPACK internals would also mean we could not support using a
pre-built XNNPACK package
    - not an issue currently
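
The wrapper itself is not shown in this excerpt. As a generic sketch of the idea only, the following C++ wraps a C-style cache handle in an RAII class so kernels do not have to pair create and delete calls by hand. `MockCache`, `MockCreateCache`, `MockDeleteCache`, `g_live_caches` and `CacheWrapper` are all hypothetical names invented for illustration; they are not XNNPACK's or ORT's API.

```cpp
#include <cassert>
#include <memory>

// Hypothetical C-style cache API. It stands in for a library handle and is
// NOT XNNPACK's real API.
struct MockCache {
  int entries = 0;
};

int g_live_caches = 0;  // counts outstanding caches so the cleanup is observable

MockCache* MockCreateCache() {
  ++g_live_caches;
  return new MockCache();
}

void MockDeleteCache(MockCache* cache) {
  --g_live_caches;
  delete cache;
}

// RAII wrapper: the cache is created in the constructor and released by the
// unique_ptr deleter, so callers never call the delete function directly.
class CacheWrapper {
 public:
  CacheWrapper() : cache_(MockCreateCache(), &MockDeleteCache) {
    assert(cache_ != nullptr);
  }

  // Raw handle to pass to the (hypothetical) C API.
  MockCache* Get() const { return cache_.get(); }

 private:
  std::unique_ptr<MockCache, void (*)(MockCache*)> cache_;
};
```

Because the member is a `std::unique_ptr`, the wrapper is move-only, which avoids accidentally deleting the same handle twice.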
  
Fixed opset registration for internal NHWC domain
- was not being tied to the ONNX version, so nodes inserted by layout
transformation had the incorrect opset
- a number of other places needed updating once this issue was fixed

Removed support for NCHW Resize from the XNNPACK EP so it's NHWC only
- we only supported NCHW for fp32
- supporting it adds complexity in multiple places (XNNPACK EP kernel
implementation, layout transformation and transpose optimization)
- it's unclear whether that complexity provides any benefit. It can be
added back if a production scenario requires it

### Motivation and Context
We're looking at enabling fp16 support for CoreML and NNAPI. If we do
that, we need a good fallback story for when the CPU EP is used instead.
The XNNPACK fp16 kernels will hopefully provide that.

NOTE: This PR doesn't add fp16 support to the XNNPACK EP kernels. That
can be done as required in separate PRs and should be relatively simple
to do.
skottmckay authored Nov 3, 2023
1 parent e36d003 commit 4f2096b
Showing 45 changed files with 794 additions and 517 deletions.
11 changes: 8 additions & 3 deletions cmake/deps.txt
@@ -3,10 +3,15 @@
#The columns are separated by ";" because a list in cmake is just a ";" separated group of strings.
#Names should be in lower case. They will be used as variable names in cmake.
#URLs can be either https URLs or local file paths in cmake-style(directory separator is a forward slash character).
-#SHA1 hashes can be generated by running sha1sum command.
+#SHA1 hashes can be generated by running sha1sum command on linux. PowerShell can also be used:
+# (Get-FileHash -Algorithm SHA1 <filename>).Hash.ToLower()
#If you need to change abseil's version to a different one, you may also want to update external\abseil-cpp.natvis
#since the file contains a version string: "lts_20230802". However, the file is for debugging purposes only and would
#not affect built binaries.
+#
+# NOTE: You must run deps_update_and_upload.py when ready to test your changes in a CI.
+# See https://microsoft.sharepoint.com/teams/ONNX2/_layouts/OneNote.aspx?id=%2Fteams%2FONNX2%2FShared%20Documents%2FNotebooks%2FONNX%20Ecosystem%20Team%20Notebook&wd=target%28Development.one%7C63D3AB47-51D1-4A62-9965-66882234BD44%2FAdd%20or%20update%20a%20dependency%20in%20deps.txt%7C0E9ED71D-89D5-40FA-B05F-C0123289C591%2F%29
+#
abseil_cpp;https://github.com/abseil/abseil-cpp/archive/refs/tags/20230802.0.zip;04271dfbfac59269b6939e1e9d5faf0d18a7ba91
cxxopts;https://github.com/jarro2783/cxxopts/archive/3c73d91c0b04e2b59462f0a741be8c07024c1bc0.zip;6c6ca7f8480b26c8d00476e0e24b7184717fe4f0
date;https://github.com/HowardHinnant/date/archive/refs/tags/v3.0.1.zip;2dac0c81dc54ebdd8f8d073a75c053b04b56e159
@@ -18,7 +23,7 @@ fxdiv;https://github.com/Maratyszcza/FXdiv/archive/63058eff77e11aa15bf531df5dd34
google_benchmark;https://github.com/google/benchmark/archive/refs/tags/v1.7.0.zip;e97c368b176e8614e3f1bf13dd9abcf6a7ad9908
google_nsync;https://github.com/google/nsync/archive/refs/tags/1.26.0.zip;5e7c00ef6bf5b787386fc040067903ec774e2752
googletest;https://github.com/google/googletest/archive/refs/tags/v1.14.0.zip;0ac421f2ec11af38b0fff0f1992184032731a8bc
-googlexnnpack;https://github.com/google/XNNPACK/archive/003c580e696a774afdc984996ee909b7c8d8128c.zip;9f192e3f15e1e37ae9c78d53eeea47e45c5eb31c
+googlexnnpack;https://github.com/google/XNNPACK/archive/0da379fc4808f9601faef392352018c741c0f297.zip;663883491e380b628e0a5b162b5f2658032fae73
json;https://github.com/nlohmann/json/archive/refs/tags/v3.10.5.zip;f257f8dc27c5b8c085dc887b40cddd18ae1f725c
microsoft_gsl;https://github.com/microsoft/GSL/archive/refs/tags/v4.0.0.zip;cf368104cd22a87b4dd0c80228919bb2df3e2a14
microsoft_wil;https://github.com/microsoft/wil/archive/refs/tags/v1.0.230629.1.zip;e4a542a323c070376f7c2d1973d0f7ddbc1d2fa5
@@ -35,7 +40,7 @@ protoc_linux_x86;https://github.com/protocolbuffers/protobuf/releases/download/v
protoc_linux_aarch64;https://github.com/protocolbuffers/protobuf/releases/download/v21.12/protoc-21.12-linux-aarch_64.zip;df9d45470b0b8cf939dd2f0ec6b88e9cafc4d617
protoc_mac_universal;https://github.com/protocolbuffers/protobuf/releases/download/v21.12/protoc-21.12-osx-universal_binary.zip;23710c3d1c2036d8d65a6a22234372fa2d7af9ef
psimd;https://github.com/Maratyszcza/psimd/archive/072586a71b55b7f8c584153d223e95687148a900.zip;1f5454b01f06f9656b77e4a5e2e31d7422487013
-pthreadpool;https://github.com/Maratyszcza/pthreadpool/archive/1787867f6183f056420e532eec640cba25efafea.zip;e43e80781560c5ab404a4da20f34d846f5f5d101
+pthreadpool;https://github.com/Maratyszcza/pthreadpool/archive/4fe0e1e183925bf8cfa6aae24237e724a96479b8.zip;07a0aa91dd9bf86f31b95497e00f31d8a261a4bd
pybind11;https://github.com/pybind/pybind11/archive/refs/tags/v2.10.1.zip;769b6aa67a77f17a770960f604b727645b6f6a13
pytorch_cpuinfo;https://github.com/pytorch/cpuinfo/archive/959002f82d7962a473d8bf301845f2af720e0aa4.zip;85da3caa60eb2b148613b443fbc2bfdc30689965
re2;https://github.com/google/re2/archive/refs/tags/2022-06-01.zip;aa77313b76e91b531ee7f3e45f004c6a502a5374
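
For reference, each non-comment line in deps.txt carries three ";"-separated fields: a name, a URL and a SHA1 hash. The following C++17 sketch shows one way such an entry could be parsed and sanity-checked. `DepEntry` and `ParseDepLine` are hypothetical names used only for illustration; they are not part of the ORT build, which reads this file from CMake.

```cpp
#include <cctype>
#include <optional>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical record for one deps.txt entry: name;url;sha1
struct DepEntry {
  std::string name;
  std::string url;
  std::string sha1;
};

// Returns the parsed entry, or std::nullopt for comments and malformed lines.
std::optional<DepEntry> ParseDepLine(const std::string& line) {
  if (line.empty() || line[0] == '#') return std::nullopt;  // comment or blank

  // Split on ';', the column separator used by the file.
  std::vector<std::string> fields;
  std::istringstream stream(line);
  std::string field;
  while (std::getline(stream, field, ';')) fields.push_back(field);
  if (fields.size() != 3) return std::nullopt;

  // A SHA1 digest is 20 bytes, i.e. 40 hexadecimal characters.
  const std::string& sha1 = fields[2];
  if (sha1.size() != 40) return std::nullopt;
  for (unsigned char c : sha1) {
    if (!std::isxdigit(c)) return std::nullopt;
  }

  return DepEntry{fields[0], fields[1], sha1};
}
```

Passing the updated pthreadpool line from this hunk yields an entry named `pthreadpool`, while a comment line or a line with a truncated hash is rejected.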
40 changes: 22 additions & 18 deletions cmake/external/xnnpack.cmake
@@ -25,17 +25,23 @@ set(FXDIV_SOURCE_DIR ${fxdiv_SOURCE_DIR})

FetchContent_Declare(pthreadpool URL ${DEP_URL_pthreadpool} URL_HASH SHA1=${DEP_SHA1_pthreadpool})
onnxruntime_fetchcontent_makeavailable(pthreadpool)
-FetchContent_Declare(googlexnnpack URL ${DEP_URL_googlexnnpack} URL_HASH SHA1=${DEP_SHA1_googlexnnpack}
-PATCH_COMMAND ${Patch_EXECUTABLE} --binary --ignore-whitespace -p1 < ${PROJECT_SOURCE_DIR}/patches/xnnpack/AddEmscriptenAndIosSupport.patch)

+FetchContent_Declare(googlexnnpack URL ${DEP_URL_googlexnnpack} URL_HASH SHA1=${DEP_SHA1_googlexnnpack}
+PATCH_COMMAND ${Patch_EXECUTABLE} --binary --ignore-whitespace -p1 < ${PROJECT_SOURCE_DIR}/patches/xnnpack/AddEmscriptenAndIosSupport.patch
+)
onnxruntime_fetchcontent_makeavailable(googlexnnpack)
set(XNNPACK_DIR ${googlexnnpack_SOURCE_DIR})
set(XNNPACK_INCLUDE_DIR ${XNNPACK_DIR}/include)

set(onnxruntime_EXTERNAL_LIBRARIES_XNNPACK XNNPACK pthreadpool)


# the XNNPACK CMake setup doesn't include the WASM kernels so we have to manually set those up
if(CMAKE_SYSTEM_NAME STREQUAL "Emscripten")
# See source lists in _deps/googlexnnpack-src/BUILD.bazel for wasm_prod_microkernels
+message("Adding WebAssembly Source Files to XNNPACK")
+set(wasm_srcs "")

file(READ "${XNNPACK_DIR}/BUILD.bazel" xnnpack_bazel_config)

# Replace newlines with semicolon so that it is treated as a list by CMake
@@ -70,25 +76,23 @@ if(CMAKE_SYSTEM_NAME STREQUAL "Emscripten")
set(${target_srcs} ${bazel_srcs} PARENT_SCOPE)
endfunction()

-GetSrcListFromBazel("PROD_SCALAR_WASM_MICROKERNEL_SRCS" prod_scalar_wasm_srcs)
-GetSrcListFromBazel("ALL_WASM_MICROKERNEL_SRCS" all_wasm_srcs)
-GetSrcListFromBazel("WASM32_ASM_MICROKERNEL_SRCS" wasm32_asm_srcs)
+GetSrcListFromBazel("OPERATOR_SRCS" operator_srcs)
+GetSrcListFromBazel("TABLE_SRCS" table_srcs)
+list(APPEND wasm_srcs ${operator_srcs} ${table_srcs})

-message(DEBUG "prod_scalar_wasm_srcs: ${prod_scalar_wasm_srcs}\n")
-message(DEBUG "all_wasm_srcs: ${all_wasm_srcs}\n")
-message(DEBUG "wasm32_asm_srcs: ${wasm32_asm_srcs}\n")
+# kernels
+list(APPEND wasm_srcs ${XNNPACK_DIR}/src/amalgam/gen/scalar.c)
+list(APPEND wasm_srcs ${XNNPACK_DIR}/src/amalgam/gen/wasm.c)

-message("Adding WebAssembly Source Files to XNNPACK")
-set(wasm_srcs "")
-list(APPEND wasm_srcs ${prod_scalar_wasm_srcs})
-list(APPEND wasm_srcs ${all_wasm_srcs})
-list(APPEND wasm_srcs ${wasm32_asm_srcs})
+if(onnxruntime_ENABLE_WEBASSEMBLY_SIMD)
+list(APPEND wasm_srcs ${XNNPACK_DIR}/src/amalgam/gen/wasmsimd.c)
+target_compile_options(XNNPACK PRIVATE "-msimd128")
+endif()

+message(DEBUG "wasm_srcs: ${wasm_srcs}\n")
target_sources(XNNPACK PRIVATE ${wasm_srcs})

-if(onnxruntime_ENABLE_WEBASSEMBLY_SIMD)
-GetSrcListFromBazel("ALL_WASMSIMD_MICROKERNEL_SRCS" all_wasmsimd_srcs)
-message(DEBUG "all_wasmsimd_srcs: ${all_wasmsimd_srcs}")
-target_sources(XNNPACK PRIVATE ${all_wasmsimd_srcs})
-endif()
+# add flags from BAZEL.build
+target_compile_options(XNNPACK PRIVATE "-fno-fast-math")
+target_compile_options(XNNPACK PRIVATE "-fno-math-errno")
endif()
5 changes: 3 additions & 2 deletions cmake/onnxruntime_providers_xnnpack.cmake
@@ -15,7 +15,8 @@
source_group(TREE ${REPO_ROOT} FILES ${onnxruntime_providers_xnnpack_cc_srcs})
onnxruntime_add_static_library(onnxruntime_providers_xnnpack ${onnxruntime_providers_xnnpack_cc_srcs})
onnxruntime_add_include_to_target(onnxruntime_providers_xnnpack
-onnxruntime_common onnxruntime_framework onnx onnx_proto ${PROTOBUF_LIB} XNNPACK pthreadpool flatbuffers::flatbuffers Boost::mp11 safeint_interface
+onnxruntime_common onnxruntime_framework onnx onnx_proto ${PROTOBUF_LIB} XNNPACK pthreadpool
+flatbuffers::flatbuffers Boost::mp11 safeint_interface
)

add_dependencies(onnxruntime_providers_xnnpack onnx ${onnxruntime_EXTERNAL_DEPENDENCIES})
@@ -35,4 +36,4 @@
# there are some in builds where sizeof(size_t) != sizeof(int64_t), e.g., in 'ONNX Runtime Web CI Pipeline'
if (HAS_SHORTEN_64_TO_32 AND NOT CMAKE_SIZEOF_VOID_P EQUAL 8)
target_compile_options(onnxruntime_providers_xnnpack PRIVATE -Wno-error=shorten-64-to-32)
-endif()
+endif()
14 changes: 12 additions & 2 deletions cmake/onnxruntime_unittests.cmake
@@ -41,7 +41,7 @@ function(AddTest)
if (MSVC)
target_compile_options(${_UT_TARGET} PRIVATE "$<$<COMPILE_LANGUAGE:CUDA>:SHELL:--compiler-options /wd6330>"
"$<$<NOT:$<COMPILE_LANGUAGE:CUDA>>:/wd6330>")
-#Abseil has a lot of C4127/C4324 warnings.
+#Abseil has a lot of C4127/C4324 warnings.
target_compile_options(${_UT_TARGET} PRIVATE "$<$<COMPILE_LANGUAGE:CUDA>:SHELL:--compiler-options /wd4127>"
"$<$<NOT:$<COMPILE_LANGUAGE:CUDA>>:/wd4127>")
target_compile_options(${_UT_TARGET} PRIVATE "$<$<COMPILE_LANGUAGE:CUDA>:SHELL:--compiler-options /wd4324>"
@@ -201,8 +201,18 @@ function(AddTest)
list(APPEND TEST_NODE_FLAGS "--experimental-wasm-simd")
endif()

+# prefer Node from emsdk so the version is more deterministic
+if (DEFINED ENV{EMSDK_NODE})
+set(NODE_EXECUTABLE $ENV{EMSDK_NODE})
+else()
+# warning as we don't know what node version is being used and whether things like the TEST_NODE_FLAGS
+# will be valid. e.g. "--experimental-wasm-simd" is not valid with node v20 or later.
+message(WARNING "EMSDK_NODE environment variable was not set. Falling back to system `node`.")
+set(NODE_EXECUTABLE node)
+endif()
+
add_test(NAME ${_UT_TARGET}
-COMMAND node ${TEST_NODE_FLAGS} ${_UT_TARGET}.js ${TEST_ARGS}
+COMMAND ${NODE_EXECUTABLE} ${TEST_NODE_FLAGS} ${_UT_TARGET}.js ${TEST_ARGS}
WORKING_DIRECTORY $<TARGET_FILE_DIR:${_UT_TARGET}>
)
endif()
8 changes: 6 additions & 2 deletions cmake/onnxruntime_webassembly.cmake
@@ -192,8 +192,13 @@ else()
onnxruntime_util
re2::re2
)

+set(EXPORTED_RUNTIME_METHODS "'stackAlloc','stackRestore','stackSave','UTF8ToString','stringToUTF8','lengthBytesUTF8'")

if (onnxruntime_USE_XNNPACK)
target_link_libraries(onnxruntime_webassembly PRIVATE XNNPACK)
+string(APPEND EXPORTED_RUNTIME_METHODS ",'addFunction'")
+target_link_options(onnxruntime_webassembly PRIVATE "SHELL:-s ALLOW_TABLE_GROWTH=1")
endif()

if(onnxruntime_USE_WEBNN)
@@ -204,15 +209,14 @@ else()
target_link_libraries(onnxruntime_webassembly PRIVATE tensorboard)
endif()

-set(EXPORTED_RUNTIME_METHODS "['stackAlloc','stackRestore','stackSave','UTF8ToString','stringToUTF8','lengthBytesUTF8']")
if (onnxruntime_USE_JSEP)
set(EXPORTED_FUNCTIONS "_malloc,_free,_JsepOutput,_JsepGetNodeName")
else()
set(EXPORTED_FUNCTIONS "_malloc,_free")
endif()

target_link_options(onnxruntime_webassembly PRIVATE
-"SHELL:-s EXPORTED_RUNTIME_METHODS=${EXPORTED_RUNTIME_METHODS}"
+"SHELL:-s EXPORTED_RUNTIME_METHODS=[${EXPORTED_RUNTIME_METHODS}]"
"SHELL:-s EXPORTED_FUNCTIONS=${EXPORTED_FUNCTIONS}"
"SHELL:-s MAXIMUM_MEMORY=4294967296"
"SHELL:-s EXIT_RUNTIME=0"
79 changes: 20 additions & 59 deletions cmake/patches/xnnpack/AddEmscriptenAndIosSupport.patch
@@ -1,66 +1,27 @@
diff --git a/CMakeLists.txt b/CMakeLists.txt
index d53c48aa1..77c3cf983 100755
index dba9b4687..bcaa18ad7 100755
--- a/CMakeLists.txt
+++ b/CMakeLists.txt
@@ -105,22 +105,12 @@ ENDIF()

@@ -122,7 +122,7 @@ ENDIF()
# ---[ Build flags
IF(NOT CMAKE_SYSTEM_NAME)
MESSAGE(FATAL_ERROR "CMAKE_SYSTEM_NAME not defined")
-ELSEIF(NOT CMAKE_SYSTEM_NAME MATCHES "^(Darwin|Linux|Android|Windows|CYGWIN|MSYS)$")
+ELSEIF(NOT CMAKE_SYSTEM_NAME MATCHES "^(Darwin|Linux|Android|Windows|CYGWIN|MSYS|Emscripten|iOS)$")
MESSAGE(FATAL_ERROR "Unrecognized CMAKE_SYSTEM_NAME = ${CMAKE_SYSTEM_NAME}")
-ELSEIF(NOT CMAKE_SYSTEM_NAME MATCHES "^(Android|Darwin|iOS|Linux|Windows|CYGWIN|MSYS|QURT)$")
+ELSEIF(NOT CMAKE_SYSTEM_NAME MATCHES "^(Android|Darwin|iOS|Linux|Windows|CYGWIN|MSYS|QURT|Emscripten|iOS)$")
MESSAGE(FATAL_ERROR "Unrecognized CMAKE_SYSTEM_NAME value \"${CMAKE_SYSTEM_NAME}\"")
ENDIF()

# ---[ Download deps
IF(NOT XNNPACK_USE_SYSTEM_LIBS)
- IF(NOT DEFINED CLOG_SOURCE_DIR)
- MESSAGE(STATUS "Downloading clog to ${CMAKE_BINARY_DIR}/clog-source (define CLOG_SOURCE_DIR to avoid it)")
- CONFIGURE_FILE(cmake/DownloadCLog.cmake "${CMAKE_BINARY_DIR}/clog-download/CMakeLists.txt")
- EXECUTE_PROCESS(COMMAND "${CMAKE_COMMAND}" -G "${CMAKE_GENERATOR}" .
- WORKING_DIRECTORY "${CMAKE_BINARY_DIR}/clog-download")
- EXECUTE_PROCESS(COMMAND "${CMAKE_COMMAND}" --build .
- WORKING_DIRECTORY "${CMAKE_BINARY_DIR}/clog-download")
- SET(CLOG_SOURCE_DIR "${CMAKE_BINARY_DIR}/clog-source" CACHE STRING "clog source directory")
- ENDIF()
-
IF(NOT DEFINED CPUINFO_SOURCE_DIR)
MESSAGE(STATUS "Downloading cpuinfo to ${CMAKE_BINARY_DIR}/cpuinfo-source (define CPUINFO_SOURCE_DIR to avoid it)")
CONFIGURE_FILE(cmake/DownloadCpuinfo.cmake "${CMAKE_BINARY_DIR}/cpuinfo-download/CMakeLists.txt")
@@ -7108,6 +7098,10 @@ IF(MSVC)
SET_PROPERTY(SOURCE ${ALL_MICROKERNEL_SRCS} APPEND_STRING PROPERTY COMPILE_FLAGS "$<$<NOT:$<CONFIG:Debug>>: /O2 >")
SET_PROPERTY(SOURCE ${HOT_SRCS} APPEND_STRING PROPERTY COMPILE_FLAGS "$<$<NOT:$<CONFIG:Debug>>: /O2 >")
SET_PROPERTY(SOURCE ${COLD_SRCS} APPEND_STRING PROPERTY COMPILE_FLAGS "$<$<NOT:$<CONFIG:Debug>>: /O1 >")
+ELSEIF(CMAKE_GENERATOR STREQUAL Xcode)
+ TARGET_COMPILE_OPTIONS(all_microkernels PRIVATE $<$<NOT:$<CONFIG:Debug>>: -O2 >)
+ TARGET_COMPILE_OPTIONS(XNNPACK PRIVATE $<$<NOT:$<CONFIG:Debug>>: -O2 >)
+ TARGET_COMPILE_OPTIONS(XNNPACK PRIVATE $<$<NOT:$<CONFIG:Debug>>: -Os >)
ELSE()
SET_PROPERTY(SOURCE ${ALL_MICROKERNEL_SRCS} APPEND_STRING PROPERTY COMPILE_FLAGS "$<$<NOT:$<CONFIG:Debug>>: -O2 >")
SET_PROPERTY(SOURCE ${HOT_SRCS} APPEND_STRING PROPERTY COMPILE_FLAGS "$<$<NOT:$<CONFIG:Debug>>: -O2 >")
@@ -7142,26 +7136,6 @@ IF(LIBM)
TARGET_LINK_LIBRARIES(indirection PRIVATE ${LIBM})
IF(CMAKE_SYSTEM_NAME MATCHES "Windows")
@@ -534,7 +534,12 @@ IF(XNNPACK_BUILD_LIBRARY)
TARGET_LINK_LIBRARIES(operator-utils PRIVATE logging)
TARGET_LINK_LIBRARIES(post-operation PRIVATE logging)
TARGET_LINK_LIBRARIES(subgraph PRIVATE allocator logging memory mutex operators operator-run)
- TARGET_LINK_LIBRARIES(XNNPACK PRIVATE allocator cache hardware-config indirection jit logging memory microkernel-utils microparams-init mutex normalization operators operator-run operator-utils packing post-operation microkernels-prod subgraph)
+ IF(CMAKE_SYSTEM_NAME STREQUAL "Emscripten")
+ # omit microkernels-prod as the list is manually created by ORT in cmake/external/xnnpack.cmake
+ TARGET_LINK_LIBRARIES(XNNPACK PRIVATE allocator cache hardware-config indirection jit logging memory microkernel-utils microparams-init mutex normalization operators operator-run operator-utils packing post-operation subgraph)
+ ELSE()
+ TARGET_LINK_LIBRARIES(XNNPACK PRIVATE allocator cache hardware-config indirection jit logging memory microkernel-utils microparams-init mutex normalization operators operator-run operator-utils packing post-operation microkernels-prod subgraph)
+ ENDIF()
SET_TARGET_PROPERTIES(XNNPACK PROPERTIES C_EXTENSIONS YES)
ENDIF()

-# ---[ Configure clog
-IF(NOT TARGET clog)
- IF(NOT XNNPACK_USE_SYSTEM_LIBS)
- SET(CLOG_BUILD_TESTS OFF CACHE BOOL "")
- SET(CLOG_RUNTIME_TYPE "${CPUINFO_RUNTIME_TYPE}" CACHE STRING "")
- ADD_SUBDIRECTORY(
- "${CLOG_SOURCE_DIR}/deps/clog"
- "${CMAKE_BINARY_DIR}/clog")
- # We build static version of clog but a dynamic library may indirectly depend on it
- SET_PROPERTY(TARGET clog PROPERTY POSITION_INDEPENDENT_CODE ON)
- ELSE()
- ADD_LIBRARY(clog STATIC IMPORTED)
- FIND_LIBRARY(CLOG_LIBRARY clog)
- IF(NOT CLOG_LIBRARY)
- MESSAGE(FATAL_ERROR "Cannot find clog")
- ENDIF()
- SET_PROPERTY(TARGET clog PROPERTY IMPORTED_LOCATION "${CLOG_LIBRARY}")
- ENDIF()
-ENDIF()
-
# ---[ Configure cpuinfo
IF(NOT TARGET cpuinfo)
IF(NOT XNNPACK_USE_SYSTEM_LIBS)
IF(NOT MSVC)
10 changes: 5 additions & 5 deletions js/web/docs/webgpu-operators.md
@@ -20,15 +20,15 @@ Do not modify directly.*
| Asinh | ai.onnx(9+) | |
| Atan | ai.onnx(7+) | |
| Atanh | ai.onnx(9+) | |
-| AveragePool | ai.onnx(7-9,10,11+); com.ms.internal.nhwc(11+) | need perf optimization; need implementing activation |
+| AveragePool | ai.onnx(7-9,10,11+); com.ms.internal.nhwc(7-9,10,11+) | need perf optimization; need implementing activation |
| BiasAdd | com.microsoft(1+) | |
| BiasSplitGelu | com.microsoft(1+) | |
| Cast | ai.onnx(6-8,9-12,13-18,19+) | |
| Ceil | ai.onnx(6-12,13+) | |
| Clip | ai.onnx(6-10,11,12,13+) | |
| Concat | ai.onnx(1-3,4-10,11-12,13+) | |
-| Conv | ai.onnx(1-10,11+); com.ms.internal.nhwc(11+) | need perf optimization; conv3d is not supported; need implementing activation |
-| ConvTranspose | ai.onnx(1-10,11+); com.ms.internal.nhwc(11+) | need perf optimization; ConvTranspose3d is not supported; need implementing activation |
+| Conv | ai.onnx(1-10,11+); com.ms.internal.nhwc(1-10,11+) | need perf optimization; conv3d is not supported; need implementing activation |
+| ConvTranspose | ai.onnx(1-10,11+); com.ms.internal.nhwc(1-10,11+) | need perf optimization; ConvTranspose3d is not supported; need implementing activation |
| Cos | ai.onnx(7+) | |
| Cosh | ai.onnx(9+) | |
| Div | ai.onnx(7-12,13,14+) | |
@@ -57,7 +57,7 @@ Do not modify directly.*
| LessOrEqual | ai.onnx(12-15,16+) | |
| Log | ai.onnx(6-12,13+) | |
| MatMul | ai.onnx(1-12,13+) | |
-| MaxPool | ai.onnx(1-7,8-9,10,11,12+); com.ms.internal.nhwc(11,12+) | need perf optimization; need implementing activation |
+| MaxPool | ai.onnx(1-7,8-9,10,11,12+); com.ms.internal.nhwc(1-7,8-9,10,11,12+) | need perf optimization; need implementing activation |
| MemcpyFromHost | ai.onnx(1+) | |
| MemcpyToHost | ai.onnx(1+) | |
| Mul | ai.onnx(7-12,13,14+) | |
@@ -79,7 +79,7 @@ Do not modify directly.*
| ReduceSumSquare | ai.onnx(1-10,11-12,13-17,18+) | |
| Relu | ai.onnx(6-12,13,14+) | |
| Reshape | ai.onnx(5-12,13,14+) | no GPU kernel |
-| Resize | ai.onnx(10,11-12,13-17,18,19+); com.ms.internal.nhwc(11-12,13-17,18,19+) | CoordinateTransformMode align_corners is not supported with downsampling |
+| Resize | ai.onnx(10,11-12,13-17,18,19+); com.ms.internal.nhwc(10,11-12,13-17,18,19+) | CoordinateTransformMode align_corners is not supported with downsampling |
| Shape | ai.onnx(1-12,13-14,15+) | no GPU kernel; an ORT warning is generated - need to fix |
| Sigmoid | ai.onnx(6-12,13+) | |
| Sin | ai.onnx(7+) | |
5 changes: 4 additions & 1 deletion js/web/test/test-runner.ts
@@ -164,7 +164,10 @@ async function initializeSession(
session = await ort.InferenceSession.create(modelFilePath, sessionConfig);
}
} catch (e) {
-Logger.error('TestRunner', `Failed to load model from file: ${modelFilePath}. Error: ${inspect(e)}`);
+Logger.error(
+'TestRunner',
+`Failed to load model from file: ${modelFilePath}. ` +
+`Error: ${e.message} @ ${e.fileName}:${e.lineNumber}`);
throw e;
}

9 changes: 7 additions & 2 deletions onnxruntime/core/framework/kernel_registry_manager.cc
@@ -62,8 +62,13 @@ Status KernelRegistryManager::SearchKernelRegistry(const Node& node,

auto create_error_message = [&node, &status](const std::string& prefix) {
std::ostringstream errormsg;
-errormsg << prefix << node.OpType() << "(" << node.SinceVersion() << ")";
-errormsg << " (node:'" << node.Name() << "' ep:'" << node.GetExecutionProviderType() << "'). ";
+errormsg << prefix;
+const auto& domain = node.Domain();
+if (!domain.empty()) {
+errormsg << domain << ".";
+}
+errormsg << node.OpType() << "(" << node.SinceVersion() << ")"
+<< " (node:'" << node.Name() << "' ep:'" << node.GetExecutionProviderType() << "'). ";
if (!status.IsOK())
errormsg << status.ErrorMessage();
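
To make the effect of this hunk concrete, the formatting logic can be reproduced as a standalone function. This is only a sketch: `FormatKernelError` and its plain-string parameters are hypothetical stand-ins for the lambda and the `Node` accessors above, and the prefix text and node values in the usage note are made-up examples.

```cpp
#include <sstream>
#include <string>

// Standalone version of the message formatting in the hunk above.
// All parameters are plain values standing in for the Node accessors.
std::string FormatKernelError(const std::string& prefix, const std::string& domain,
                              const std::string& op_type, int since_version,
                              const std::string& node_name, const std::string& ep) {
  std::ostringstream errormsg;
  errormsg << prefix;
  // Only qualify the op type when the node is in a non-default domain.
  if (!domain.empty()) {
    errormsg << domain << ".";
  }
  errormsg << op_type << "(" << since_version << ")"
           << " (node:'" << node_name << "' ep:'" << ep << "'). ";
  return errormsg.str();
}
```

With an example prefix of `"Kernel not found for "`, a node in the `com.ms.internal.nhwc` domain is reported as `com.ms.internal.nhwc.Conv(11)`, while a node in the default (empty) domain is still reported as `Conv(11)`, as before the change.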
