[SYCL] Fix the sub group size of Intel (ggerganov#8106)
* use warp_size macro for all sycl kernels

* fix mask of permute_sub_group_by_xor

* fix rms_norm with correct warp number

* fix rms_norm_f32/group_norm_f32

* move norm to norm.cpp file

* fix quantize bug

* fix mmvq's batch size
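
The mask fix listed above concerns butterfly (xor) reductions across a sub-group. The following is a host-side sketch only, not code from this commit: it simulates the lanes of one sub-group on the CPU, and `butterfly_sum` is a hypothetical name. It assumes the kernels reduce with a loop that starts at half the sub-group width, which is why a starting mask hardcoded for 32 lanes would pair lanes with partners outside a 16-wide sub-group.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Host-side simulation of a sub-group butterfly reduction (illustrative only).
// lanes[i] is the partial sum held by lane i; lanes.size() is the sub-group
// width, e.g. 16 or 32 as selected by GGML_SYCL_WARP_SIZE.
// Returns the per-lane values after the reduction: every lane holds the total.
std::vector<float> butterfly_sum(std::vector<float> lanes) {
    const std::size_t width = lanes.size();
    assert(width > 0 && (width & (width - 1)) == 0); // width must be a power of two

    // Start at width / 2 so that (lane ^ mask) always stays inside the sub-group.
    for (std::size_t mask = width / 2; mask > 0; mask >>= 1) {
        std::vector<float> next(width);
        for (std::size_t lane = 0; lane < width; ++lane) {
            // Emulates "x += permute_by_xor(x, mask)" executed by every lane at once.
            next[lane] = lanes[lane] + lanes[lane ^ mask];
        }
        lanes = next;
    }
    return lanes;
}
```

With a 16-lane input the loop visits masks 8, 4, 2, 1; with 32 lanes it visits 16, 8, 4, 2, 1. Deriving the first mask from the width keeps the same source correct for both targets.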
luoyu-intel authored Jul 2, 2024
1 parent 5fac350 commit d08c20e
Showing 9 changed files with 587 additions and 509 deletions.
4 changes: 3 additions & 1 deletion ggml/src/CMakeLists.txt
@@ -486,9 +486,11 @@ if (GGML_SYCL)
add_compile_options(-I./) #include DPCT

set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -Wno-narrowing")
set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -O3")
if (GGML_SYCL_TARGET STREQUAL "NVIDIA")
    set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -fsycl-targets=nvptx64-nvidia-cuda")
    add_compile_definitions(GGML_SYCL_WARP_SIZE=32)
else()
    add_compile_definitions(GGML_SYCL_WARP_SIZE=16)
endif()

file(GLOB GGML_HEADERS_SYCL "ggml-sycl/*.hpp")
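
The hunk above passes `GGML_SYCL_WARP_SIZE` to the compiler as a preprocessor definition (32 for the NVIDIA target, 16 otherwise). As a rough sketch of how C++ code might consume such a definition, assuming a fallback default for builds that do not set it: `kWarpSize` and `warps_per_block` are hypothetical names for illustration, not identifiers from this commit.

```cpp
#include <cassert>

// Fallback for builds that do not pass -DGGML_SYCL_WARP_SIZE=...
// (an assumption of this sketch; the value 16 mirrors the non-NVIDIA branch).
#ifndef GGML_SYCL_WARP_SIZE
#define GGML_SYCL_WARP_SIZE 16
#endif

// Hypothetical constant wrapping the build-time definition.
constexpr int kWarpSize = GGML_SYCL_WARP_SIZE;
static_assert(kWarpSize == 16 || kWarpSize == 32,
              "this sketch only expects the two values set by the CMake hunk");

// Hypothetical helper: number of sub-groups needed to cover block_size
// work-items, rounding up when block_size is not a multiple of kWarpSize.
inline int warps_per_block(int block_size) {
    assert(block_size > 0);
    return (block_size + kWarpSize - 1) / kWarpSize;
}
```

Computing the warp count from the constant, rather than assuming 32 lanes, is the kind of adjustment the "correct warp number" item in the commit message points to: a 256-item block has 16 sub-groups at width 16 but only 8 at width 32.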