-
Notifications
You must be signed in to change notification settings - Fork 10k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[SYCL] refactor #6408
[SYCL] refactor #6408
Conversation
@slaren Since we can put SYCL related code under a directory instead of a single file, I might introduce headers-only library for performance optimization, as well as simplifying our effort too (my job during work time 😁 ) @ggerganov @mingfeima for aware |
I think that's good, I plan to start using CUTLASS in the CUDA backend as well. |
It's great to see the new structure. |
This is a good refactoring, and would be helpful for debug . I would suggest waiting for some iq quant prs and then resume work on this. |
yes, drop a note when you finished. |
|
for sub folder: dpct
|
There was a lot of changes which caused conflict. Unable to see how to easily resolve it. @airMeng can you see if there is much that needs to be fixed? |
never mind. just more time needed. |
@NeoZhangJianyu @luoyu-intel @AidanBeltonS refactored after #7710, please have a review. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why still keep some kernels in the ggml-sycl.cpp file?
ggml-sycl.h
Outdated
|
||
#ifdef __cplusplus | ||
extern "C" { | ||
#endif | ||
|
||
#define GGML_SYCL_MAX_DEVICES 48 | ||
#define GGML_SYCL_NAME "SYCL" | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These two macro would be used by external caller.
So, don't move to presets.hpp.
Suggest all exports function/variable/macro are defined in ggml-sycl.h
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM! ping @joeatodd @ggerganov for a look when available.
according to #5277 (reply in thread), the PR does the following:
- [ ] let the common backend to handle H2D/D2H memcpy.let the PR as simple as possible