support for extended memops #59
…atch_ops in gds_stream_post_descriptors
…s_stream_post_descriptors in gds_poll_lat
…eersync mode (still not working)
…ng for new memops in --enable-extended-memops section.
…ds_fill_membar, assuming mlx5 need an issue to track this assumption!
…ng code and make it per-device. abstract per-device max_batch_size.
…ing before QP/CQ creation fixes #60
Tested on DGX-1V (lab12) with the regular 410.02 driver (http://linuxqa/builds/release/display/x86_64/410.02) and both CUDA 9.0 and 9.2.
Test without extmemops, CUDA 9.2:
Test with extmemops, CUDA 9.2:
Same results for CUDA 9.0. Same results on brdw0 with a P100 GPU, driver 410.02 and a newly installed CUDA 9.2.
gds_kernel_latency works for me on ivy2/3 with:
My test script is ~drossetti/.../peersync/src/libgdsync/gds_kernel_latency.sh, for example:
ivy2/3 has a Kepler GPU (with MLNX_OFED_LINUX-4.2-1.0.0.0), while the DGX has a Volta (with MLNX_OFED_LINUX-4.3-1.0.1.0). The difference seems to be that on Volta the NOR op is enabled and used by libgdsync. Disabling the NOR
Note that the parameter names for both gds_kernel tests have changed, e.g. -K vs -k.
…RITE_MEMORY is not available
NOR should be enabled on master already, so let's treat this as a separate problem. Filed #68 to track progress on that.
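For readers unfamiliar with the NOR op discussed above, here is a minimal host-side sketch of the 32-bit wait conditions, following my reading of the CUDA driver API description of the `CU_STREAM_WAIT_VALUE_*` flags. The enum and function names are hypothetical and exist only for this illustration; this is not libgdsync code and it does not touch the GPU.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical enum for illustration; not a libgdsync or CUDA type. */
enum wait_cond { WAIT_GEQ, WAIT_EQ, WAIT_AND, WAIT_NOR };

/*
 * Returns true when a 32-bit wait on memory content `mem` against `value`
 * would be satisfied. The predicates mirror the documented semantics of
 * the CUDA stream wait-value flags (assumption: as I understand the docs).
 */
static bool wait_satisfied(enum wait_cond cond, uint32_t mem, uint32_t value)
{
    switch (cond) {
    case WAIT_GEQ:
        /* Cyclic comparison, so counter wrap-around is tolerated. */
        return (int32_t)(mem - value) >= 0;
    case WAIT_EQ:
        return mem == value;
    case WAIT_AND:
        return (mem & value) != 0;
    case WAIT_NOR:
        /* Satisfied as soon as any bit is clear in both mem and value. */
        return (uint32_t)~(mem | value) != 0;
    }
    return false;
}
```

NOR is the only condition here that can wait for bits to be cleared rather than set, which is one reason a code path that depends on it behaves differently on GPUs where the op is not available.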
Tested on DGX-1V (lab12) with r410.04, without the new memops, and with the latest mlnx firmware.
Tested on DGX-1V with CUDA 9.2 and r410.04 (no memops).
Support for experimental APIs.
It must build without them if --enable-extended-memops is not passed to build.sh.
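The build requirement above can be sketched with a compile-time guard. This is only an illustration under assumptions: `HAVE_EXTENDED_MEMOPS` is a hypothetical macro that the build system would define when --enable-extended-memops is passed, and `memops_mode()` is a hypothetical helper; the names actually used in the tree may differ.

```c
#include <stdio.h>

/*
 * Hypothetical macro: assumed to be defined by the build system only when
 * --enable-extended-memops is passed to build.sh.
 */
#ifdef HAVE_EXTENDED_MEMOPS
#define EXTENDED_MEMOPS_ENABLED 1
#else
#define EXTENDED_MEMOPS_ENABLED 0
#endif

/* Hypothetical helper, not part of libgdsync. */
static const char *memops_mode(void)
{
#if EXTENDED_MEMOPS_ENABLED
    /* Only this branch may reference the experimental APIs. */
    return "extended";
#else
    /* Default build: no experimental symbol is referenced at all. */
    return "baseline";
#endif
}
```

Keeping every reference to the experimental APIs inside the guarded branch means a default build never needs headers or symbols that a regular driver and toolkit do not provide.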