Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SYCL] Add device config file consistency test #16369

Open
wants to merge 20 commits into
base: sycl
Choose a base branch
from

Conversation

jzc
Copy link
Contributor

@jzc jzc commented Dec 13, 2024

No description provided.

@jzc jzc requested review from a team as code owners December 13, 2024 23:06
@jzc jzc requested a review from maarquitos14 December 13, 2024 23:06
@jzc jzc temporarily deployed to WindowsCILock December 13, 2024 23:07 — with GitHub Actions Inactive
@jzc jzc requested a review from a team as a code owner December 19, 2024 18:04
.github/workflows/sycl-linux-build.yml Outdated Show resolved Hide resolved
@@ -532,6 +532,11 @@ if("hip" IN_LIST SYCL_ENABLE_BACKENDS)
list(APPEND SYCL_TOOLCHAIN_DEPLOY_COMPONENTS ur_adapter_hip)
endif()

if(INSTALL_DEVICE_CONFIG_FILE)
add_dependencies(sycl-toolchain DeviceConfigFile)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you give some background on why we need to install this? thx

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm still testing things, so I might change some things, but I want to support the new device_config_file_consistency test. It uses the DeviceConfigFile.hpp, and that file includes DeviceConfigFile.inc, which is generated by tablegen. On CI from what I understand the e2e tests are invoked by using the packed install files from the build step and only runs CMake on the sycl/test-e2e subfolder. So since we probably don't want to build tablegen and invoke other LLVM cmake files when running the e2e tests, I install the DeviceConfigFile.inc it in the build step to pass it to the e2e tests. Also note that this test must be an e2e test as it queries the device it is running the test on, so it can't be moved to sycl/test.

Copy link
Contributor

@sarnex sarnex Dec 20, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sorry so what tools/files are required to generate that hpp file? Is it just llvm-tablegen and DeviceConfigFile.inc?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yea just tablegen and DeviceConfigFile.td are needed for the hpp file

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also just for some more background I was aiming so that DeviceConfigFile.hpp is not installed by default because outside of testing, this file is not needed for a SYCL distribution, it is only used in the compiler.

.github/workflows/sycl-linux-build.yml Outdated Show resolved Hide resolved
sycl/test-e2e/Basic/device_config_file_consistency.cpp Outdated Show resolved Hide resolved
sycl/test-e2e/Basic/device_config_file_consistency.cpp Outdated Show resolved Hide resolved
sycl/test-e2e/Basic/device_config_file_consistency.cpp Outdated Show resolved Hide resolved
sycl/test-e2e/Basic/device_config_file_consistency.cpp Outdated Show resolved Hide resolved
sycl/test-e2e/Basic/device_config_file_consistency.cpp Outdated Show resolved Hide resolved
sycl/test-e2e/Basic/device_config_file_consistency.cpp Outdated Show resolved Hide resolved
sycl/test-e2e/Basic/device_config_file_consistency.cpp Outdated Show resolved Hide resolved
sycl/test-e2e/Basic/device_config_file_consistency.cpp Outdated Show resolved Hide resolved
@jzc jzc temporarily deployed to WindowsCILock December 23, 2024 17:39 — with GitHub Actions Inactive
Comment on lines 258 to 265
defvar CudaUSMAspects = [
AspectUsm_device_allocations, AspectUsm_host_allocations,
AspectUsm_shared_allocations, AspectUsm_atomic_host_allocations,
AspectUsm_atomic_shared_allocations
];

defvar CudaMinAspects = !listconcat(CudaUSMAspects, [AspectGpu, AspectFp64, AspectOnline_compiler, AspectOnline_linker,
AspectQueue_profiling, AspectExt_intel_pci_address, AspectExt_intel_memory_bus_width,
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From CI:

# | Checking consistency of aspects for device nvidia_gpu_sm_86...
# | error: nvidia_gpu_sm_86 does not support aspect usm_system_allocations but it is declared in the device config file
# | error: nvidia_gpu_sm_86 does not support aspect usm_atomic_host_allocations but it is declared in the device config file
# | error: nvidia_gpu_sm_86 does not support aspect ext_intel_max_mem_bandwidth but it is declared in the device config file
# | note: the device nvidia_gpu_sm_86 supports aspect ext_intel_legacy_image but it is not declared in the device config file

Comment on lines +321 to +329
AspectExt_intel_pci_address, AspectExt_intel_device_id,
AspectExt_intel_memory_clock_rate, AspectExt_intel_memory_bus_width, AspectExt_intel_free_memory];

defvar HipUSMAspects = [
AspectUsm_device_allocations, AspectUsm_host_allocations,
AspectUsm_shared_allocations, AspectUsm_atomic_host_allocations,
AspectUsm_atomic_shared_allocations
];

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From CI:

# | Checking consistency of aspects for device amd_gpu_gfx1030...
# | error: amd_gpu_gfx1030 does not support aspect ext_intel_max_mem_bandwidth but it is declared in the device config file
# | error: amd_gpu_gfx1030 does not support aspect usm_system_allocations but it is declared in the device config file

@jzc jzc temporarily deployed to WindowsCILock January 7, 2025 22:21 — with GitHub Actions Inactive
@jzc jzc temporarily deployed to WindowsCILock January 7, 2025 23:04 — with GitHub Actions Inactive
@jzc jzc temporarily deployed to WindowsCILock January 9, 2025 15:34 — with GitHub Actions Inactive
@jzc jzc temporarily deployed to WindowsCILock January 9, 2025 15:34 — with GitHub Actions Inactive
@jzc jzc temporarily deployed to WindowsCILock January 9, 2025 16:07 — with GitHub Actions Inactive
@jzc jzc temporarily deployed to WindowsCILock January 9, 2025 16:26 — with GitHub Actions Inactive
Copy link
Contributor

@AlexeySachkov AlexeySachkov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ideally, we should also have a test which ensures consistency of values we can pass into -fsycl-targets and device config file and device_architecture extension, but that can be added as a follow-up. What's proposed here is already a good starting point to spot inconsistencies.

I'm a little concerned about us using std::string_view instead of llvm::StringRef, but at the same time I don't see anything which would prohibit us from using it. Moreover, an RFC to transition LLVM to std::string_view exists as well: https://discourse.llvm.org/t/migrating-llvm-stringref-to-std-string-view/82785

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants