Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improvements to align CTS and Spec for Device #2597

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

martygrant
Copy link
Contributor

@martygrant martygrant commented Jan 22, 2025

  • Rework urDeviceGetInfoTest to move all enums to their own tests instead of a switch case - Restructure GetInfo CTS tests to use separate tests instead of a switch #2290
  • Remove the UR/OpenCL Device enum map function - no point maintaining both a switch case and a separate mapping function
  • Update some spec wording for consistency
  • Add missing Device info enums to OpenCL adapater
  • Add new urDevicePartition test for checking UR_DEVICE_INFO_PARENT_DEVICE
  • Move UUR_RETURN_ON_FATAL_FAILURE and UUR_ASSERT_SUCCESS_OR_UNSUPPORTED to join similar macros in checks.h
  • Update spec to say info queries that return char[] are null-terminated
  • Update UR_DEVICE_INFO_MAX_IMAGE_LINEAR_WIDTH/HEIGHT/PITCH should be uint32_t, matching SYCL RT
  • Add missing unsupported device enums to HIP adapter

intel/llvm#16746

@github-actions github-actions bot added loader Loader related feature/bug conformance Conformance test suite issues. images UR images specification Changes or additions to the specification experimental Experimental feature additions/changes/specification level-zero L0 adapter specific issues cuda CUDA adapter specific issues hip HIP adapter specific issues opencl OpenCL adapter specific issues native-cpu Native CPU adapter specific issues labels Jan 22, 2025
@martygrant martygrant force-pushed the martin/device-cts-spec-gap-redo branch 4 times, most recently from 37c3c32 to b8a9271 Compare January 23, 2025 10:29
@martygrant martygrant force-pushed the martin/device-cts-spec-gap-redo branch 3 times, most recently from 2cab273 to 1ce6d4e Compare January 23, 2025 16:17
@martygrant martygrant force-pushed the martin/device-cts-spec-gap-redo branch from 1ce6d4e to 56bec8d Compare January 24, 2025 16:14
@martygrant martygrant force-pushed the martin/device-cts-spec-gap-redo branch from 56bec8d to e153d00 Compare January 24, 2025 16:55
@martygrant martygrant force-pushed the martin/device-cts-spec-gap-redo branch 2 times, most recently from caa66b2 to 7d961b4 Compare January 27, 2025 11:05
@martygrant martygrant force-pushed the martin/device-cts-spec-gap-redo branch from 7d961b4 to b12864d Compare January 27, 2025 11:49
@martygrant martygrant marked this pull request as ready for review January 27, 2025 14:54
@martygrant martygrant requested review from a team as code owners January 27, 2025 14:54
@martygrant martygrant requested a review from npmiller January 27, 2025 14:54
@@ -381,11 +381,11 @@ etors:
[$x_device_handle_t[]] Return list of devices associated with a program.
This is either the list of devices associated with the context or a subset of those devices when the program is created using $xProgramCreateWithBinary.
- name: IL
desc: "[char[]] Return program IL if the program was created with $xProgramCreateWithIL, otherwise return size will be set to 0 and nothing will be returned."
desc: "[char[]] Return null-terminated program IL if the program was created with $xProgramCreateWithIL, otherwise return size will be set to 0 and nothing will be returned."
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the program ones aren't null terminated, you can have a null byte in a program

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks! I've updated and said these explicitly these aren't null terminated for extra clarify, if that's overkill let me know and I'll remove that

@martygrant martygrant force-pushed the martin/device-cts-spec-gap-redo branch from b12864d to 637c7f9 Compare January 29, 2025 11:32
Comment on lines 58 to 64
desc: "[size_t] returns the maximum linear width allowed for images allocated using USM"
desc: "[uint32_t] returns the maximum linear width allowed for images allocated using USM"
- name: MAX_IMAGE_LINEAR_HEIGHT_EXP
value: "0x2006"
desc: "[size_t] returns the maximum linear height allowed for images allocated using USM"
desc: "[uint32_t] returns the maximum linear height allowed for images allocated using USM"
- name: MAX_IMAGE_LINEAR_PITCH_EXP
value: "0x2007"
desc: "[size_t] returns the maximum linear pitch allowed for images allocated using USM"
Copy link
Contributor

@przemektmalon przemektmalon Jan 29, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These should be size_t, the types in the SYCL RT are incorrect and don't reflect the BI extension spec.

I've created a PR changing these in the RT here: intel/llvm#16829

Copy link
Contributor

@przemektmalon przemektmalon Jan 29, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

intel/llvm#16829 has been merged.

This change can be reverted back to using size_t.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've reverted these back to size_t, thanks!

ASSERT_SUCCESS(urDeviceGet(platform, UR_DEVICE_TYPE_ALL, device_count,
devices.data(), nullptr));

for (const ur_device_handle_t device : devices) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this should have its own PlatformTest derived fixture, right now it's effectively doing

for (const ur_device_handle_t device : devices) {
  for (const ur_device_handle_t device : devices) {

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added a new test suite for this, using urAllDevicesTest

std::vector<char> property_value(property_size, 'x');
ASSERT_SUCCESS(urDeviceGetInfo(device, property_name, property_size,
property_value.data(), nullptr));
EXPECT_NE(property_value[0], 'x');
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we do this by initializing the vector with null characters instead? unless there's a spec somewhere that says the first byte of a pci address can't be 'x'. Same goes for other queries with this check

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good idea, an issue was pointed out with testing char returns in #2626 (comment) and I've added a new helper function stringPropertyIsValid for this

ASSERT_SUCCESS(urDeviceGetInfo(device, property_name, property_size,
&property_value, nullptr));

ASSERT_NE(property_value, 999);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unless the queries in question have specifically defined valid ranges I don't think we should be doing checks like this. 999 is unlikely but not necessarily invalid. I realize having no certainty as to what value we're expecting makes it hard to test if the adapter is returning anything at all.. maybe we could do have a validation helper for these cases like this:

size_t property_value = 0;
(do query)
auto returned_value = property_value;
property_value = 1;
(do query again)
ASSERT_EQ(property_value, returned_value);

that way we're sure the adapter is returning a value without putting any undocumented restrictions on what the value is

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

totally makes sense but might this be a false positive if the value returned really is 1? what about using std::numeric_limits<size_t>::max() as an initial default value?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it's set up so that the initial value doesn't matter, as long as it changes between the two queries e.g.:

size_t property_value = 0;
(do query) - property_value is now 1 - verifying that the adapter is returning something
auto returned_value = property_value; // returned_value is now 1
property_value = 1; // change default value  - if the adapter returned 0 the first time this second call will verify that it's returning a value when it overwrites
(do query again) - property_value remains 1
ASSERT_EQ(property_value, returned_value); // this ensures the return is consistent between the two calls

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for this, have added a new macro ASSERT_GET_INFO as a helper to validate these kind of queries

- Rework urDeviceGetInfoTest to move all enums to their own tests instead of a switch case - oneapi-src#2290
- Remove the UR/OpenCL Device enum map function - no point maintaining both a switch case and a separate mapping function
- Update some spec wording for consistency
- Add missing Device info enums to OpenCL adapater
- Add new urDevicePartition test for checking UR_DEVICE_INFO_PARENT_DEVICE
- Move UUR_RETURN_ON_FATAL_FAILURE and UUR_ASSERT_SUCCESS_OR_UNSUPPORTED to join similar macros in checks.h
- Update spec to say info queries that return char[] are null-terminated
- Update UR_DEVICE_INFO_MAX_IMAGE_LINEAR_WIDTH/HEIGHT/PITCH should be uint32_t, matching SYCL RT
- Add missing unsupported device enums to HIP adapter

Addressed feedback:
- Reverted bindless image types back to size_t
- Moved ComponentDevices test to a separate PlatformTest derived fixture
- Added new ASSERT_GET_INFO macro as a helper to validate simple numerical GetInfo queries
- Added new stringPropertyIsValid helper function to validate string queries
@martygrant martygrant force-pushed the martin/device-cts-spec-gap-redo branch from 637c7f9 to 0ff7ae2 Compare January 31, 2025 10:59
martygrant added a commit to martygrant/unified-runtime that referenced this pull request Jan 31, 2025
…of a switch and rework various urXGetInfo tests:

- Housekeeping pass to tidy up tests, renaming and adding consts where appropriate
- Added ASSERT_INFO_CALL and uur::stringPropertyIsValid helpers from oneapi-src#2597
- Used above helpers across all tests to better validate numerical and char[] queries, and reworked some char[] queries
- Removed any non useful checks on vector sizes after passing them into a urXGetInfo query
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
conformance Conformance test suite issues. cuda CUDA adapter specific issues experimental Experimental feature additions/changes/specification hip HIP adapter specific issues images UR images level-zero L0 adapter specific issues loader Loader related feature/bug native-cpu Native CPU adapter specific issues opencl OpenCL adapter specific issues specification Changes or additions to the specification
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants