Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Explicitly specify used NVTX3 C++ wrapper version #1688

Closed

Conversation

bernhardmgruber
Copy link
Contributor

@bernhardmgruber bernhardmgruber requested review from a team as code owners May 2, 2024 12:59
@bernhardmgruber bernhardmgruber added the cub For all items related to CUB label May 2, 2024
@gevtushenko
Copy link
Collaborator

@bernhardmgruber how does this change interact with user code relying on nvtx API?

Copy link
Collaborator

@gevtushenko gevtushenko left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I rushed approving the PR. Answering my own question, the change doesn't play well with user code:

#include <cub/device/device_reduce.cuh>
#include <nvtx3/nvtx3.hpp>

struct my_domain{ static constexpr char const* name{"my domain"}; };

int main() {
  nvtx3::scoped_range_in<my_domain> domain; // fails to compile
}

@bernhardmgruber
Copy link
Contributor Author

#include <cub/device/device_reduce.cuh>
#include <nvtx3/nvtx3.hpp>

struct my_domain{ static constexpr char const* name{"my domain"}; };

int main() {
  nvtx3::scoped_range_in<my_domain> domain; // fails to compile
}

Where would the #include <nvtx3/nvtx3.hpp> resolve to? This header does not ship in any CTK and is also not present in this repository. Funnily though, if I compile this code, nvcc errors the same way as you described and does not get stuck at the second include (which does not exist).

@bernhardmgruber
Copy link
Contributor Author

@bernhardmgruber how does this change interact with user code relying on nvtx API?

After looking through nvtx3.hpp, I understand the header to work as follows:

  • all symbols are exported in namespace nvtx3::v1, and the namespace v1 is marked as inline. Therefore, all symbols are available as nvtx3::* and nvtx3::v1::*.
  • when NVTX3_CPP_REQUIRE_EXPLICIT_VERSION is defined before including the header, v1 is not marked as inline. Symbols are only available as nvtx3::v1::*.
  • nvtx3.hpp does not have traditional header guards and remembers which API declarations it already provided instead. It defines NVTX3_CPP_DEFINITIONS_V1_0 for example. So if a different nvtx3.hpp would be included that defines a newer API, that header would not emit the V1 declarations, but only add declarations of later versions.
  • API declarations are only emitted once per version
  • Whether the version namespaces are inline or not can only be chosen at the time the declarations are emitted.

Therefore, when we define NVTX3_CPP_REQUIRE_EXPLICIT_VERSION and then include nvtx3.hpp, it will emit a non-inlined V1 API. Then we #undef NVTX3_CPP_REQUIRE_EXPLICIT_VERSION. When the user later includes an nvtx3.hpp containing the same API version (major and minor), the header does nothing. If the user includes an nvtx3.hpp containing a different API version, that version will be emitted with inline namespace in addition to the explicit version that the include from CUB emitted.

Bottom line: this PR breaks user-side use of NVTX3 if they happen to use the non-explicit API (which is the default) and use the same major and minor API version (also the default because only API 1.0 exists).

@bernhardmgruber bernhardmgruber marked this pull request as draft May 2, 2024 15:53
@bernhardmgruber
Copy link
Contributor Author

#include <cub/device/device_reduce.cuh>
#include <nvtx3/nvtx3.hpp>

struct my_domain{ static constexpr char const* name{"my domain"}; };

int main() {
  nvtx3::scoped_range_in<my_domain> domain; // fails to compile
}

Where would the #include <nvtx3/nvtx3.hpp> resolve to? This header does not ship in any CTK and is also not present in this repository. Funnily though, if I compile this code, nvcc errors the same way as you described and does not get stuck at the second include (which does not exist).

After playing with this for a while and supplying nvtx3.hpp, either via an include path pointing to a cloned NVTX repository or our vendored nvtx3.hpp, I conclude that we cannot define NVTX3_CPP_REQUIRE_EXPLICIT_VERSION as recommended by the NVTX documentation, because it can break usercode.

Copy link

copy-pr-bot bot commented May 3, 2024

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@bernhardmgruber
Copy link
Contributor Author

As expected, with the test from #1690, this PR now fails with:

cub/test/CMakeFiles/cub.cpp17.test.nvtx_in_usercode.dir/test_nvtx_in_usercode.cu.o
  /home/coder/cccl/cub/test/test_nvtx_in_usercode.cu(17): error: namespace "nvtx3" has no member "scoped_range"
      nvtx3::scoped_range range("user-range");
             ^
  
  1 error detected in the compilation of "/home/coder/cccl/cub/test/test_nvtx_in_usercode.cu".

We thus cannot use the explicitly versioned API from the NVTX3 C++ wrapper.

@gevtushenko
Copy link
Collaborator

@bernhardmgruber I believe this was closed prematurely. We discussed that we don't need NVTX3_CPP_REQUIRE_EXPLICIT_VERSION in CUB sources, but I think we need explicit ::v1 when referring to nvtx. Without ::v1 the following code breaks:

#define NVTX3_CPP_REQUIRE_EXPLICIT_VERSION
#include <nvtx3/nvtx3.hpp>
#include <cub/device/device_reduce.cuh>

int main() {}

@bernhardmgruber
Copy link
Contributor Author

bernhardmgruber commented May 7, 2024

I think we need explicit ::v1 when referring to nvtx. Without ::v1 the following code breaks:

#define NVTX3_CPP_REQUIRE_EXPLICIT_VERSION
#include <nvtx3/nvtx3.hpp>
#include <cub/device/device_reduce.cuh>

int main() {}

This code does indeed break. However, CUB internally does this:

#  if __has_include(<nvtx3/nvtx3.hpp>)
#    include <nvtx3/nvtx3.hpp>
#  else // __has_include(<nvtx3/nvtx3.hpp>)
#    include "nvtx3.hpp"
#  endif // __has_include(<nvtx3/nvtx3.hpp>)

to choose whether to pick the vendored nvtx3.hpp (V1), or take the one available in the include directories (e.g. from the CTK, could be any version). So if CUB would use explicitely versioned symbols like:

::nvtx3::v1::scoped_range_in<...>

and the file <nvtx3/nvtx3.hpp> happens to exist in the include directories and provide a version different than V1. CUB would break, because it would not take the vendored nvtx3.hpp, but the user provided and version incompatible <nvtx3/nvtx3.hpp>.

@bernhardmgruber
Copy link
Contributor Author

bernhardmgruber commented May 7, 2024

As discussed above, we can also not ship our own nvtx3.hpp that we use with an explicit version (define NVTX3_CPP_REQUIRE_EXPLICIT_VERSION, include, undef), becasue if the user happens to use the same version as we use internally but wants to have the implicit versions, the user is broken as well.

I see no way how to create a situation where neither CUB nor the user can be broken in some way when various or equal, explicit or implicit nvtx3 versions are in the game, without modifiying nvtx3.hpp to allow including the same header multiple times and letting it re-emit the API when NVTX3_CPP_REQUIRE_EXPLICIT_VERSION is defined or not. So this has to be fixed upstream in NVTX. Should we attempt that?

The root culprit is simply that whatever API version CUB uses, the user cannot have the same API version with the opposite explicitness.

@bernhardmgruber
Copy link
Contributor Author

@gevtushenko I created an issue that covers the failing program you listed above: #1750

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cub For all items related to CUB
Projects
Archived in project
Development

Successfully merging this pull request may close these issues.

2 participants