Change default KvikIO parameters in cuDF: set the thread pool size to 4, and compatibility mode to ON #17185

Merged
9 changes: 6 additions & 3 deletions cpp/include/cudf/io/config_utils.hpp
@@ -37,10 +37,13 @@ bool is_gds_enabled();
 bool is_kvikio_enabled();

 /**
- * @brief Set kvikIO thread pool size according to the environment variable KVIKIO_NTHREADS. If
- * KVIKIO_NTHREADS is not set, use 8 threads by default.
+ * @brief Set KvikIO parameters, including:
+ * - Compatibility mode, according to the environment variable KVIKIO_COMPAT_MODE. If
+ * KVIKIO_COMPAT_MODE is not set, enable it by default, which enforces the use of POSIX I/O.
+ * - Thread pool size, according to the environment variable KVIKIO_NTHREADS. If KVIKIO_NTHREADS is
+ * not set, use 4 threads by default.
  */
-void set_thread_pool_nthreads_from_env();
+void set_up_kvikio();

 } // namespace cufile_integration

7 changes: 5 additions & 2 deletions cpp/src/io/utilities/config_utils.cpp
@@ -52,11 +52,14 @@ bool is_gds_enabled() { return is_always_enabled() or get_env_policy() == usage_

 bool is_kvikio_enabled() { return get_env_policy() == usage_policy::KVIKIO; }

-void set_thread_pool_nthreads_from_env()
+void set_up_kvikio()
 {
   static std::once_flag flag{};
   std::call_once(flag, [] {
-    auto nthreads = getenv_or<unsigned int>("KVIKIO_NTHREADS", 8U);
+    auto const compat_mode = kvikio::detail::getenv_or<bool>("KVIKIO_COMPAT_MODE", true);
+    kvikio::defaults::compat_mode_reset(compat_mode);
+
+    auto const nthreads = getenv_or<unsigned int>("KVIKIO_NTHREADS", 4u);
     kvikio::defaults::thread_pool_nthreads_reset(nthreads);
   });
 }
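
The pattern in `set_up_kvikio()` (read each setting from the environment once, fall back to a default, and guard the whole step with `std::call_once`) can be tried in isolation. The following is only a minimal sketch under stated assumptions: it does not link against KvikIO, the `SKETCH_*` variable names, the `*_sketch` helpers, and the settings struct are all hypothetical, and the accepted boolean spellings are an assumption rather than KvikIO's documented parsing rules.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <cstdlib>
#include <mutex>
#include <stdexcept>
#include <string>

// Hypothetical helper: read an unsigned integer from the environment, or use the default.
unsigned int getenv_or_uint_sketch(char const* name, unsigned int default_value)
{
  char const* raw = std::getenv(name);
  if (raw == nullptr) { return default_value; }
  return static_cast<unsigned int>(std::stoul(raw));  // throws on non-numeric input
}

// Hypothetical helper: read a boolean flag. The accepted spellings are an assumption.
bool getenv_or_bool_sketch(char const* name, bool default_value)
{
  char const* raw = std::getenv(name);
  if (raw == nullptr) { return default_value; }
  std::string value{raw};
  std::transform(value.begin(), value.end(), value.begin(), [](unsigned char c) {
    return static_cast<char>(std::tolower(c));
  });
  if (value == "on" || value == "true" || value == "1") { return true; }
  if (value == "off" || value == "false" || value == "0") { return false; }
  throw std::invalid_argument(std::string{"unrecognized boolean value for "} + name);
}

// Hypothetical stand-in for the library-wide defaults that the real code resets.
struct io_settings_sketch {
  bool compat_mode{false};
  unsigned int nthreads{0};
};

io_settings_sketch& settings_sketch()
{
  static io_settings_sketch settings;
  return settings;
}

// Counts how many times the guarded block actually ran.
int& setup_run_count_sketch()
{
  static int count = 0;
  return count;
}

// Mirrors the shape of set_up_kvikio(): the lambda body runs at most once per process.
void set_up_io_sketch()
{
  static std::once_flag flag{};
  std::call_once(flag, [] {
    auto& settings       = settings_sketch();
    settings.compat_mode = getenv_or_bool_sketch("SKETCH_COMPAT_MODE", true);
    settings.nthreads    = getenv_or_uint_sketch("SKETCH_NTHREADS", 4u);
    ++setup_run_count_sketch();
  });
}
```

One consequence of the `std::call_once` guard is that the environment is consulted only on the first call. In this sketch, changing `SKETCH_NTHREADS` after the first `set_up_io_sketch()` call has no effect on the stored settings.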
2 changes: 1 addition & 1 deletion cpp/src/io/utilities/data_sink.cpp
@@ -42,7 +42,7 @@ class file_sink : public data_sink {
     if (!_output_stream.is_open()) { detail::throw_on_file_open_failure(filepath, true); }

     if (cufile_integration::is_kvikio_enabled()) {
-      cufile_integration::set_thread_pool_nthreads_from_env();
+      cufile_integration::set_up_kvikio();
       _kvikio_file = kvikio::FileHandle(filepath, "w");
       CUDF_LOG_INFO("Writing a file using kvikIO, with compatibility mode {}.",
                     _kvikio_file.is_compat_mode_on() ? "on" : "off");
2 changes: 1 addition & 1 deletion cpp/src/io/utilities/datasource.cpp
@@ -48,7 +48,7 @@ class file_source : public datasource {
   {
     detail::force_init_cuda_context();
     if (cufile_integration::is_kvikio_enabled()) {
-      cufile_integration::set_thread_pool_nthreads_from_env();
+      cufile_integration::set_up_kvikio();
       _kvikio_file = kvikio::FileHandle(filepath);
       CUDF_LOG_INFO("Reading a file using kvikIO, with compatibility mode {}.",
                     _kvikio_file.is_compat_mode_on() ? "on" : "off");
22 changes: 17 additions & 5 deletions docs/cudf/source/user_guide/io/io.md
@@ -91,15 +91,27 @@ SDK is available for download
 [here](https://developer.nvidia.com/gpudirect-storage). GDS is also
 included in CUDA Toolkit 11.4 and higher.

-Use of GPUDirect Storage in cuDF is enabled by default, but can be
-disabled through the environment variable `LIBCUDF_CUFILE_POLICY`.
+Use of GPUDirect Storage in cuDF is disabled by default, but can be
+enabled through the environment variable `LIBCUDF_CUFILE_POLICY`.
 This variable also controls the GDS compatibility mode.

 There are four valid values for the environment variable:

-- "GDS": Enable GDS use; GDS compatibility mode is *off*.
-- "ALWAYS": Enable GDS use; GDS compatibility mode is *on*.
-- "KVIKIO": Enable GDS through [KvikIO](https://github.com/rapidsai/kvikio).
+- "GDS": Enable GDS use. If the cuFile library cannot be properly loaded,
+  fall back to the GDS compatibility mode.
+- "ALWAYS": Enable GDS use. If the cuFile library cannot be properly loaded,
+  throw an exception.
+- "KVIKIO": Enable GDS compatibility mode through [KvikIO](https://github.com/rapidsai/kvikio).
+  Note that KvikIO also provides the environment variable `KVIKIO_COMPAT_MODE` for GDS
+  control that may alter the effect of "KVIKIO" option in cuDF:
+  - By default, `KVIKIO_COMPAT_MODE` is unset. In this case, cuDF enforces
+    the GDS compatibility mode, and the system configuration check for GDS I/O
+    is never performed.
+  - If `KVIKIO_COMPAT_MODE=ON`, this is the same with the above case.
+  - If `KVIKIO_COMPAT_MODE=OFF`, KvikIO enforces GDS I/O without system
+    configuration check, and will error out if GDS requirements are not met. The
+    only exceptional case is that if the system does not support files being
+    opened with the `O_DIRECT` flag, the GDS compatibility mode will be used.
 - "OFF": Completely disable GDS use.

 If no value is set, behavior will be the same as the "KVIKIO" option.
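
The four documented values and the unset-means-"KVIKIO" rule can be sketched as a small parsing function. This is an illustration only, not the libcudf implementation: the `*_sketch` names are hypothetical, and throwing on an unrecognized value is an assumption about how invalid input might be handled.

```cpp
#include <cassert>
#include <cstdlib>
#include <stdexcept>
#include <string>

// Hypothetical enum mirroring the four documented policy values.
enum class usage_policy_sketch { OFF, GDS, ALWAYS, KVIKIO };

// Map the raw environment value to a policy. A null pointer means the variable is
// unset, which the documentation says behaves like "KVIKIO".
usage_policy_sketch parse_policy_sketch(char const* raw)
{
  if (raw == nullptr) { return usage_policy_sketch::KVIKIO; }
  std::string const value{raw};
  if (value == "OFF") { return usage_policy_sketch::OFF; }
  if (value == "GDS") { return usage_policy_sketch::GDS; }
  if (value == "ALWAYS") { return usage_policy_sketch::ALWAYS; }
  if (value == "KVIKIO") { return usage_policy_sketch::KVIKIO; }
  // Assumption: reject anything else instead of silently picking a default.
  throw std::invalid_argument("Invalid LIBCUDF_CUFILE_POLICY value: " + value);
}

// Read the policy from the process environment.
usage_policy_sketch policy_from_env_sketch()
{
  return parse_policy_sketch(std::getenv("LIBCUDF_CUFILE_POLICY"));
}

// "GDS" and "ALWAYS" both enable direct GDS use; they differ only in what happens
// when the cuFile library cannot be loaded (fallback versus exception).
bool is_gds_enabled_sketch(usage_policy_sketch policy)
{
  return policy == usage_policy_sketch::GDS || policy == usage_policy_sketch::ALWAYS;
}

// Only the "KVIKIO" policy routes file I/O through KvikIO.
bool is_kvikio_enabled_sketch(usage_policy_sketch policy)
{
  return policy == usage_policy_sketch::KVIKIO;
}
```

With this sketch, `is_kvikio_enabled_sketch(policy_from_env_sketch())` is true when `LIBCUDF_CUFILE_POLICY` is unset, which is the case where the `KVIKIO_COMPAT_MODE` default described above becomes relevant.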