Change default KvikIO parameters in cuDF: set the thread pool size to 4, and compatibility mode to ON #17185

Merged
9 changes: 6 additions & 3 deletions cpp/include/cudf/io/config_utils.hpp
@@ -37,10 +37,13 @@ bool is_gds_enabled();
bool is_kvikio_enabled();

/**
- * @brief Set kvikIO thread pool size according to the environment variable KVIKIO_NTHREADS. If
- * KVIKIO_NTHREADS is not set, use 8 threads by default.
+ * @brief Set KvikIO parameters, including:
+ * - Compatibility mode, according to the environment variable KVIKIO_COMPAT_MODE. If
+ * KVIKIO_COMPAT_MODE is not set, enable it by default, which enforces the use of POSIX I/O.
+ * - Thread pool size, according to the environment variable KVIKIO_NTHREADS. If KVIKIO_NTHREADS is
+ * not set, use 4 threads by default.
*/
-void set_thread_pool_nthreads_from_env();
+void set_up_kvikio();

} // namespace cufile_integration

7 changes: 5 additions & 2 deletions cpp/src/io/utilities/config_utils.cpp
@@ -52,11 +52,14 @@ bool is_gds_enabled() { return is_always_enabled() or get_env_policy() == usage_

bool is_kvikio_enabled() { return get_env_policy() == usage_policy::KVIKIO; }

-void set_thread_pool_nthreads_from_env()
+void set_up_kvikio()
{
static std::once_flag flag{};
std::call_once(flag, [] {
-auto nthreads = getenv_or<unsigned int>("KVIKIO_NTHREADS", 8U);
+auto const compat_mode = kvikio::detail::getenv_or<bool>("KVIKIO_COMPAT_MODE", true);
+kvikio::defaults::compat_mode_reset(compat_mode);
+
+auto const nthreads = getenv_or<unsigned int>("KVIKIO_NTHREADS", 4u);
kvikio::defaults::thread_pool_nthreads_reset(nthreads);
});
}
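
The pattern in this hunk (read an environment variable with a fallback, then apply the settings once per process) can be illustrated with a minimal, self-contained sketch. Everything named `*_sketch`, the `settings()` holder, and the `SKETCH_*` variables are hypothetical stand-ins and are not part of cuDF or KvikIO. The parsing is also simplified: unlike whatever the real `getenv_or` helpers accept, this version only understands what `operator>>` can read, so a boolean must be `0` or `1`.

```cpp
#include <cstdlib>
#include <mutex>
#include <sstream>
#include <string>

// Hypothetical stand-in for a getenv_or helper: return the parsed value of an
// environment variable, or `fallback` when the variable is not set.
template <typename T>
T getenv_or_sketch(char const* name, T fallback)
{
  char const* raw = std::getenv(name);
  if (raw == nullptr) { return fallback; }
  std::istringstream stream{std::string{raw}};
  T value{};
  stream >> value;
  // Assumption of this sketch: unparsable text silently yields the fallback.
  return stream.fail() ? fallback : value;
}

// Hypothetical settings holder standing in for the library-wide defaults.
struct io_settings_sketch {
  bool compat_mode{false};
  unsigned int nthreads{0};
  int times_configured{0};
};

inline io_settings_sketch& settings()
{
  static io_settings_sketch instance{};
  return instance;
}

// Same shape as the function in the diff: the lambda body runs at most once,
// no matter how many readers or writers call the set-up function.
inline void set_up_io_sketch()
{
  static std::once_flag flag{};
  std::call_once(flag, [] {
    auto& s       = settings();
    s.compat_mode = getenv_or_sketch<bool>("SKETCH_COMPAT_MODE", true);
    s.nthreads    = getenv_or_sketch<unsigned int>("SKETCH_NTHREADS", 4u);
    ++s.times_configured;
  });
}
```

One consequence of the `std::call_once` guard is worth noting: environment variables changed after the first call have no effect, so in this sketch they would need to be set before the first file is opened.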
2 changes: 1 addition & 1 deletion cpp/src/io/utilities/data_sink.cpp
@@ -42,7 +42,7 @@ class file_sink : public data_sink {
if (!_output_stream.is_open()) { detail::throw_on_file_open_failure(filepath, true); }

if (cufile_integration::is_kvikio_enabled()) {
-cufile_integration::set_thread_pool_nthreads_from_env();
+cufile_integration::set_up_kvikio();
_kvikio_file = kvikio::FileHandle(filepath, "w");
CUDF_LOG_INFO("Writing a file using kvikIO, with compatibility mode {}.",
_kvikio_file.is_compat_mode_on() ? "on" : "off");
2 changes: 1 addition & 1 deletion cpp/src/io/utilities/datasource.cpp
@@ -48,7 +48,7 @@ class file_source : public datasource {
{
detail::force_init_cuda_context();
if (cufile_integration::is_kvikio_enabled()) {
-cufile_integration::set_thread_pool_nthreads_from_env();
+cufile_integration::set_up_kvikio();
_kvikio_file = kvikio::FileHandle(filepath);
CUDF_LOG_INFO("Reading a file using kvikIO, with compatibility mode {}.",
_kvikio_file.is_compat_mode_on() ? "on" : "off");
14 changes: 9 additions & 5 deletions docs/cudf/source/user_guide/io/io.md
@@ -91,15 +91,19 @@ SDK is available for download
[here](https://developer.nvidia.com/gpudirect-storage). GDS is also
included in CUDA Toolkit 11.4 and higher.

-Use of GPUDirect Storage in cuDF is enabled by default, but can be
-disabled through the environment variable `LIBCUDF_CUFILE_POLICY`.
+Use of GPUDirect Storage in cuDF is disabled by default, but can be
+enabled through the environment variable `LIBCUDF_CUFILE_POLICY`.
This variable also controls the GDS compatibility mode.

There are four valid values for the environment variable:

-- "GDS": Enable GDS use; GDS compatibility mode is *off*.
-- "ALWAYS": Enable GDS use; GDS compatibility mode is *on*.
-- "KVIKIO": Enable GDS through [KvikIO](https://github.com/rapidsai/kvikio).
+- "GDS": Enable GDS use. If the cuFile library cannot be properly loaded,
+fall back to the GDS compatibility mode.
+- "ALWAYS": Enable GDS use. If the cuFile library cannot be properly loaded,
+throw an exception.
+- "KVIKIO": Enable GDS through [KvikIO](https://github.com/rapidsai/kvikio). If
+KvikIO detects that the system is not properly configured for GDS, the I/O will
+fall back to the GDS compatibility mode.
- "OFF": Completely disable GDS use.

If no value is set, behavior will be the same as the "KVIKIO" option.
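
As a rough illustration of the policy selection described in this documentation change, the sketch below maps the text of `LIBCUDF_CUFILE_POLICY` to one of the four values and treats an unset variable as "KVIKIO". The names `usage_policy_sketch`, `parse_policy_sketch`, and `policy_from_env_sketch` are hypothetical, and throwing on an unrecognized value is an assumption of the sketch rather than confirmed libcudf behavior.

```cpp
#include <cstdlib>
#include <stdexcept>
#include <string>

// Hypothetical mirror of the four documented policy values.
enum class usage_policy_sketch { OFF, GDS, ALWAYS, KVIKIO };

// Map the raw text of LIBCUDF_CUFILE_POLICY to a policy.
// A null pointer means the variable is not set.
inline usage_policy_sketch parse_policy_sketch(char const* raw)
{
  // Documented default: same behavior as the "KVIKIO" option.
  if (raw == nullptr) { return usage_policy_sketch::KVIKIO; }

  std::string const value{raw};
  if (value == "OFF") { return usage_policy_sketch::OFF; }
  if (value == "GDS") { return usage_policy_sketch::GDS; }
  if (value == "ALWAYS") { return usage_policy_sketch::ALWAYS; }
  if (value == "KVIKIO") { return usage_policy_sketch::KVIKIO; }

  // Assumption of this sketch: reject anything outside the four valid values.
  throw std::invalid_argument("Invalid LIBCUDF_CUFILE_POLICY value: " + value);
}

// Convenience wrapper that reads the variable from the process environment.
inline usage_policy_sketch policy_from_env_sketch()
{
  return parse_policy_sketch(std::getenv("LIBCUDF_CUFILE_POLICY"));
}
```

Keeping the parsing separate from the `std::getenv` call makes the mapping easy to exercise without touching the process environment.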