Clean up libcu++ docs landing page (#1492)
* Clean up main libcu++ docs.
jrhemstad authored Mar 13, 2024
1 parent ae0ee04 commit 0ff0f61
Showing 1 changed file with 57 additions and 74 deletions.
131 changes: 57 additions & 74 deletions libcudacxx/docs/overview.md
@@ -1,50 +1,71 @@
# libcu++: The C++ Standard Library for Your Entire System
# libcu++

<table><tr>
<th><b><a href="https://github.com/nvidia/libcudacxx/tree/main/examples">Examples</a></b></th>
<th><b><a href="https://godbolt.org/z/Kns9vhPEr">Godbolt</a></b></th>
<th><b><a href="https://nvidia.github.io/libcudacxx">Documentation</a></b></th>
</tr></table>
`libcu++` (`libcudacxx`) provides fundamental, idiomatic C++ abstractions that aim to make the lives of CUDA C++ developers easier.

**libcu++, the NVIDIA C++ Standard Library, is the C++ Standard Library for
your entire system.**
It provides a heterogeneous implementation of the C++ Standard Library that can
be used in and between CPU and GPU code.
Specifically, `libcu++` provides:
- C++ Standard Library features usable in both host and device code
- Extensions to C++ Standard Library features
- Fundamental, CUDA-specific programming model abstractions

If you know how to use your C++ Standard Library, then you know how to use
libcu++.
All you have to do is add `cuda/std/` to the start of your Standard Library
includes and `cuda::` before any uses of `std::`:
## C++ Standard Library Features

If you are a C++ developer, then you know the C++ Standard Library ([sometimes referred to as "The STL"](https://stackoverflow.com/questions/5205491/whats-the-difference-between-stl-and-c-standard-library)) as what comes along with your compiler and provides things like `std::string` or `std::vector` or `std::atomic`.
It provides the fundamental abstractions that C++ developers need to build high quality applications and libraries.

By default, these abstractions aren't available when writing CUDA C++ device code because they don't have the necessary `__host__ __device__` decorators, and their implementations may not be suitable for use in and across host and device code.

libcu++ aims to solve this problem by providing an opt-in, incremental, heterogeneous implementation of C++ Standard Library features:
1. **Opt-in**: It does not replace the Standard Library provided by your host compiler (aka anything in `std::`)
2. **Incremental**: It does not provide a complete C++ Standard Library implementation
3. **Heterogeneous**: It works in both host and device code, and its objects can be passed between host and device code.

If you know how to use things like the `<atomic>` or `<type_traits>` headers from the C++ Standard Library, then you know how to use libcu++.

All you have to do is add `cuda/std/` to the start of your includes and `cuda::` before any uses of `std::`:

```cuda
#include <cuda/std/atomic>
cuda::std::atomic<int> x;
```
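
As a minimal sketch of the include-and-namespace rule above, the following program uses the same `cuda::std::atomic<int>` object from a kernel and from host code. The kernel name `count_threads`, the launch configuration, and the use of managed memory are illustrative assumptions, not something the libcu++ docs prescribe.

```cuda
#include <cuda/std/atomic>
#include <cstdio>
#include <new>

// Hypothetical kernel: every thread increments the counter once.
__global__ void count_threads(cuda::std::atomic<int>* counter) {
  counter->fetch_add(1, cuda::std::memory_order_relaxed);
}

int main() {
  cuda::std::atomic<int>* counter = nullptr;
  // Assumption: managed memory is used only to keep the sketch short;
  // it makes the object reachable from both host and device code.
  if (cudaMallocManaged(&counter, sizeof(*counter)) != cudaSuccess) return 1;
  new (counter) cuda::std::atomic<int>(0);  // construct the atomic on the host

  count_threads<<<4, 64>>>(counter);
  if (cudaDeviceSynchronize() != cudaSuccess) return 1;

  // The host reads the result through the same interface the device used.
  std::printf("threads counted: %d\n", counter->load());

  cudaFree(counter);
  return 0;
}
```

The only differences from host-only C++ are the `cuda/std/` include prefix and the `cuda::` namespace prefix; the member functions are the ones from `<atomic>`.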

The NVIDIA C++ Standard Library is an open source project; it is available on
[GitHub] and included in the NVIDIA HPC SDK and CUDA Toolkit.
If you have one of those SDKs installed, no additional installation or compiler
flags are needed to use libcu++.
> [!NOTE]
> libcu++ does not provide its own documentation for Standard Library features.
> Instead, libcu++ [documents which Standard Library headers](https://nvidia.github.io/cccl/libcudacxx/standard_api.html) are made available, and defers documentation of individual features within those headers to other sources like [cppreference](https://en.cppreference.com/w/).
## C++ Standard Library Extensions

libcu++ provides CUDA C++ developers with familiar Standard Library utilities to improve productivity and flatten the learning curve of CUDA.
However, there are many aspects of writing high-performance CUDA C++ code that cannot be expressed through purely Standard conforming APIs.
For these cases, libcu++ also provides _extensions_ of Standard Library utilities.

For example, libcu++ extends `atomic<T>` and other synchronization primitives with the notion of a "thread scope" that controls the strength of the memory fence.

To use utilities that are extensions to Standard Library features, drop the `std`:
```cuda
#include <cuda/atomic>
cuda::atomic<int, cuda::thread_scope_device> x;
```
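
To illustrate how a thread scope might be chosen, here is a hedged sketch that counts matching elements in two stages: threads of a block combine their results at block scope, and one thread per block then publishes the partial count at device scope. It uses `cuda::atomic_ref`, the reference-style counterpart of `cuda::atomic` from the same header, so that plain `int` storage can be used. The kernel name `count_key`, the data layout, and the managed allocations are assumptions made for the example.

```cuda
#include <cuda/atomic>
#include <cuda/std/atomic>
#include <cstdio>

// Hypothetical kernel: counts the elements of `data` equal to `key`.
__global__ void count_key(const int* data, int n, int key, int* total) {
  __shared__ int block_count;
  if (threadIdx.x == 0) block_count = 0;
  __syncthreads();

  const int i = blockIdx.x * blockDim.x + threadIdx.x;
  if (i < n && data[i] == key) {
    // Only threads of this block touch block_count, so block scope suffices.
    cuda::atomic_ref<int, cuda::thread_scope_block> local(block_count);
    local.fetch_add(1, cuda::std::memory_order_relaxed);
  }
  __syncthreads();

  if (threadIdx.x == 0) {
    // Every block on the device updates *total, so device scope is needed.
    cuda::atomic_ref<int, cuda::thread_scope_device> global(*total);
    global.fetch_add(block_count, cuda::std::memory_order_relaxed);
  }
}

int main() {
  const int n = 1 << 16;
  int* data = nullptr;
  int* total = nullptr;
  // Assumption: managed memory keeps the host-side setup short.
  if (cudaMallocManaged(&data, n * sizeof(int)) != cudaSuccess) return 1;
  if (cudaMallocManaged(&total, sizeof(int)) != cudaSuccess) return 1;
  for (int i = 0; i < n; ++i) data[i] = i % 4;
  *total = 0;

  count_key<<<(n + 255) / 256, 256>>>(data, n, 3, total);
  if (cudaDeviceSynchronize() != cudaSuccess) return 1;

  std::printf("matches: %d (host reference: %d)\n", *total, n / 4);
  cudaFree(data);
  cudaFree(total);
  return 0;
}
```

Picking the narrowest scope that covers all participating threads is the design choice being sketched; a wider scope would also be correct here but may cost more.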

See the [Extended API](extended_api.md) section for more information.

## Fundamental CUDA-specific Abstractions

Some abstractions that libcu++ provides have no equivalent in the C++ Standard Library, but are otherwise abstractions fundamental to the CUDA C++ programming model.

For example, [`cuda::memcpy_async`](extended_api/asynchronous_operations/memcpy_async.md) is a vital abstraction for asynchronous data movement between global and shared memory.
This abstracts hardware features such as `LDGSTS` on Ampere, and the Tensor Memory Accelerator (TMA) on Hopper.
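
The following sketch shows one common way `cuda::memcpy_async` is paired with a block-scope `cuda::barrier` to stage a tile of global memory in shared memory. The kernel name `scale_tile`, the tile size, and the managed allocations are assumptions for illustration, and the sketch assumes a GPU and toolchain that support `cuda::barrier` in device code.

```cuda
#include <cooperative_groups.h>
#include <cuda/barrier>
#include <cstddef>
#include <cstdio>

namespace cg = cooperative_groups;

// Hypothetical kernel: each block copies one tile into shared memory
// asynchronously, waits for it, then writes the scaled values back.
__global__ void scale_tile(const float* in, float* out, float factor) {
  extern __shared__ float tile[];  // blockDim.x floats, sized at launch
  __shared__ cuda::barrier<cuda::thread_scope_block> bar;

  auto block = cg::this_thread_block();
  if (block.thread_rank() == 0) {
    init(&bar, block.size());  // one expected arrival per thread in the block
  }
  block.sync();

  const std::size_t offset = static_cast<std::size_t>(blockIdx.x) * blockDim.x;

  // The whole block cooperatively issues the copy; completion is tracked by bar.
  cuda::memcpy_async(block, tile, in + offset, sizeof(float) * block.size(), bar);
  bar.arrive_and_wait();  // the tile is safe to read after this returns

  out[offset + threadIdx.x] = factor * tile[threadIdx.x];
}

int main() {
  const int threads = 128;
  const int blocks = 8;
  const int n = threads * blocks;

  float* in = nullptr;
  float* out = nullptr;
  // Assumption: managed memory keeps the host-side setup short.
  if (cudaMallocManaged(&in, n * sizeof(float)) != cudaSuccess) return 1;
  if (cudaMallocManaged(&out, n * sizeof(float)) != cudaSuccess) return 1;
  for (int i = 0; i < n; ++i) in[i] = static_cast<float>(i);

  scale_tile<<<blocks, threads, threads * sizeof(float)>>>(in, out, 2.0f);
  if (cudaDeviceSynchronize() != cudaSuccess) return 1;

  int mismatches = 0;
  for (int i = 0; i < n; ++i) {
    if (out[i] != 2.0f * in[i]) ++mismatches;
  }
  std::printf("mismatches: %d\n", mismatches);

  cudaFree(in);
  cudaFree(out);
  return mismatches == 0 ? 0 : 1;
}
```

The source code stays the same regardless of which hardware copy mechanism ends up being used, which is the point of the abstraction.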

## `cuda::` and `cuda::std::`
See the [Extended API](extended_api.md) section for more information.

When used with NVCC, NVIDIA C++ Standard Library facilities live in their own
header hierarchy and namespace with the same structure as, but distinct from,
the host compiler's Standard Library:
## Summary: `std::`, `cuda::` and `cuda::std::`

* `std::`/`<*>`: When using NVCC, this is your host compiler's Standard Library
that works in `__host__` code only, although you can use the
`--expt-relaxed-constexpr` flag to use any `constexpr` functions in
`__device__` code.
With NVCC, libcu++ does not replace or interfere with host compiler's
Standard Library.
* `std::`/`<*>`: This is your host compiler's Standard Library that works in `__host__` code only, although you can use the `--expt-relaxed-constexpr` flag to use any `constexpr` functions in `__device__` code. libcu++ does not replace or interfere with your host compiler's Standard Library.
* `cuda::std::`/`<cuda/std/*>`: Strictly conforming implementations of
facilities from the Standard Library that work in `__host__ __device__`
code.
* `cuda::`/`<cuda/*>`: Conforming extensions to the Standard Library that
work in `__host__ __device__` code.
* `cuda::device`/`<cuda/device/*>`: Conforming extensions to the Standard
Library that work only in `__device__` code.
* `cuda::`/`<cuda/*>`: Conforming extensions to the Standard Library that work in `__host__ __device__` code.
* `cuda::device`/`<cuda/device/*>`: Conforming extensions to the Standard Library that work only in `__device__` code.
* `cuda::ptx`: C++ convenience wrappers for inline PTX (only usable in `__device__` code).

```cuda
// Standard C++, __host__ only.
@@ -62,52 +83,18 @@ cuda::std::atomic<int> x;
cuda::atomic<int, cuda::thread_scope_block> x;
```

## libcu++ is Heterogeneous

The NVIDIA C++ Standard Library works across your entire codebase, both in and
across host and device code.
libcu++ is a C++ Standard Library for your entire system, not just your CPU or
GPU.
Everything in `cuda::` is `__host__ __device__`.

libcu++ facilities are designed to be passed between host and device code.
Unless otherwise noted, any libcu++ object which is copyable or movable can be
copied or moved between host and device code.

Synchronization objects work across host and device code, and can be used to
synchronize between host and device threads.
However, there are some restrictions to be aware of; please see the
[synchronization primitives section] for more details.

### `cuda::device::`

A small number of libcu++ facilities only work in device code, usually because
there is no sensible implementation in host code.

Such facilities live in `cuda::device::`.

## libcu++ is Incremental

The NVIDIA C++ Standard Library delivers a high-priority subset of the
C++ Standard Library today, and each release increases the feature set.
But it is a subset; not everything is available today.
The [Standard API section] lists the facilities available and the releases they
were first introduced in.

## Licensing

The NVIDIA C++ Standard Library is an open source project developed on [GitHub].
libcu++ is an open source project developed on [GitHub].
It is NVIDIA's variant of [LLVM's libc++].
libcu++ is distributed under the [Apache License v2.0 with LLVM Exceptions].

## Conformance

The NVIDIA C++ Standard Library aims to be a conforming implementation of the
C++ Standard, [ISO/IEC IS 14882], Clauses 16 through 32.
libcu++ aims to be a conforming implementation of the C++ Standard, [ISO/IEC IS 14882], Clauses 16 through 32.

## ABI Evolution

The NVIDIA C++ Standard Library does not maintain long-term ABI stability.
libcu++ does not maintain long-term ABI stability.
Promising long-term ABI stability would prevent us from fixing mistakes and
providing best in class performance.
So, we make no such promises.
@@ -123,17 +110,13 @@ We recommend that you always recompile your code and dependencies with the


[GitHub]: https://github.com/nvidia/libcudacxx

[Standard API section]: standard_api.md
[Extended API section]: extended_api.md
[synchronization primitives section]: extended_api/synchronization_primitives.md
[versioning section]: releases/versioning.md

[documentation]: https://nvidia.github.io/libcudacxx

[LLVM's libc++]: https://libcxx.llvm.org
[Apache License v2.0 with LLVM Exceptions]: https://llvm.org/LICENSE.txt

[ISO/IEC IS 14882]: https://eel.is/c++draft

[live at head]: https://www.youtube.com/watch?v=tISy7EJQPzI&t=1032s
