Clean up libcu++ docs landing page (#1492)

* Clean up main libcu++ docs.
NVIDIA · Mar 13, 2024 · 0ff0f61
1 parent ae0ee04
Showing 1 changed file with 57 additions and 74 deletions.
diff --git a/libcudacxx/docs/overview.md b/libcudacxx/docs/overview.md
@@ -1,50 +1,71 @@
-# libcu++: The C++ Standard Library for Your Entire System
+# libcu++
 
-<table><tr>
-<th><b><a href="https://github.com/nvidia/libcudacxx/tree/main/examples">Examples</a></b></th>
-<th><b><a href="https://godbolt.org/z/Kns9vhPEr">Godbolt</a></b></th>
-<th><b><a href="https://nvidia.github.io/libcudacxx">Documentation</a></b></th>
-</tr></table>
+`libcu++` (`libcudacxx`) provides fundamental, idiomatic C++ abstractions that aim to make the lives of CUDA C++ developers easier.
 
-**libcu++, the NVIDIA C++ Standard Library, is the C++ Standard Library for
-  your entire system.**
-It provides a heterogeneous implementation of the C++ Standard Library that can
-  be used in and between CPU and GPU code.
+Specifically, `libcu++` provides:
+- C++ Standard Library features usable in both host and device code
+- Extensions to C++ Standard Library features
+- Fundamental, CUDA-specific programming model abstractions
 
-If you know how to use your C++ Standard Library, then you know how to use
-  libcu++.
-All you have to do is add `cuda/std/` to the start of your Standard Library
-  includes and `cuda::` before any uses of `std::`:
+## C++ Standard Library Features
+
+If you are a C++ developer, then you know the C++ Standard Library ([sometimes referred to as "The STL"](https://stackoverflow.com/questions/5205491/whats-the-difference-between-stl-and-c-standard-library)) as what comes along with your compiler and provides things like `std::string` or `std::vector` or `std::atomic`.
+It provides the fundamental abstractions that C++ developers need to build high quality applications and libraries.
+
+By default, these abstractions aren't available when writing CUDA C++ device code because they don't have the necessary `__host__ __device__` decorators, and their implementation may not be suitable for use in and across host and device code.
+
+libcu++ aims to solve this problem by providing an opt-in, incremental, heterogeneous implementation of C++ Standard Library features:
+1. **Opt-in**: It does not replace the Standard Library provided by your host compiler (that is, anything in `std::`).
+2. **Incremental**: It does not provide a complete C++ Standard Library implementation.
+3. **Heterogeneous**: It works in both host and device code, and its objects can be passed between host and device code.
+
+If you know how to use things like the `<atomic>` or `<type_traits>` headers from the C++ Standard Library, then you know how to use libcu++.
+
+All you have to do is add `cuda/std/` to the start of your includes and `cuda::` before any uses of `std::`:
 
 ```cuda
 #include <cuda/std/atomic>
 cuda::std::atomic<int> x;
 ```
 
-The NVIDIA C++ Standard Library is an open source project; it is available on
-  [GitHub] and included in the NVIDIA HPC SDK and CUDA Toolkit.
-If you have one of those SDKs installed, no additional installation or compiler
-  flags are needed to use libcu++.
+> [!NOTE]
+> libcu++ does not provide its own documentation for Standard Library features. 
+> Instead, libcu++ [documents which Standard Library headers](https://nvidia.github.io/cccl/libcudacxx/standard_api.html) are made available, and defers documentation of individual features within those headers to other sources like [cppreference](https://en.cppreference.com/w/).
+
+## C++ Standard Library Extensions
+
+libcu++ provides CUDA C++ developers with familiar Standard Library utilities to improve productivity and flatten the CUDA learning curve.
+However, there are many aspects of writing high-performance CUDA C++ code that cannot be expressed through purely Standard conforming APIs.
+For these cases, libcu++ also provides _extensions_ of Standard Library utilities. 
+
+For example, libcu++ extends `atomic<T>` and other synchronization primitives with the notion of a "thread scope" that controls the strength of the memory fence.
+
+To use utilities that are extensions to Standard Library features, drop the `std`:
+```cuda
+#include <cuda/atomic>
+cuda::atomic<int, cuda::thread_scope_device> x;
+```
+
+See the [Extended API](extended_api.md) section for more information. 
+
+## Fundamental CUDA-specific Abstractions
+
+Some abstractions that libcu++ provides have no equivalent in the C++ Standard Library, but are nonetheless fundamental to the CUDA C++ programming model.
+
+For example, [`cuda::memcpy_async`](extended_api/asynchronous_operations/memcpy_async.md) is a vital abstraction for asynchronous data movement between global and shared memory.
+This abstracts hardware features such as `LDGSTS` on Ampere, and the Tensor Memory Accelerator (TMA) on Hopper. 
 
-## `cuda::` and `cuda::std::`
+See the [Extended API](extended_api.md) section for more information. 
 
-When used with NVCC, NVIDIA C++ Standard Library facilities live in their own
-  header hierarchy and namespace with the same structure as, but distinct from,
-  the host compiler's Standard Library:
+## Summary: `std::`, `cuda::` and `cuda::std::`
 
-* `std::`/`<*>`: When using NVCC, this is your host compiler's Standard Library
-      that works in `__host__` code only, although you can use the
-      `--expt-relaxed-constexpr` flag to use any `constexpr` functions in
-      `__device__` code.
-    With NVCC, libcu++ does not replace or interfere with host compiler's
-      Standard Library.
+* `std::`/`<*>`: This is your host compiler's Standard Library that works in `__host__` code only, although you can use the `--expt-relaxed-constexpr` flag to use any `constexpr` functions in `__device__` code. libcu++ does not replace or interfere with the host compiler's Standard Library.
 * `cuda::std::`/`<cuda/std/*>`: Strictly conforming implementations of
       facilities from the Standard Library that work in `__host__ __device__`
       code.
-* `cuda::`/`<cuda/*>`: Conforming extensions to the Standard Library that
-      work in `__host__ __device__` code.
-* `cuda::device`/`<cuda/device/*>`: Conforming extensions to the Standard
-      Library that work only in `__device__` code.
+* `cuda::`/`<cuda/*>`: Conforming extensions to the Standard Library that work in `__host__ __device__` code.
+* `cuda::device`/`<cuda/device/*>`: Conforming extensions to the Standard Library that work only in `__device__` code.
+* `cuda::ptx`: C++ convenience wrappers for inline PTX (only usable in `__device__` code). 
 
 ```cuda
 // Standard C++, __host__ only.
@@ -62,52 +83,18 @@ cuda::std::atomic<int> x;
 cuda::atomic<int, cuda::thread_scope_block> x;
 ```
 
-## libcu++ is Heterogeneous
-
-The NVIDIA C++ Standard Library works across your entire codebase, both in and
-  across host and device code.
-libcu++ is a C++ Standard Library for your entire system, not just your CPU or
-  GPU.
-Everything in `cuda::` is `__host__ __device__`.
-
-libcu++ facilities are designed to be passed between host and device code.
-Unless otherwise noted, any libcu++ object which is copyable or movable can be
-  copied or moved between host and device code.
-
-Synchronization objects work across host and device code, and can be used to
-  synchronize between host and device threads.
-However, there are some restrictions to be aware of; please see the
-  [synchronization primitives section] for more details.
-
-### `cuda::device::`
-
-A small number of libcu++ facilities only work in device code, usually because
-  there is no sensible implementation in host code.
-
-Such facilities live in `cuda::device::`.
-
-## libcu++ is Incremental
-
-Today, the NVIDIA C++ Standard Library delivers a high-priority subset of the
-  C++ Standard Library today, and each release increases the feature set.
-But it is a subset; not everything is available today.
-The [Standard API section] lists the facilities available and the releases they
-  were first introduced in.
-
 ## Licensing
-
-The NVIDIA C++ Standard Library is an open source project developed on [GitHub].
+libcu++ is an open source project developed on [GitHub].
 It is NVIDIA's variant of [LLVM's libc++].
 libcu++ is distributed under the [Apache License v2.0 with LLVM Exceptions].
 
 ## Conformance
 
-The NVIDIA C++ Standard Library aims to be a conforming implementation of the
-  C++ Standard, [ISO/IEC IS 14882], Clause 16 through 32.
+libcu++ aims to be a conforming implementation of the C++ Standard, [ISO/IEC IS 14882], Clause 16 through 32.
 
 ## ABI Evolution
 
-The NVIDIA C++ Standard Library does not maintain long-term ABI stability.
+libcu++ does not maintain long-term ABI stability.
 Promising long-term ABI stability would prevent us from fixing mistakes and
   providing best in class performance.
 So, we make no such promises.
@@ -123,17 +110,13 @@ We recommend that you always recompile your code and dependencies with the
 
 
 [GitHub]: https://github.com/nvidia/libcudacxx
-
 [Standard API section]: standard_api.md
+[Extended API section]: extended_api.md
 [synchronization primitives section]: extended_api/synchronization_primitives.md
 [versioning section]: releases/versioning.md
-
 [documentation]: https://nvidia.github.io/libcudacxx
-
 [LLVM's libc++]: https://libcxx.llvm.org
 [Apache License v2.0 with LLVM Exceptions]: https://llvm.org/LICENSE.txt
-
 [ISO/IEC IS 14882]: https://eel.is/c++draft
-
 [live at head]: https://www.youtube.com/watch?v=tISy7EJQPzI&t=1032s
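
The new overview text describes libcu++ as opt-in: it sits alongside the host compiler's `std::` rather than replacing it. A minimal sketch of what that can look like in a source file shared between NVCC and a plain host compiler is shown below. It assumes that `__CUDACC__` is defined whenever NVCC compiles the file and that `<cuda/std/atomic>` is then on the include path, as it would be with a CUDA Toolkit install. The alias `stdx`, the macro `HD`, and the function `record_event` are hypothetical names chosen for this illustration and are not part of libcu++.

```cpp
// Sketch only: select the atomic implementation at compile time.
// Assumption: NVCC defines __CUDACC__ and can find <cuda/std/atomic>.
#if defined(__CUDACC__)
  #include <cuda/std/atomic>
  namespace stdx = cuda::std;      // hypothetical alias, not a libcu++ name
  #define HD __host__ __device__   // hypothetical helper macro
#else
  #include <atomic>
  namespace stdx = std;            // host-only build: plain Standard Library
  #define HD
#endif

// Increments the counter and returns its new value.
// Callable from device code only when built with NVCC against cuda::std.
HD inline int record_event(stdx::atomic<int>& counter) {
  // fetch_add returns the value held before the increment.
  return counter.fetch_add(1, stdx::memory_order_relaxed) + 1;
}
```

On a host-only build this compiles against `<atomic>` unchanged, and under NVCC the same function body uses `cuda::std::atomic`, following the "add `cuda/std/` and `cuda::`" rule from the overview. Calling `record_event` twice on a counter initialized to zero leaves the counter at 2.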