From d3dcc75c8f82d3ba2bcf1efed493ebbf02b2e6a1 Mon Sep 17 00:00:00 2001
From: Ashwin Srinath <3190405+shwina@users.noreply.github.com>
Date: Wed, 8 Nov 2023 12:15:57 -0500
Subject: [PATCH] Update README (#14374)
Authors:
- Ashwin Srinath (https://github.com/shwina)
Approvers:
- Bradley Dice (https://github.com/bdice)
- GALI PREM SAGAR (https://github.com/galipremsagar)
URL: https://github.com/rapidsai/cudf/pull/14374
---
README.md | 73 +++++++++++++++++++++++++++++--------------------------
1 file changed, 39 insertions(+), 34 deletions(-)
diff --git a/README.md b/README.md
index 5f2ce014dba..677cfc89d52 100644
--- a/README.md
+++ b/README.md
@@ -1,57 +1,62 @@
# cuDF - GPU DataFrames
-**NOTE:** For the latest stable [README.md](https://github.com/rapidsai/cudf/blob/main/README.md) ensure you are on the `main` branch.
+## 📢 cuDF can now be used as a no-code-change accelerator for pandas! To learn more, see [here](https://rapids.ai/cudf-pandas/)!
-## Resources
-
-- [cuDF Reference Documentation](https://docs.rapids.ai/api/cudf/stable/): Python API reference, tutorials, and topic guides.
-- [libcudf Reference Documentation](https://docs.rapids.ai/api/libcudf/stable/): C/C++ CUDA library API reference.
-- [Getting Started](https://rapids.ai/start.html): Instructions for installing cuDF.
-- [RAPIDS Community](https://rapids.ai/community.html): Get help, contribute, and collaborate.
-- [GitHub repository](https://github.com/rapidsai/cudf): Download the cuDF source code.
-- [Issue tracker](https://github.com/rapidsai/cudf/issues): Report issues or request features.
-
-## Overview
-
-Built based on the [Apache Arrow](http://arrow.apache.org/) columnar memory format, cuDF is a GPU DataFrame library for loading, joining, aggregating, filtering, and otherwise manipulating data.
+cuDF is a GPU DataFrame library for loading, joining, aggregating,
+filtering, and otherwise manipulating data. cuDF leverages
+[libcudf](https://docs.rapids.ai/api/libcudf/stable/), a
+blazing-fast C++/CUDA dataframe library, and the [Apache
+Arrow](https://arrow.apache.org/) columnar format to provide a
+GPU-accelerated pandas API.
-cuDF provides a pandas-like API that will be familiar to data engineers & data scientists, so they can use it to easily accelerate their workflows without going into the details of CUDA programming.
+You can import `cudf` directly and use it like `pandas`:
-For example, the following snippet downloads a CSV, then uses the GPU to parse it into rows and columns and run calculations:
```python
-import cudf, requests
+import cudf
+import requests
from io import StringIO
url = "https://github.com/plotly/datasets/raw/master/tips.csv"
-content = requests.get(url).content.decode('utf-8')
+content = requests.get(url).content.decode("utf-8")
tips_df = cudf.read_csv(StringIO(content))
-tips_df['tip_percentage'] = tips_df['tip'] / tips_df['total_bill'] * 100
+tips_df["tip_percentage"] = tips_df["tip"] / tips_df["total_bill"] * 100
# display average tip by dining party size
-print(tips_df.groupby('size').tip_percentage.mean())
+print(tips_df.groupby("size").tip_percentage.mean())
```
-Output:
-```
-size
-1 21.729201548727808
-2 16.571919173482897
-3 15.215685473711837
-4 14.594900639351332
-5 14.149548965142023
-6 15.622920072028379
-Name: tip_percentage, dtype: float64
-```
+Or, you can use cuDF as a no-code-change accelerator for pandas, using
+[`cudf.pandas`](https://docs.rapids.ai/api/cudf/stable/cudf_pandas).
+`cudf.pandas` supports 100% of the pandas API, utilizing cuDF for
+supported operations and falling back to pandas when needed:
-For additional examples, browse our complete [API documentation](https://docs.rapids.ai/api/cudf/stable/), or check out our more detailed [notebooks](https://github.com/rapidsai/notebooks-contrib).
+```python
+%load_ext cudf.pandas # pandas operations now use the GPU!
-## Quick Start
+import pandas as pd
+import requests
+from io import StringIO
-Please see the [Demo Docker Repository](https://hub.docker.com/r/rapidsai/rapidsai/), choosing a tag based on the NVIDIA CUDA version you're running. This provides a ready to run Docker container with example notebooks and data, showcasing how you can utilize cuDF.
+url = "https://github.com/plotly/datasets/raw/master/tips.csv"
+content = requests.get(url).content.decode("utf-8")
-## Installation
+tips_df = pd.read_csv(StringIO(content))
+tips_df["tip_percentage"] = tips_df["tip"] / tips_df["total_bill"] * 100
+# display average tip by dining party size
+print(tips_df.groupby("size").tip_percentage.mean())
+```
+
+## Resources
+
+- [Try cudf.pandas now](https://nvda.ws/rapids-cudf): Explore `cudf.pandas` on a free GPU-enabled instance on Google Colab!
+- [Install](https://rapids.ai/start.html): Instructions for installing cuDF and other [RAPIDS](https://rapids.ai) libraries.
+- [cudf (Python) documentation](https://docs.rapids.ai/api/cudf/stable/)
+- [libcudf (C++/CUDA) documentation](https://docs.rapids.ai/api/libcudf/stable/)
+- [RAPIDS Community](https://rapids.ai/community.html): Get help, contribute, and collaborate.
+
+## Installation
### CUDA/GPU requirements