Merge branch 'main' into feat/selectors-by-datetime
MarcoGorelli authored Feb 8, 2025
2 parents 610d553 + 22274d8 commit 242d2fa
Showing 12 changed files with 171 additions and 71 deletions.
5 changes: 5 additions & 0 deletions .github/workflows/pytest.yml
@@ -65,7 +65,12 @@ jobs:
matrix:
python-version: ["3.11", "3.13"]
os: [ubuntu-latest]
include:
- python-version: "3.11"
polars_streaming: true
runs-on: ${{ matrix.os }}
env:
NARWHALS_POLARS_NEW_STREAMING: ${{ matrix.polars_streaming == true }}
steps:
- uses: actions/checkout@v4
- uses: actions/setup-python@v5
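
A note on how the new `NARWHALS_POLARS_NEW_STREAMING` variable can be consumed: GitHub Actions renders the boolean expression `matrix.polars_streaming == true` as the string `true` or `false`, so whatever reads it has to compare strings. The sketch below is an illustration under that assumption only; `polars_new_streaming_enabled` is a hypothetical helper name and is not taken from this commit.

```python
import os
from typing import Mapping, Optional


def polars_new_streaming_enabled(environ: Optional[Mapping[str, str]] = None) -> bool:
    """Return True when the CI job asked for the new Polars streaming engine.

    Hypothetical helper: the variable name comes from the workflow above,
    the function itself is an assumption made for illustration.
    """
    env = os.environ if environ is None else environ
    # GitHub Actions passes booleans to `env:` as the strings "true"/"false".
    return env.get("NARWHALS_POLARS_NEW_STREAMING", "false").lower() == "true"
```

Accepting an optional mapping keeps the helper easy to exercise in tests without touching the real process environment.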
98 changes: 68 additions & 30 deletions CONTRIBUTING.md
@@ -3,6 +3,7 @@
Thank you for your interest in contributing to Narwhals! Any kind of improvement is welcome!

## Local development vs Codespaces

You can contribute to Narwhals in your local development environment, using python3, git and your editor of choice.
You can also contribute to Narwhals using [Github Codespaces](https://docs.github.com/en/codespaces/overview) - a development environment that's hosted in the cloud.
This way you can easily start to work from your browser without installing git and cloning the repo.
@@ -16,7 +17,7 @@ Open your terminal and run the following command:

```bash
git --version
```
```

If the output looks like `git version 2.34.1` and you have a personal account on GitHub - you're good to go to the next step.
If the terminal output informs about `command not found` you need to [install git](https://docs.github.com/en/get-started/quickstart/set-up-git).
@@ -40,7 +41,7 @@ Open a terminal, choose the directory where you would like to have Narwhals repo

```bash
git clone <url you just copied>
```
```

for example:

@@ -54,22 +55,21 @@ You should then navigate to the folder you just created:
cd narwhals-dev
```


### 4. Add the `upstream` remote and fetch from it

```bash
git remote add upstream [email protected]:narwhals-dev/narwhals.git
git fetch upstream
```
```

Check to see the remote has been added with `git remote -v`, you should see something like this:

```bash
git remote -v
origin [email protected]:YOUR-GITHUB-USERNAME/narwhals.git (fetch)
origin [email protected]:YOUR-GITHUB-USERNAME/narwhals.git (push)
upstream [email protected]:narwhals-dev/narwhals.git (fetch)
upstream [email protected]:narwhals-dev/narwhals.git (push)
origin [email protected]:YOUR-GITHUB-USERNAME/narwhals.git (fetch)
origin [email protected]:YOUR-GITHUB-USERNAME/narwhals.git (push)
upstream [email protected]:narwhals-dev/narwhals.git (fetch)
upstream [email protected]:narwhals-dev/narwhals.git (push)
```

where `YOUR-GITHUB-USERNAME` will be your GitHub user name.
@@ -86,35 +86,47 @@ If you want to run PySpark-related tests, you'll need to have Java installed. Re

1. Make sure you have Python3.12 installed, create a virtual environment,
and activate it. If you're new to this, here's one way that we recommend:
1. Install uv: https://github.com/astral-sh/uv?tab=readme-ov-file#getting-started
1. Install uv (see [uv getting started](https://github.com/astral-sh/uv?tab=readme-ov-file#getting-started))
or make sure it is up-to-date with:
```

```terminal
uv self update
```
2. Install Python3.12:
```
```terminal
uv python install 3.12
```
3. Create a virtual environment:
```
```terminal
uv venv -p 3.12 --seed
```
4. Activate it. On Linux, this is `. .venv/bin/activate`, on Windows `.\.venv\Scripts\activate`.
2. Install Narwhals: `uv pip install -e ".[dev, core, docs]"`. This will include fast-ish core libraries.
If you also want to test other libraries like Dask , PySpark, and Modin, you can install them too with
`uv pip install -e ".[dev, core, docs, dask, pyspark, modin]"`.
3. Install a fork of griffe:
```
```terminal
uv pip install git+https://github.com/MarcoGorelli/griffe.git@no-overloads
```
This is hopefully temporary until https://github.com/mkdocstrings/mkdocstrings/issues/716

This is hopefully temporary until [mkdocstrings#716](https://github.com/mkdocstrings/mkdocstrings/issues/716)
is addressed.

You should also install pre-commit:
```

```terminal
uv pip install pre-commit
pre-commit install
```

This will automatically format and lint your code before each commit, and it will block the commit if any issues are found.

#### Option 2: use python3-venv
@@ -165,42 +177,69 @@ run them by passing the `--runslow` flag to PyTest.
To keep local development test times down, Dask and Modin are excluded from dev
dependencies, and their tests only run in CI. If you install them with

```
```terminal
uv pip install -U dask[dataframe] modin
```

then their tests will run too.

#### Testing cuDF

We can't currently test in CI against cuDF, but you can test it manually in Kaggle using GPUs. Please follow this [Kaggle notebook](https://www.kaggle.com/code/marcogorelli/testing-cudf-in-narwhals) to run the tests.

### 8. Building docs
### 8. Writing the doc(strings)

If you are adding a new feature or changing an existing one, you should also update the documentation and the docstrings
to reflect the changes.

Writing the docstring in Narwhals is not an exact science, but we have some high level guidelines (if in doubt just ask us in the PR):

- The examples should be clear and to the point.
- The examples should import _one_ dataframe library, create a dataframe and exemplify the Narwhals functionality.
- We strive for balancing the use of different backend across all our docstrings examples.
- There are exceptions to the above rules!

Here an example of a docstring:

```python
>>> import pyarrow as pa
>>> import narwhals as nw
>>> df_native = pa.table({"foo": [1, 2], "bar": [6.0, 7.0]})
>>> df = nw.from_native(df_native)
>>> df.estimated_size()
32
```

Full discussion at [narwhals#1943](https://github.com/narwhals-dev/narwhals/issues/1943).

### 9. Building the docs

To build the docs, run `mkdocs serve`, and then open the link provided in a browser.
The docs should refresh when you make changes. If they don't, press `ctrl+C`, and then
do `mkdocs build` and then `mkdocs serve`.

### 9. Pull requests
### 10. Pull requests

When you have resolved your issue, [open a pull request](https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/proposing-changes-to-your-work-with-pull-requests/creating-a-pull-request-from-a-fork) in the Narwhals repository.

Please adhere to the following guidelines:

1. Start your pull request title with a [conventional commit](https://www.conventionalcommits.org/) tag. This helps us add your contribution to the right section of the changelog. We use "Type" from the [Angular convention](https://github.com/angular/angular/blob/22b96b9/CONTRIBUTING.md#type).<br>
TLDR:
The PR title should start with any of these abbreviations: `build`, `chore`, `ci`, `depr`,
`docs`, `feat`, `fix`, `perf`, `refactor`, `release`, `test`. Add a `!`at the end, if it is a breaking change. For example `refactor!`.
<br>
1. Start your pull request title with a [conventional commit](https://www.conventionalcommits.org/) tag. This helps us add your contribution to the right section of the changelog. We use "Type" from the [Angular convention](https://github.com/angular/angular/blob/22b96b9/CONTRIBUTING.md#type).

**TLDR**: The PR title should start with any of these abbreviations:
`build`, `chore`, `ci`, `depr`, `docs`, `feat`, `fix`, `perf`, `refactor`, `release`, `test`.
Add a `!`at the end, if it is a breaking change. For example `refactor!`.

2. This text will end up in the [changelog](https://github.com/narwhals-dev/narwhals/releases).
3. Please follow the instructions in the pull request form and submit.
3. Please follow the instructions in the pull request form and submit.
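
As a reading aid for the title rule in guideline 1, here is a minimal sketch of a checker. The list of types is copied from the text above; the optional `(scope)` part and the `: ` separator are assumptions borrowed from the Conventional Commits format, and `parse_pr_title` is a hypothetical helper, not part of Narwhals or its CI.

```python
import re
from typing import Optional, Tuple

# Type prefixes listed in the contributing guide.
ALLOWED_TYPES = (
    "build", "chore", "ci", "depr", "docs", "feat",
    "fix", "perf", "refactor", "release", "test",
)

# Assumption: "type", optional "(scope)", optional "!", then ": " and a description.
TITLE_RE = re.compile(
    rf"^(?P<type>{'|'.join(ALLOWED_TYPES)})(?:\([^)]+\))?(?P<breaking>!)?: \S"
)


def parse_pr_title(title: str) -> Optional[Tuple[str, bool]]:
    """Return (type, is_breaking) for a well-formed title, otherwise None."""
    match = TITLE_RE.match(title)
    if match is None:
        return None
    return match.group("type"), match.group("breaking") is not None
```

For instance, `parse_pr_title("refactor!: simplify expression parsing")` reports the `refactor` type with the breaking flag set, while a title without a listed prefix yields `None`.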

## Working with Codespaces

Codespaces is a great way to work on Narwhals without the need of configuring your local development environment.
Every GitHub.com user has a monthly quota of free use of GitHub Codespaces, and you can start working in a codespace without providing any payment details.
You'll be informed per email if you'll be close to using 100% of included services.
To learn more about it visit [GitHub Docs](https://docs.github.com/en/codespaces/overview)


### 1. Make sure you have GitHub account

If you're new to GitHub, you'll need to create an account on [GitHub.com](https://github.com/) and verify your email address.
@@ -210,16 +249,15 @@ If you're new to GitHub, you'll need to create an account on [GitHub.com](https:
Go to the [main project page](https://github.com/narwhals-dev/narwhals).
Fork the repository by clicking on the fork button. You can find it in the right corner on the top of the page.

### 3. Create codespace
### 3. Create codespace

Go to the forked repository on your GitHub account - you'll find it on your account in the tab Repositories.
Go to the forked repository on your GitHub account - you'll find it on your account in the tab Repositories.
Click on the green `Code` button and navigate to the `Codespaces` tab.
Click on the green button `Create codespace on main` - it will open a browser version of VSCode,
with the complete repository and git installed.
You can now proceed with the steps [4. Setting up your environment](#4-setting-up-your-environment) up to [8. Pull request](#8-pull-requests)
with the complete repository and git installed.
You can now proceed with the steps [5. Setting up your environment](#5-setting-up-your-environment) up to [10. Pull request](#10-pull-requests)
listed above in [Working with local development environment](#working-with-local-development-environment).


## How it works

If Narwhals looks like underwater unicorn magic to you, then please read
32 changes: 20 additions & 12 deletions narwhals/_expression_parsing.py
@@ -9,13 +9,14 @@
from typing import Sequence
from typing import TypeVar
from typing import Union
from typing import cast
from typing import overload

from narwhals.dependencies import is_numpy_array
from narwhals.exceptions import InvalidIntoExprError
from narwhals.exceptions import LengthChangingExprError
from narwhals.utils import Implementation
from narwhals.utils import is_compliant_expr
from narwhals.utils import is_compliant_series

if TYPE_CHECKING:
from narwhals._arrow.expr import ArrowExpr
@@ -82,14 +83,21 @@ def evaluate_into_exprs(
return series


@overload
def maybe_evaluate_expr(
df: CompliantDataFrame, expr: CompliantExpr[CompliantSeriesT_co]
) -> Sequence[CompliantSeriesT_co]: ...


@overload
def maybe_evaluate_expr(df: CompliantDataFrame, expr: T) -> T: ...


def maybe_evaluate_expr(
df: CompliantDataFrame, expr: CompliantExpr[CompliantSeriesT_co] | T
) -> Sequence[CompliantSeriesT_co] | T:
"""Evaluate `expr` if it's an expression, otherwise return it as is."""
if hasattr(expr, "__narwhals_expr__"):
compliant_expr = cast("CompliantExpr[Any]", expr)
return compliant_expr(df)
return expr
return expr(df) if is_compliant_expr(expr) else expr


def parse_into_exprs(
@@ -123,13 +131,13 @@ def parse_into_expr(
- if it's a string, then convert it to an expression
- else, raise
"""
if hasattr(into_expr, "__narwhals_expr__"):
return into_expr # type: ignore[return-value]
if hasattr(into_expr, "__narwhals_series__"):
if is_compliant_expr(into_expr):
return into_expr
if is_compliant_series(into_expr):
return namespace._create_expr_from_series(into_expr) # type: ignore[no-any-return, attr-defined]
if is_numpy_array(into_expr):
series = namespace._create_compliant_series(into_expr) # type: ignore[attr-defined]
return namespace._create_expr_from_series(series) # type: ignore[no-any-return, attr-defined]
series = namespace._create_compliant_series(into_expr)
return namespace._create_expr_from_series(series)
raise InvalidIntoExprError.from_invalid_type(type(into_expr))


@@ -177,7 +185,7 @@ def reuse_series_implementation(
plx = expr.__narwhals_namespace__()

def func(df: CompliantDataFrame) -> Sequence[CompliantSeries]:
_kwargs = { # type: ignore[var-annotated]
_kwargs = {
arg_name: maybe_evaluate_expr(df, arg_value)
for arg_name, arg_value in expressifiable_args.items()
}
@@ -284,7 +292,7 @@ def combine_evaluate_output_names(
def evaluate_output_names(
df: CompliantDataFrame | CompliantLazyFrame,
) -> Sequence[str]:
if not hasattr(exprs[0], "__narwhals_expr__"): # pragma: no cover
if not is_compliant_expr(exprs[0]): # pragma: no cover
msg = f"Safety assertion failed, expected expression, got: {type(exprs[0])}. Please report a bug."
raise AssertionError(msg)
return exprs[0]._evaluate_output_names(df)[:1]
11 changes: 8 additions & 3 deletions narwhals/dataframe.py
@@ -861,16 +861,19 @@ def estimated_size(self: Self, unit: SizeUnit = "b") -> int | float:

@overload
def __getitem__( # type: ignore[overload-overlap]
self: Self, key: str | tuple[slice | Sequence[int] | np.ndarray, int | str]
self: Self,
item: str | tuple[slice | Sequence[int] | np.ndarray, int | str],
) -> Series[Any]: ...

@overload
def __getitem__(
self: Self,
key: (
slice
item: (
int
| slice
| Sequence[int]
| Sequence[str]
| np.ndarray
| tuple[
slice | Sequence[int] | np.ndarray, slice | Sequence[int] | Sequence[str]
]
@@ -880,9 +883,11 @@ def __getitem__(
self: Self,
item: (
str
| int
| slice
| Sequence[int]
| Sequence[str]
| np.ndarray
| tuple[slice | Sequence[int] | np.ndarray, int | str]
| tuple[
slice | Sequence[int] | np.ndarray, slice | Sequence[int] | Sequence[str]
2 changes: 1 addition & 1 deletion narwhals/series.py
@@ -72,7 +72,7 @@ def __init__(
*,
level: Literal["full", "lazy", "interchange"],
) -> None:
self._level = level
self._level: Literal["full", "lazy", "interchange"] = level
if hasattr(series, "__narwhals_series__"):
self._compliant_series = series.__narwhals_series__()
else: # pragma: no cover
9 changes: 6 additions & 3 deletions narwhals/stable/v1/__init__.py
@@ -133,13 +133,16 @@ def _lazyframe(self: Self) -> type[LazyFrame[Any]]:

@overload
def __getitem__( # type: ignore[overload-overlap]
self: Self, key: str | tuple[slice | Sequence[int] | np.ndarray, int | str]
self: Self,
item: str | tuple[slice | Sequence[int] | np.ndarray, int | str],
) -> Series: ...
@overload
def __getitem__(
self: Self,
key: (
slice
item: (
int
| slice
| np.ndarray
| Sequence[int]
| Sequence[str]
| tuple[