Configurable plotting backends #20885

MarcoGorelli · 2025-01-24T08:13:53Z

Polars has a plot namespace which makes it convenient to create simple plots: https://docs.pola.rs/api/python/dev/reference/dataframe/plot.html. It currently uses Altair.

Several users have said that for some use cases they prefer other libraries. Plotly in particular comes up quite a lot. The good news is that when Plotly v6.0.0 comes out, it will have native support for Polars, meaning that plotly.express will be able to plot Polars dataframes without converting to other dataframe libraries and without other dataframe libraries being required. The perf increase is often ~3x, but even >10x for some plots, especially those which involve grouping over multiple dimensions, see https://plotly.com/blog/chart-smarter-not-harder-universal-dataframe-support/ for some results.

It could therefore be nice to make .plot configurable, so that users can do:

df.plot(backend='plotly').line(x='a', y='b')
df.plot(backend='altair').line(x='a', y='b')
df.plot.line(x='a', y='b')  # defaults to Altair
pl.Config.set_plotting_backend('plotly')
df.plot.line(x='a', y='b')  # now, it uses Plotly (only supported for plotly v6.0.0+)

A few principles to abide by, to make sure this doesn't get out of hand:

This shouldn't do anything clever, and should just be an entrypoint to libraries which specialise in plotting
The arguments in polars.DataFrame.plot.line should be some simple dimensions which all plotting backends can respect (e.g. x, y, color, ...). Anything else should be backend-specific
Configuration is left to the plotting backends. In Altair this can often be done with .properties or .configure_*; in Plotly there are various update_* methods. Links to their respective docs can be provided, but Polars should make no attempt to standardise on these
Any library which is added as a plotting backend should support Polars directly, rather than "if we receive a dataframe which isn't pandas then we convert to pandas and do all transformations in pandas". In 2025 at least, I think it's OK to set that as the minimum bar 😄
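
One way to picture these principles is as a thin dispatcher: a default backend held in configuration, an optional per-call override, and an entry point that only names the shared arguments and forwards everything else untouched. The sketch below is a standalone illustration, not Polars code. `PlotNamespace`, `set_plotting_backend`, `_BACKENDS` and the two stand-in backend functions are all hypothetical names, and the backends return plain dicts rather than real charts so that the example runs without Altair or Plotly installed.

```python
from typing import Any, Callable, Optional


# Hypothetical stand-ins for real backends: each just records how it was called.
def _altair_line(data: dict, x: str, y: str, **kwargs: Any) -> dict:
    return {"backend": "altair", "x": x, "y": y, "extra": kwargs}


def _plotly_line(data: dict, x: str, y: str, **kwargs: Any) -> dict:
    return {"backend": "plotly", "x": x, "y": y, "extra": kwargs}


# Registry: backend name -> plot kind -> implementation.
_BACKENDS: dict = {
    "altair": {"line": _altair_line},
    "plotly": {"line": _plotly_line},
}

_default_backend = "altair"  # mirrors "defaults to Altair" in the proposal


def set_plotting_backend(name: str) -> None:
    """Hypothetical analogue of the proposed pl.Config.set_plotting_backend."""
    global _default_backend
    if name not in _BACKENDS:
        raise ValueError(f"unknown plotting backend: {name!r}")
    _default_backend = name


class PlotNamespace:
    """Thin entry point: picks a backend and forwards the call unchanged."""

    def __init__(self, data: dict, backend: Optional[str] = None) -> None:
        self._data = data
        self._backend = backend

    def __call__(self, backend: str) -> "PlotNamespace":
        # Supports the df.plot(backend=...) spelling from the proposal.
        if backend not in _BACKENDS:
            raise ValueError(f"unknown plotting backend: {backend!r}")
        return PlotNamespace(self._data, backend)

    def line(self, x: str, y: str, **kwargs: Any) -> Any:
        # Only x and y are named here; other kwargs are backend-specific.
        name = self._backend or _default_backend
        impl: Callable[..., Any] = _BACKENDS[name]["line"]
        return impl(self._data, x=x, y=y, **kwargs)


plot = PlotNamespace({"a": [1, 2], "b": [3, 4]})
default_result = plot.line(x="a", y="b")
override_result = plot(backend="plotly").line(x="a", y="b", log_y=True)
```

Note that `line` has to be annotated as returning `Any` here, because the backend is only known at runtime.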

I don't have a tonne of time now unfortunately, but if anyone wanted to implement this I'd make it a priority to review it. Else, I will get to it, just not in the immediate future

deanm0000 · 2025-01-24T11:18:29Z

I think static typing will be difficult to impossible unless, instead of all of them using plot, they each get their own, like df.px.line or df.alt.line. It doesn't matter if you just want the graph immediately, but if you ever need to chain some library-specific method then you need the static type checker to know the return type.

deanm0000 · 2025-01-24T11:20:36Z

I guess you'd have to put the backend argument in the final method rather than the namespace; then you could use overloads. But that is suboptimal, since you're just going to want to set it in config and not type it out everywhere

MarcoGorelli · 2025-01-24T11:24:12Z

🤔 that's a good point, typing the return type may be problematic in cases where the backend is set just in pl.Config (as opposed to as an argument in .plot)

they each get their own like df.px.line or df.alt.line

🤔 not sure about putting all these on polars.DataFrame, but I guess df.plot.px.line could work, if it's not deemed too long?

etrotta · 2025-01-24T11:35:34Z

I think static typing will be difficult to impossible unless, instead of all of them using plot, they each get their own, like df.px.line or df.alt.line. It doesn't matter if you just want the graph immediately, but if you ever need to chain some library-specific method then you need the static type checker to know the return type.

You could add an @overload for each backend, setting a string literal for that parameter, and maybe even use a TypedDict for the kwargs relevant to that backend, although that'll require specifying it for each plot() call instead of relying on the global default (...which feels reasonable if you want to enforce typing in the first place).

Example to demonstrate how typing could work for the interface

import typing
import plotly.express as px
import matplotlib.pyplot as plt
from matplotlib.figure import Figure as MatplotlibFigure
from plotly.graph_objects import Figure as PlotlyFigure

class MatplotlibArguments(typing.TypedDict, total=False):
    alpha: float

class PlotlyArguments(typing.TypedDict, total=False):
    log_x: bool
    log_y: bool

@typing.overload
def plot(backend: typing.Literal["matplotlib"], x: list[int], y: list[int], **kwargs: typing.Unpack[MatplotlibArguments]) -> MatplotlibFigure:
    ...

@typing.overload
def plot(backend: typing.Literal["plotly"], x: list[int], y: list[int], **kwargs: typing.Unpack[PlotlyArguments]) -> PlotlyFigure:
    ...

def plot(backend: str, x: list[int], y: list[int], **kwargs: typing.Any) -> PlotlyFigure | MatplotlibFigure:
    if backend == "matplotlib":
        # plt.scatter takes no `ax` argument, so create the Axes and draw on it
        fig, ax = plt.subplots()
        ax.scatter(x, y, **kwargs)
        return fig
    elif backend == "plotly":
        fig = px.scatter(x=x, y=y, **kwargs)
        return fig
    else:
        raise ValueError(f"unsupported backend: {backend!r}")

plot("matplotlib", x=[1, 2], y=[3, 4], alpha=0.5)
plot("plotly", x=[1, 2], y=[30, 400], log_y=True)

MarcoGorelli · 2025-01-24T11:47:11Z

thanks @etrotta

although that'll require specifying it for each plot() call instead of relying on the global default (...which feels reasonable if you want to enforce typing in the first place).

Not totally sure about this, it would slightly detract from ergonomics if people have to do df.plot.px.line each time. Though arguably one compromise could be:

df.plot.line is typed to return Any - this is shorter to type, and maybe more useful for EDA
df.plot.px.line is typed, and useful for IDE work where people are more likely to tolerate 3 extra keystrokes in exchange for better typing
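
Roughly, that compromise could look like the sketch below. It is a standalone illustration rather than Polars code: `AltairChart` and `PlotlyFigure` are empty stand-in classes (not the real library types), and `PlotNamespace`, `PxNamespace` and `default_backend` are hypothetical names. The short `line` path is annotated as `Any`, so a type checker accepts any chained method; the `px.line` path declares a concrete return type, so chained calls get checked and autocompleted.

```python
from typing import Any


class AltairChart:
    """Stand-in for an Altair chart (not the real class)."""

    def encode(self, **channels: Any) -> "AltairChart":
        return self


class PlotlyFigure:
    """Stand-in for a Plotly figure (not the real class)."""

    def update_layout(self, **options: Any) -> "PlotlyFigure":
        return self


class PxNamespace:
    """Backend-specific namespace with a precise return type."""

    def line(self, x: str, y: str) -> PlotlyFigure:
        return PlotlyFigure()


class PlotNamespace:
    # Hypothetical stand-in for a value read from global configuration.
    default_backend = "altair"

    @property
    def px(self) -> PxNamespace:
        return PxNamespace()

    def line(self, x: str, y: str) -> Any:
        # The backend is chosen at runtime, so the static return type is Any.
        if self.default_backend == "plotly":
            return self.px.line(x, y)
        return AltairChart()


plot = PlotNamespace()
untyped = plot.line(x="a", y="b")   # static type: Any
typed = plot.px.line(x="a", y="b")  # static type: PlotlyFigure
```

With this shape, `plot.px.line(...).update_layout(...)` is visible to the type checker, while `plot.line(...).encode(...)` is accepted only because `Any` switches checking off.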

deanm0000 · 2025-01-24T17:19:34Z

As long as you're not chaining extra methods then df.plot.line->Any should be fine. It's only if you're doing something like

df.plot.line(...).encode(...) that the typing is important since that's specific to altair.

I really don't like typing the backend as a string for the overload, it just feels weird and is more typing than df.plot.px.line

It could be that df.plot.line returns Any but if you need the static typing (for any reason but in particular chaining methods) you could also do df.plot.XX.line where XX is shorthand for the backend.

kszlim · 2025-01-24T18:18:01Z

A potentially controversial take, but maybe exposing the plotting backends at a top level namespace could be fine.

df.altair.line(...).encode(...)
df.px.line(...)

deanm0000 · 2025-01-24T19:06:35Z

A potentially controversial take, but maybe exposing the plotting backends at a top level namespace could be fine.
df.altair.line(...).encode(...)
df.px.line(...)

that would be my preference too but I can see how it'd get too cluttered between plotly, altair, matplotlib, etc and so I assume that's a non-starter.

deanm0000 · 2025-01-25T23:27:26Z

This is more a brainstorm than a commitment to follow through, but I did this: #20904

MarcoGorelli added the A-api Area: changes to the public API label Jan 24, 2025

nameexhaustion added the python Related to Python Polars label Jan 24, 2025
