jupyter output: Use mime types for mime bundles instead of publishing everything as text/html #6435

Open
flying-sheep opened this issue Nov 1, 2024 · 5 comments

@flying-sheep

flying-sheep commented Nov 1, 2024

Is your feature request related to a problem? Please describe.

I’d like to use multiple backends and/or multiple image formats per backend to create a notebook that displays nicely in different environments.

E.g. GitHub doesn’t show any previews for HoloViews-created plots, probably because it doesn’t support HTML output.

Describe the solution you'd like

  1. the display routines need to be changed so images are published as image mime types with metadata instead of HTML. I consider this part a bug. A routine called display_svg has to publish a mime bundle with the key image/svg+xml.

  2. there should be a way to have multiple backends active, e.g. matplotlib for static output (that renders on GitHub) and bokeh for interactive frontends.

  3. there should be a way to configure backends to render multiple outputs, which already happens for some. E.g. an interactive frontend could publish text/html for compatibility and application/vnd.vegalite.v4+json for frontends that support vegalite4 in a more integrated way.

    It would also be nice to configure e.g. the matplotlib backend to render both a PNG and an SVG, and leave it to the user’s configured mime preferences to pick which one to display.
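
As a rough sketch of what point 1 asks for, the snippet below uses IPython's `_repr_mimebundle_` hook to publish an SVG under its own mime type rather than wrapped in HTML. `SvgFigure` is a hypothetical stand-in for a rendered plot, not a HoloViews class, and the SVG string is made up for illustration.

```python
class SvgFigure:
    """Hypothetical stand-in for a rendered static plot (not a HoloViews class)."""

    def __init__(self, svg: str):
        self.svg = svg

    def _repr_mimebundle_(self, include=None, exclude=None):
        # IPython calls this hook when displaying the object. The first dict
        # becomes the output's "data", the second its "metadata".
        data = {
            "image/svg+xml": self.svg,    # the image, published as itself
            "text/plain": "<SvgFigure>",  # fallback for text-only frontends
        }
        metadata = {}
        return data, metadata


# Made-up SVG payload for illustration.
svg = (
    '<svg xmlns="http://www.w3.org/2000/svg" width="20" height="20">'
    '<circle cx="10" cy="10" r="8"/></svg>'
)
fig = SvgFigure(svg)
data, metadata = fig._repr_mimebundle_()
print(sorted(data))
```

In a notebook, leaving `fig` as the last expression of a cell would publish this bundle, and each frontend can then pick the representation it supports.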

Describe alternatives you've considered

Manually hacking together a solution (like pulling the image data out of the HTML payload or manually creating the image payload) is always possible, but cumbersome, and it shouldn’t be necessary.

Additional context

https://nbformat.readthedocs.io/en/latest/format_description.html#display-data

@droumis
Member

droumis commented Nov 1, 2024

@philippjfr, this is related to the scverse Zulip chat.

@philippjfr
Member

philippjfr commented Nov 14, 2024

Thanks @flying-sheep. We designed some of the mime type stuff a long time ago, so I'm using this as a chance to re-familiarize myself with the code. Before I describe the current design, I'll ask some clarifying questions and raise some concerns that cropped up when we first designed this.

I believe we were concerned about always including a bunch of mimetypes in the bundles because it seemed wasteful and expensive to render and then send outputs that wouldn't actually end up getting used. Particularly when rendering Bokeh plots, rendering PNG or SVG outputs is extremely slow and expensive, because it has to launch a headless browser and then orchestrate taking a screenshot with Selenium (or Playwright).
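
One way to picture that trade-off is a bundle that only calls a renderer when its mime type is enabled, so an expensive format costs nothing unless someone opts in. This is a minimal sketch under assumptions: `LazyBundle`, `enabled`, and `slow_png` are hypothetical names, not HoloViews APIs. The `include`/`exclude` arguments are the ones IPython passes to `_repr_mimebundle_`.

```python
from typing import Callable, Dict, Iterable


class LazyBundle:
    """Hypothetical helper: render a mime type only if it is enabled."""

    def __init__(self, renderers: Dict[str, Callable[[], str]], enabled: Iterable[str]):
        self.renderers = renderers
        self.enabled = set(enabled)
        self.calls = []  # records which renderers actually ran

    def _repr_mimebundle_(self, include=None, exclude=None):
        data = {}
        for mime, render in self.renderers.items():
            if mime not in self.enabled:
                continue  # format not opted into: never rendered
            if include and mime not in include:
                continue
            if exclude and mime in exclude:
                continue
            self.calls.append(mime)
            data[mime] = render()
        return data, {}


def slow_png() -> str:
    # Stand-in for an expensive export, e.g. a headless-browser screenshot.
    raise RuntimeError("expensive renderer should not run unless enabled")


bundle = LazyBundle(
    {"text/html": lambda: "<div>plot</div>", "image/png": slow_png},
    enabled=["text/html"],
)
data, _ = bundle._repr_mimebundle_()
print(data)
```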

> the display routines need to be changed so images are published as image mime types with metadata instead of HTML. I consider this part a bug. A routine called display_svg has to publish a mime bundle with the key image/svg+xml.

I think that should be possible for renderers that can generate SVGs and PNGs cheaply, but I think the reason we are publishing HTML here is for consistency and the ability to center the outputs. I could easily be convinced that centering the outputs was a silly decision and that we should just align with everything else. However, publishing raw SVG/PNG outputs by default still only makes sense when rendering static plots; as soon as you have a HoloMap or DynamicMap, it has to be HTML anyway.

> there should be a way to have multiple backends active, e.g. matplotlib for static output (that renders on GitHub) and bokeh for interactive frontends.

We've always wanted to support this eventually, but we've never been able to prioritize it in any real sense. The main problem is that each rendering backend has completely separate options, so if you configure the options for Bokeh, the Matplotlib output may look nothing like it because the options don't translate between the backends. As a general goal, this is something we should continue to strive towards.

> there should be a way to configure backends to render multiple outputs, which already happens for some. E.g. an interactive frontend could publish text/html for compatibility and application/vnd.vegalite.v4+json for frontends that support vegalite4 in a more integrated way.

Agreed. Currently we do something slightly weird: we publish the actual content as 'text/html' and also include an empty 'application/vnd.holoviews_exec.v0+json' value that our Jupyter mime renderers pick up on. This should probably be cleaned up.
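
To make the consequence concrete, here is a small sketch of how a frontend might choose from such a bundle. The bundle mirrors the two keys described above with placeholder content; `pick_mime` and both preference lists are assumptions for illustration, not the actual logic of any particular frontend.

```python
def pick_mime(bundle, preferences):
    """Return the first mime type in `preferences` that `bundle` provides, else None."""
    for mime in preferences:
        if mime in bundle:
            return mime
    return None


# Shape of the bundle described above; the HTML content is a placeholder.
current_bundle = {
    "text/html": "<div>plot markup</div>",
    "application/vnd.holoviews_exec.v0+json": "",
}

# Assumed preference lists: a static viewer that ignores HTML, and a rich
# frontend that has the custom mime renderer installed.
static_viewer = ["image/svg+xml", "image/png", "text/plain"]
rich_frontend = ["application/vnd.holoviews_exec.v0+json", "text/html", "image/png"]

print(pick_mime(current_bundle, static_viewer))
print(pick_mime(current_bundle, rich_frontend))
```

Under these assumptions the static viewer finds nothing it can show, which is consistent with the missing previews mentioned at the top of the issue.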

> But it would also be nice to configure e.g. the matplotlib backend to render both a PNG and an SVG, and leave it to the user’s configured mime preferences to pick which one to display.

This is possible already but, as far as I can tell, completely undocumented. Specifically, you can set:

hv.Store.display_formats = ['png', 'svg', 'html']

When using the bokeh backend, this will in fact invoke Selenium to generate a screenshot. We would have to figure out whether that's at all desirable before advertising this properly.

@flying-sheep
Author

flying-sheep commented Nov 14, 2024

Thanks for taking the time to explain all that!

> the main problem is that each rendering backend has completely separate options, so if you configure the options for bokeh the Matplotlib output may look nothing like it because the options don't translate between the backends

I guess that means that users should clearly see when they’re configuring backend-specific things and when they configure backend-agnostic things. I thought that was already the case!

> hv.Store.display_formats = ['png', 'svg', 'html']

Ideally one would be able to select which backend maps to which mime type.
Selenium does sound like overkill, but you do have lightweight image rendering backends. (I’m aware that the above point about options applies.)
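
Roughly what I have in mind, as a sketch only: a mapping from mime type to backend that drives how the bundle is built. Everything here (`MIME_TO_BACKEND`, `build_bundle`, the placeholder renderers) is hypothetical and does not exist in HoloViews.

```python
from typing import Callable, Dict, Tuple


# Placeholder renderers; real ones would call into the plotting backends.
def matplotlib_svg(element: str) -> str:
    return f"<svg><!-- matplotlib rendering of {element} --></svg>"


def bokeh_html(element: str) -> str:
    return f"<div>bokeh rendering of {element}</div>"


# Hypothetical user-facing setting: mime type -> (backend name, renderer).
MIME_TO_BACKEND: Dict[str, Tuple[str, Callable[[str], str]]] = {
    "image/svg+xml": ("matplotlib", matplotlib_svg),
    "text/html": ("bokeh", bokeh_html),
}


def build_bundle(element: str, mapping) -> Dict[str, str]:
    """Render `element` once per configured mime type with its mapped backend."""
    return {mime: render(element) for mime, (_backend, render) in mapping.items()}


bundle = build_bundle("Curve", MIME_TO_BACKEND)
print(sorted(bundle))
```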

> I think that should be possible for renderers that support generating SVGs and PNGs cheaply, but I think the reason we are publishing HTML here is for consistency and the ability to center the outputs. I could easily be convinced that centering the outputs was a silly decision and we should just align with everything else but publishing raw SVG/PNG outputs by default still only makes sense when rendering static plots, as soon as you have a HoloMap or DynamicMap, it has to be HTML anyway.

I don’t like that HoloViz is publishing a PNG/SVG as HTML instead of as itself. Trust the frontends to know best how they want to display static images. Just add the metadata you have:

{
    "output_type": "execute_result" | "display_data",
    "execution_count": 42,
    "data": {
        "image/png": "[base64-encoded-multiline-png-data]",
        ...
    },
    "metadata": {
        "image/png": {
            "width": 640,
            "height": 480
        }
    }
}
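
A minimal sketch of producing such an output from raw PNG bytes, using only the standard library. The width and height are read from the PNG's IHDR chunk; `blank_png`, `png_size`, and `display_data_output` are hypothetical helper names, and the blank image only exists so the example needs no plotting library.

```python
import base64
import struct
import zlib


def _chunk(tag: bytes, payload: bytes) -> bytes:
    # A PNG chunk is: length, tag, payload, CRC over tag + payload.
    crc = zlib.crc32(tag + payload) & 0xFFFFFFFF
    return struct.pack(">I", len(payload)) + tag + payload + struct.pack(">I", crc)


def blank_png(width: int, height: int) -> bytes:
    """Build an all-white 8-bit RGB PNG of the given size."""
    ihdr = struct.pack(">IIBBBBB", width, height, 8, 2, 0, 0, 0)
    # Each scanline starts with filter byte 0, followed by RGB pixels.
    rows = (b"\x00" + b"\xff\xff\xff" * width) * height
    return (
        b"\x89PNG\r\n\x1a\n"
        + _chunk(b"IHDR", ihdr)
        + _chunk(b"IDAT", zlib.compress(rows))
        + _chunk(b"IEND", b"")
    )


def png_size(png: bytes):
    # Width and height are big-endian uint32 values at byte offsets 16 and 20.
    return struct.unpack(">II", png[16:24])


def display_data_output(png: bytes) -> dict:
    """Wrap PNG bytes in an nbformat-style display_data output with size metadata."""
    width, height = png_size(png)
    return {
        "output_type": "display_data",
        "data": {"image/png": base64.b64encode(png).decode("ascii")},
        "metadata": {"image/png": {"width": width, "height": height}},
    }


output = display_data_output(blank_png(640, 480))
print(output["metadata"])
```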

@philippjfr
Member

Sorry for my delayed response, have been out for a little while.

> I guess that means that users should clearly see when they’re configuring backend-specific things and when they configure backend-agnostic things. I thought that was already the case!

That actually was the case in the past and has now been lost to some extent, because we found it was actually more confusing to users. Specifically, the way options in HoloViews work is that we have so-called plot options and so-called style options. Oversimplifying a little, plot options control things about the figure and the axes, while style options control things about the artists (or glyphs, as Bokeh calls them). We have tried to make sure that plot options are generally consistent across backends, with certain exceptions (e.g. fig_size in Matplotlib and width/height in Bokeh/Plotly). The style options, on the other hand, are simply passed straight down to each plotting backend and can therefore differ significantly.

In the past you'd define plot and style options separately and therefore had a fairly clear delineation. At some point we discovered that this was more awkward for most use cases, though, and simply combined them. A translation layer for style options is something that we've often thought about but dismissed as potentially too error-prone and complex to be worth doing. In hvPlot we got to largely paper over this problem by inheriting the pandas plotting API and performing some level of translation.
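
To illustrate what such a translation layer would involve, and where it gets fragile, here is a deliberately tiny sketch. The table and `translate_style` are hypothetical and not part of HoloViews or hvPlot. The keys follow Bokeh and Matplotlib property names, but the table is nowhere near complete, and a real layer would also have to translate values, not just names.

```python
# Hypothetical, incomplete name mapping from Bokeh-style line options to
# Matplotlib-style ones. Values are passed through untranslated here.
BOKEH_TO_MPL = {
    "line_width": "linewidth",
    "line_color": "color",
    "line_dash": "linestyle",
    "line_alpha": "alpha",
}


def translate_style(style: dict, table: dict):
    """Split `style` into translated options and keys with no known counterpart."""
    translated, untranslatable = {}, []
    for key, value in style.items():
        if key in table:
            translated[table[key]] = value
        else:
            untranslatable.append(key)
    return translated, untranslatable


mpl_style, skipped = translate_style(
    {"line_width": 2, "line_dash": "dashed", "hover_line_color": "red"},
    BOKEH_TO_MPL,
)
print(mpl_style, skipped)
```

Options that only make sense in one backend, like the hover colour here, end up in the skipped list, which hints at why a full translation layer is hard to get right.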

> Ideally one would be able to select which backend maps to which mime type. Selenium does sound like overkill, but you do have lightweight image rendering backends. (I’m aware that the above about options applies)

I'd be in favor of enabling svg and png static output by default in the mime bundles for Matplotlib output. We just need consensus on our end that we disable the default centering behavior, which really is the only reason we use HTML here in the first place.

@flying-sheep
Author

Hi! I’m back from holidays now too!

> We just need consensus on our end that we disable the default centering behavior, which really is the only reason we use HTML here in the first place.

Yes, please find that consensus. If there’s the slightest doubt about that being the way to go, I’m happy to add a lot of strong opinions about the benefits of interoperability and passing on semantic information instead of opaque blobs.
