Large timeseries #1205
While this guide is useful on its own, I believe it needs to be significantly reworked before being integrated into the docs. It feels to me like it was written as a standalone guide. Actually, I believe it could pretty easily be turned into a nice blog post!
- The guide should be more integrated into the docs, probably moving some of its content to the Time Series Data guide.
- The guide focuses too much on Bokeh; hvPlot supports Matplotlib and Plotly too.
- The guide could link more to resources in HoloViews and Datashader.
More specifically:
- "The old way: Bokeh's custom Canvas rendering" and "WebGL: new baseline for timeseries plotting": I'm not sure we should mention how things used to be. Ideally, we'd have a guide specific to Bokeh, like Plotting with Bokeh in HoloViews' docs, that mentions WebGL.
- "Datashader rasterizing": I just realized that explaining Datashader is difficult. I don't know how many notebook users will understand this sentence: "Datashader works in a different way, rendering the data into a frame buffer on the server, and then sending that buffer to the web browser rather than the individual data points." We're also defining Datashader in multiple places in hvPlot's docs. Ideally, again, we could have a "Large data" guide that would be the only place where we define and explain Datashader, and link to it from other places. I also find that the guide doesn't say enough about anti-aliasing; I bet most users aren't familiar with it and need more explanation.
- "Minimap": What should be the main reference place to introduce the RangeToolLink in hvPlot's docs? Currently it's only used in the OHLC guide. Should it be in the yet-to-be-made Plotting with Bokeh guide? It seems to me the minimap approach could already be introduced in the Time Series Data guide.
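One way to make the quoted Datashader sentence more concrete for notebook users might be a toy illustration of the idea. The sketch below is not Datashader's implementation or API; `rasterize_points` and its parameters are hypothetical, plain-Python stand-ins. It only shows that what gets sent to the browser has a fixed size set by the image resolution, however many data points go in.

```python
import math


def rasterize_points(xs, ys, width, height, x_range, y_range):
    """Count points per pixel in a fixed-size grid (a toy 'frame buffer').

    Hypothetical helper for illustration only; Datashader does this far
    more efficiently and with many more options.
    """
    x0, x1 = x_range
    y0, y1 = y_range
    grid = [[0] * width for _ in range(height)]
    for x, y in zip(xs, ys):
        if not (x0 <= x <= x1 and y0 <= y <= y1):
            continue  # point lies outside the current viewport
        # Map data coordinates to pixel indices, clamping the upper edge.
        col = min(int((x - x0) / (x1 - x0) * width), width - 1)
        row = min(int((y - y0) / (y1 - y0) * height), height - 1)
        grid[row][col] += 1
    return grid


# 100,000 samples of a sine wave (made-up data).
n = 100_000
xs = [i / n for i in range(n)]
ys = [math.sin(2 * math.pi * 5 * x) for x in xs]

# The result is always 20 rows by 60 columns, regardless of n.
image = rasterize_points(xs, ys, width=60, height=20,
                         x_range=(0, 1), y_range=(-1, 1))
```

Zooming then amounts to calling the function again with a narrower `x_range` and `y_range`, which is roughly what happens on the server each time the viewport changes.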
Reviewing the published page https://holoviz-dev.github.io/hvplot/user_guide/visualizing_large_timeseries.html rather than the source code:
Title of notebook needs to be capitalized to match others and have a reasonable title. hvPlot is inherently a plotting library, so "visualizing" seems redundant. Just "Large_Timeseries", maybe?
When pages are built for our website, they use the default Datashader image size. The default image size is intentionally set to a low value to avoid generating a large image that is then thrown away in interactive usage, where the plot updates to the actual display resolution via a RangeXY callback. Here, because the callback is never invoked, the image is rendered at a very low resolution, which looks bad on the website. I think the images can be improved by including a cell like this early in the notebook:
from holoviews.operation.resample import ResampleOperation2D
ResampleOperation2D.width=1200
ResampleOperation2D.height=500
Presumably this note should be omitted, and an issue opened instead.
The example shows no data by default; presumably we should put some sort of initial range in there that causes data to display when exported to HTML? The instructions also don't seem to match the plot; there's no grey box visible, and once you pan to find one, it's not a small rectangle but something larger than the plot, which doesn't seem right. Plus panning and zooming in the bottom plot make it very easy to get lost; I would think that the minimap should not have any y axis panning and no zooming, just x panning. It should be hard to shoot yourself in the foot or get lost.
Yes, this was a standalone guide, and we decided that hvplot was where it should end up. I agree it will make a nice blog post when we are done, but it should also have a permanent home in our docs so that people can figure out the best way to deal with their large timeseries data.
I could be convinced otherwise, but my first guess would be that the Time Series guide should lose its LTTB section and instead it should have a section at the end suggesting that people look at this separate guide if they have large timeseries or want to look at many of them together. It's a lot of content already and I don't think it's relevant to people with small timeseries.
That's a general tension in the hvPlot docs that I believe remains unresolved -- how do we show how the backends differ, as well as how the various data sources differ? I don't think this one is particularly different in that respect, but if it is, it can have an explicit statement that these examples focus on Bokeh but in some cases similar functionality is available for the other backends.
Maybe don't say it's the old way, then, but just mention that it's an option and that it's not recommended any more.
I probably wrote that; any suggestions on how to make it clearer?
Sounds good. I hear you volunteering to write that! :-)
I think we can put in a link that explains it.
Good question! It seems to me that the minimap is primarily useful for large timeseries, and so to me it belongs here, in the large timeseries notebook. @hoxbro, can you link to the issues that detail the remaining warts and areas for improvement in this notebook? I think you mentioned that they existed but I don't see how to get to them from here.
I'm starting to address the points raised above. I've collected the tasks in a board
I've added it, along with an explainer admonition, but it's a bit awkward to add the following to the notebook. Ideally, we could either run this in the CI workflow somehow or use a hidden cell (not sure how).
from holoviews.operation.resample import ResampleOperation2D
ResampleOperation2D.width=1200
ResampleOperation2D.height=500
@droumis I made a couple of small changes to attempt to hide the cell we were talking about the other day (the one setting the resampling dimensions). This is usually supported by MyST-NB by adding the I added a small comment to make it clear there's something special about this cell; it's not so obvious otherwise when you work from JupyterLab/Notebook. I'm planning to release hvPlot 0.9.1 today; how do you feel about this PR? It seems to me it's in a much better state and it could go in as is. I'm even fine having some sections marked as
re: hidden cell, that's great to see, @maximlt! I think that will help us in several other places across the HoloViz docs. re: merging now, let's wait for Bokeh 3.4 and the next HoloViews release, as this notebook requires Bokeh #13603 and benefits from HoloViews #6030. I'm also actively working on the things marked WIP. I'd also really like to see auto-ranging for multiple lines fixed before this is released, which @jlstevens will hopefully have time to address in early Jan.
I processed some real spike waveform data to create a new Datashader section on plotting many lines across multiple categories. As far as I could figure out, until we resolve the relevant data format issues in HoloViews, the simplest way for hvPlot is to add NaN separators to a dataframe, so I've done that step prior to uploading the data and just explained it in the notebook. It's up on the dev website. Unless there are any further comments, I think we are just waiting on Bokeh 3.4 and the next HoloViews release to merge this PR. If autoranging, Datashader inspections, or this NaN issue gets resolved before then, great, but those can also be follow-ups.
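For readers unfamiliar with the NaN-separator step, here is a minimal sketch of the idea. The data is made up and `join_with_nan_separators` is a hypothetical helper, not the preprocessing actually applied to the spike waveform data.

```python
import math


def join_with_nan_separators(lines):
    """Flatten several (xs, ys) lines into one pair of columns.

    A NaN entry is appended after each line so that a line plot breaks
    there instead of drawing a segment from the end of one line to the
    start of the next. Hypothetical helper for illustration only.
    """
    flat_x, flat_y = [], []
    for xs, ys in lines:
        flat_x.extend(xs)
        flat_y.extend(ys)
        flat_x.append(math.nan)  # separator row
        flat_y.append(math.nan)
    return flat_x, flat_y


# Two short made-up waveforms sharing the same x samples.
waveforms = [
    ([0, 1, 2], [0.0, 1.0, 0.5]),
    ([0, 1, 2], [0.2, 0.8, 0.1]),
]
flat_x, flat_y = join_with_nan_separators(waveforms)
```

The two flat columns can then go into a dataframe, with an extra column for the category if lines from several categories are stored together.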
@droumis, that new plot looks great! So nice to see that after years of just imagining it. :-) Are the new issues you found when doing that now part of https://github.com/orgs/holoviz/projects/14/views/2 ? If not, please add them there. We've come up with a nice, comprehensive set of issues to address; now we just need to address them!
Superseded by #1302
This adds a notebook that explains the different ways of working with large time-series datasets with HoloViz.