Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for timeseries charts #3214

Merged
merged 9 commits into from
Mar 16, 2023
Merged

Add support for timeseries charts #3214

merged 9 commits into from
Mar 16, 2023

Conversation

jreyesr
Copy link
Contributor

@jreyesr jreyesr commented Mar 11, 2023

This PR adds support for time-based data to charts (specifically column, line and area charts)

Closes #1389

Problem

Currently, all charts assume that their X axis (the independent variable) is categorical. This works well, for example, to plot a count of servers for each AWS region, since regions are categories. However, when plotting for example logins by day, this does not work so well: for example, see below, a count of events per month. Note that March is missing, since it had no data. The same happens for July, and September to November. However, the categorical nature of the axis hides those gaps.
image
Time-based data would be better represented as this:
image

Scope of changes

This PR edits the Chart element in the frontend, and adds a new parameter type to the chart.axes.x block in HCL.
The Chart element gets support for the time value in the new parameter, which enables a new code path when passed.

Backwards compatibility

Barring bugs, this change should be completely backwards-compatible. This is because the new parameter is optional and, if not provided, none of the new code runs. Thus, old dashboards will continue working as before, but new dashboards can make use of the new parameter. In effect, the new functionality is opt-in and thus shouldn't alter any existing dashboards.

Example

This is quite long, click to expand (Note: This data comes from an unreleased Bitcoin plugin of mine, it's the easiest way for me to generate time-based data)

This query displays the funds that came into and out of a certain Bitcoin wallet, grouped by day: "On day X, wallet A received a total of Y satoshis and sent a total of Z satoshis"

chart {
  sql = <<EOQ
    select
      to_char(date_trunc('day', time::date), 'YYYY-MM-DD'),
      sum(case when wallet_balance > 0 then wallet_balance else 0 end) / 1e8 as "Received",
      sum(case when wallet_balance < 0 then wallet_balance else 0 end) / 1e8 as "Sent"
    from blockchain_transaction
    where wallet = $1
    group by 1;
  EOQ

  width = 12
  args =  [self.input.wallet.value]

  axes {
    x {
      type = "time"
    }
  }
  series "Received" {
    color = "green"
  }
  series "Sent" {
    color = "red"
  }
}

A snippet of the data returned:

time Received Sent
2022-10-26 14.05 0
2022-10-27 6.58213227 -0.05
2022-10-28 52.83391916 -20.58213227
2022-10-29 0 -10
2022-10-30 0 -42.83391916
BLUE LINE IS HERE
2023-01-07 7.34957389 0
2023-01-08 6.41791129 -7.34957389
2023-01-09 0 6.41791129

Old result (with no time-series behavior)
image
At the blue line there's a jump between 2022-10-30 and 2023-01-07, but that is not displayed.

New result (with time-series activated)
image
Gaps between data are now respected, and the X axis has the notion of hierarchical time (see, for example, how it places the 2023 label right when 2023 starts, and from there it just displays the months, even though data comes once per day)

@jreyesr jreyesr marked this pull request as ready for review March 11, 2023 15:13
@MichaelBurgess
Copy link
Contributor

@jreyesr thanks for the PR! I was going to leave a comment asking if you could add some stories, but I see you pushed some changes for that now anyhow.

For the stories, could we also look to add single time-series stories (e.g.SingleTimeSeries) and also a MultiTimeSeriesGrouped for the relevant chart types that support grouping? So column chart would be a good example of one that supports grouping for multi-series - see image below:

image

@jreyesr
Copy link
Contributor Author

jreyesr commented Mar 13, 2023

@MichaelBurgess More stories have been added. Current state is:

  • Column chart: Single time series, multiple time series (grouped), multiple time series (stacked)
  • Area chart: Single time series, multiple time series (grouped), multiple time series (stacked)
  • Line chart: Single time series, multiple time series (grouped)

@MichaelBurgess
Copy link
Contributor

MichaelBurgess commented Mar 13, 2023

@jreyesr thanks! Interesting qu linked to the below - when you first discovered that we didn't support sparse time series data, did you throw some time series data at Steampipe and assume that it would automatically handle it, or would you have expected to have to set some kind of config to indicate the behaviour you wanted?

@e-gineer I was speaking with @johnsmyth before and he put forward the idea that perhaps we should just automatically treat the data as time series if the first column is one of a fixed set of pg time data columns?

This could work quite nicely and if you didn't want that behaviour, you could simply cast your column back to something like TEXT?

We might need to consider some formatting options for the auto-time series approach, but even without them we would cover @jreyesr's case and remove the need to add more config (and bring with it the fun of naming things!). We'd also need to check any impact on our existing mods that use any time series data (the proposed default might be desirable, but would be worth checking).

Would be keen to know people's thoughts on this...

@judell
Copy link
Contributor

judell commented Mar 13, 2023

we should just automatically treat the data as time series if the first column is one of a fixed set of pg time data columns?

FWIW I think that's how Metabase does it.

@jreyesr
Copy link
Contributor Author

jreyesr commented Mar 13, 2023

when you first discovered that we didn't support sparse time series data, did you throw some time series data at Steampipe and assume that it would automatically handle it, or would you have expected to have to set some kind of config to indicate the behaviour you wanted?

I began by just throwing data at it, ensuring that the first column was of DATETIME type. When I found out that it didn't keep the spaces in data, I checked to see if there was a configuration flag to enable it, and then I found out that all the examples used categorical data.

@MichaelBurgess
Copy link
Contributor

Hey @jreyesr sorry for the slight delay in replying. We had a bit of a discussion on this one and we really like the idea of making the time series switching automatic - so we'd check the data time of the first column and if we detect a timestamptz column then use the time series logic, else assume it's a category.

This avoids us needing to add more config and we believe will feel intuitive to users - sounds like you expected it to work this way to begin with, so that's good confirmation this is the right way to go.

Are you happy to have a go at updating the PR with that approach? Thanks!

@jreyesr
Copy link
Contributor Author

jreyesr commented Mar 14, 2023

@MichaelBurgess Sure! I'll take the conversation back to Slack, since I expect to have some questions, so as not to clutter the issue.

@MichaelBurgess
Copy link
Contributor

@jreyesr added a v.minor Typescript typing suggestion change - but things are checking out well from some local testing - so I think after that small tweak we can proceed.

I think RE refactoring the code, we could consider that on a separate change?

@jreyesr jreyesr changed the title Add suport for timeseries charts Add support for timeseries charts Mar 15, 2023
Co-authored-by: Mike Burgess <mike@turbot.com>
@jreyesr
Copy link
Contributor Author

jreyesr commented Mar 16, 2023

Done!
RE the refactoring: sure, let's keep this PR focused. Especially since, AFAICT, most of the FE has no tests, so adding them would balloon into a fairly big change.

@MichaelBurgess
Copy link
Contributor

MichaelBurgess commented Mar 16, 2023

There are some in certain areas - yarn test will run them - they're hooked into the CLI release process also.

I've been introducing tests in areas as-and-when there's been a need to refactor. Over time we'll get the coverage up.

@MichaelBurgess MichaelBurgess merged commit ffd1a38 into turbot:main Mar 16, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

more display features for time series data in charts
3 participants