[Proposal] [Vis Builder] Aggregation Persistence in VisBuilder #3482

abbyhu2000 · 2023-02-22T00:46:14Z

As a result of the research task on aggregation persistence for Vis Builder (#2900 (comment)), I propose the following:

Proposal: Metric data to metric data, bucket data to bucket data

After implementing global query persistence and app persistence, Vis Builder should also be able to persist aggregation values across compatible visualization types and, ideally, to some degree between incompatible visualization types.

All aggregation schema fields are divided into two categories: metric and bucket. A metric field holds numerical data, and a bucket field holds categorical data. Since numerical and categorical data tend to serve different purposes in a visualization, another approach is to map every metric field to a metric field and every bucket field to a bucket field.

Implementation idea:

Each schema field has a group property, which is either AggGroupNames.Metrics or AggGroupNames.Buckets. We can collect one list of the aggregations that belong to the metrics group and another list for the bucket group, then map them to the new visualization type’s metrics group and bucket group.

schemas: new Schemas([
  {
    group: AggGroupNames.Metrics,
    ...
    min: 1,
    max: 3,
  },
  {
    group: AggGroupNames.Buckets,
    ...
    min: 0,
    max: 1,
  },
]),

export const AggGroupNames = Object.freeze({
  Buckets: 'buckets' as 'buckets',
  Metrics: 'metrics' as 'metrics',
  None: 'none' as 'none',
});
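The collection step described above could be sketched roughly as follows. This is only an illustration: `SchemaField`, `GroupedFields`, and `groupSchemaFields` are hypothetical names and simplified shapes, not the actual Schemas API.

```typescript
// Hypothetical, simplified stand-ins for the real schema types.
type AggGroup = 'buckets' | 'metrics' | 'none';

interface SchemaField {
  name: string; // illustrative field name
  group: AggGroup;
  min: number;
  max: number;
}

interface GroupedFields {
  metrics: SchemaField[];
  buckets: SchemaField[];
}

// Split a visualization type's schema fields into a metric list and a
// bucket list, preserving declaration order. Fields in the 'none' group
// are left out of both lists.
function groupSchemaFields(fields: SchemaField[]): GroupedFields {
  return {
    metrics: fields.filter((f) => f.group === 'metrics'),
    buckets: fields.filter((f) => f.group === 'buckets'),
  };
}

// Example mirroring the min/max bounds in the schema snippet above.
const exampleFields: SchemaField[] = [
  { name: 'metric', group: 'metrics', min: 1, max: 3 },
  { name: 'segment', group: 'buckets', min: 0, max: 1 },
];
const groupedExample = groupSchemaFields(exampleFields);
```

Keeping declaration order in both lists matters for any rule that later maps fields by position.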

Pros & Cons:

Pros: The rules are simple to follow, and the approach is scalable since every schema field belongs to one of the two groups.
Cons:
- Some aggregation mappings might not make sense after switching to a new visualization type, which could make for a confusing user experience.
- Further mapping rules are needed, since each schema might have multiple metric groups and multiple bucket groups, and each metric or bucket group might have different min and max bounds. We need to define the order of mappings and what happens if there are more fields than can be mapped.

Implementation reason:

Since a metric field is mostly for displaying numerical data, and a bucket field is mostly for separating data into groups depending on how a visualization can be split up, I think it makes sense to map the aggregations to fields according to their function. For aggregations that were previously in a metric group, the user’s intent is probably just to display that data against some type of unit. If we map them into a metric group in another visualization type, the data will still be displayed, just in a different format. For aggregations that were previously in a bucket group, the user’s intent is probably to break the global data into separate groups and see whether any patterns exist in each group. If we map those into a new bucket group, we are still following the user’s intent of separating global data into groups.

Mapping rules:

Collect a list of aggregations that are in the metric group, and a list of aggregations that are in the bucket group.
For aggregations that previously belonged to the metric group:
- First, check whether any new metric fields have the same name. If so, map the aggregations to those new metric fields and drop the ones that exceed the max count.
- Second, for the metric fields that do not have the same name, start adding them to the metric field that has the highest max count allowed, and drop the ones that can no longer be mapped to any metric field.
For aggregations that previously belonged to the bucket group:
- First, check whether any new bucket fields have the same name. If so, map the aggregations to those new bucket fields and drop the ones that exceed the max count.
- Second, for the bucket fields that do not have the same name, start adding them to the new bucket field that has the highest max count allowed, and drop the ones that can no longer be mapped to any bucket field.
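One possible reading of these rules, sketched for a single group (metric or bucket). The names `FieldSpec`, `AggsByField`, and `mapByNameThenCapacity` are hypothetical, aggregations are represented as plain labels, and the field names and max counts in the example are assumptions chosen for illustration.

```typescript
interface FieldSpec {
  name: string;
  max: number;
}

// Field name -> aggregation labels currently in that field.
type AggsByField = Record<string, string[]>;

function mapByNameThenCapacity(oldAggs: AggsByField, newFields: FieldSpec[]): AggsByField {
  const result: AggsByField = {};
  for (const f of newFields) {
    result[f.name] = [];
  }
  const leftovers: string[] = [];

  // Step 1: same-name fields map directly; anything over max is dropped.
  for (const oldName of Object.keys(oldAggs)) {
    const match = newFields.filter((f) => f.name === oldName)[0];
    if (match) {
      result[match.name] = oldAggs[oldName].slice(0, match.max);
    } else {
      leftovers.push(...oldAggs[oldName]);
    }
  }

  // Step 2: remaining aggregations fill the new fields, highest max first.
  // Whatever is still left over once every field is full is dropped.
  const byCapacity = newFields.slice().sort((a, b) => b.max - a.max);
  for (const f of byCapacity) {
    while (leftovers.length > 0 && result[f.name].length < f.max) {
      result[f.name].push(leftovers.shift() as string);
    }
  }
  return result;
}

// Bucket aggregations from a bar chart mapped onto table-like bucket fields.
const barBuckets: AggsByField = {
  'x-axis': ['timestamp per hour'],
  'split series': ['day of week'],
  'split chart': ['flight delay'],
};
const tableBucketFields: FieldSpec[] = [
  { name: 'split rows', max: 10 }, // assumed max
  { name: 'split table in rows', max: 1 },
  { name: 'split table in columns', max: 1 },
];
const barToTable = mapByNameThenCapacity(barBuckets, tableBucketFields);
```

With no matching names, all three bucket aggregations land in the field with the highest max count, and the other two fields stay empty.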

Mapping example:

Here is an example for Bar:
Y-axis: Unique count of flight delay + Unique count of flightTimeHour
X-axis: timestamp per hour
Split series: day of week/descending
Split chart: flight delay/descending

If we switch from Bar to Line chart:
Y-axis: Unique count of flight delay + Unique count of flightTimeHour
X-axis: timestamp per hour
Split series: day of week/descending
Split chart: flight delay/descending
Radius: None

If we switch from Bar to Table vis:
Metric: Unique count of flight delay + Unique count of flightTimeHour
Split rows: timestamp per hour + day of week/descending + flight delay/descending
Split table in rows:
Split table in columns:

If we switch from Bar to Metric:
Metric: Unique count of flight delay + Unique count of flightTimeHour
Split groups: timestamp per hour + day of week/descending + flight delay/descending

UI/UX proposal:

To avoid over-engineering and a confusing user flow, I propose that we keep the mapping rule simple and scalable, and give users the option to turn this aggregation persistence feature on or off. Possible options:

- Persist by default and remove the popup window ‘Change visualization type’.
- Persist by default, and have a mechanism/button for users to reset the page.
- Add a toggle button on the Vis Builder page to let users turn this feature on or off.
- On the popup window (as shown below) that appears after the user switches the visualization type, add another button or toggle saying "Change type and persist current aggregations".

abbyhu2000 · 2023-02-22T00:49:09Z

@KrooshalUX Could you please provide some insight on this? For the UI/UX section of the proposal, if we are going to implement the aggregation persistence on the vis builder page when we switch visualization type, what do you think we should do on the UI and user experience?

abbyhu2000 · 2023-02-22T00:51:01Z

@joshuarrrr @ashwin-pc @kavilla @ananzh Could you guys provide your insights on the above proposed mapping rules?

ashwin-pc · 2023-02-22T10:17:56Z

I like the proposed solution here. Something you might want to play around with to see if it makes more sense is:

Second, for the metric/bucket fields that do not have same name, start adding them to the metric/bucket field that have the most max count allowed, and drop the ones that can no longer be mapped to any metric/bucket field.

I wonder if going by the order in the schema would be preferable here instead of the max count, since it's sometimes possible that the max count is high for a schema field but it's a less important breakdown than one higher up in the list.

abbyhu2000 · 2023-02-23T19:04:11Z

I like the proposed solution here. Something you might want to play around with to see if it makes more sense is:

Second, for the metric/bucket fields that do not have same name, start adding them to the metric/bucket field that have the most max count allowed, and drop the ones that can no longer be mapped to any metric/bucket field.

I wonder if going by the order in the schema would be preferable here instead of the max count, since it's sometimes possible that the max count is high for a schema field but it's a less important breakdown than one higher up in the list.

As I implement this, I do think mapping by order makes more sense. Here are some more questions: @ashwin-pc

Should we assume, for now, that there will only be two groups, metric and bucket? I think we can, because every field should belong to one of the two groups.
When new vis types are added, since we implement the persistence mapping by order, should we set some rules for schema developers to follow? For example, add a README stating:
- define metric schema fields before the bucket schema fields
- define more important schema fields before the less important ones

abbyhu2000 · 2023-02-23T19:08:01Z

Updated proposed mapping rules:

Collect a list of aggregations that are in the metric group, and a list of aggregations that are in the bucket group.
For aggregations that previously belonged to the metric group:
- Map metric fields according to their order in the schema: the first one from the old vis maps to the first one from the new vis, the second to the second, and so on. If they have different max counts, drop the additional aggregations that cannot be mapped. (Instead of mapping the additional ones to the next available metric field, I think it makes more sense to drop them, because I assume each field serves its own purpose, so we should map all the aggregations within one field strictly to one other field.)
- If the old vis has more metric fields than the new vis, drop all the aggregations from the additional metric fields.
For aggregations that previously belonged to the bucket group:
- Map bucket fields according to their order in the schema: the first one from the old vis maps to the first one from the new vis, the second to the second, and so on.
- If they have different max counts, drop the additional aggregations that cannot be mapped.
- If the old vis has more bucket fields than the new vis, drop all the aggregations from the additional bucket fields.
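A minimal sketch of the positional rule as written above, applied to one group at a time. `SourceField`, `DestField`, and `mapFieldsByOrder` are hypothetical names, and the field names and max counts in the example are made up for illustration.

```typescript
interface SourceField {
  name: string;
  aggs: string[]; // aggregation labels, in the order the user added them
}

interface DestField {
  name: string;
  max: number;
}

// Map the i-th old field onto the i-th new field. Aggregations beyond the
// new field's max are dropped, and old fields without a counterpart in the
// new vis are dropped entirely.
function mapFieldsByOrder(oldFields: SourceField[], newFields: DestField[]): SourceField[] {
  return newFields.map((dest, i) => {
    const source = i < oldFields.length ? oldFields[i].aggs : [];
    return { name: dest.name, aggs: source.slice(0, dest.max) };
  });
}

// Three old bucket fields mapped onto a vis type with two bucket fields.
const oldBucketFields: SourceField[] = [
  { name: 'first', aggs: ['F', 'G'] },
  { name: 'second', aggs: ['O'] },
  { name: 'third', aggs: ['P'] },
];
const newBucketFields: DestField[] = [
  { name: 'x', max: 1 },
  { name: 'y', max: 3 },
];
const positional = mapFieldsByOrder(oldBucketFields, newBucketFields);
```

Here `x` keeps only F (G exceeds the max count of 1), `y` receives O, and P is dropped because the new vis has no third bucket field.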

abbyhu2000 · 2023-02-24T17:17:19Z

Examples:

Persistence among Histogram, Line, and Area is pretty straightforward. Below are some more confusing aggregation persistence cases involving Metric and Table:

If the Table vis type has aggregations such as:
Metric: A, B, C, D, E
Split Rows: F, G, H, I, J, K, L, M, N
Split table in rows: O
Split table in columns: P

If we switch to Metric:
Metric: A, B, C, D, E
Split group: F

If we switch to Area:
Y-axis: A, B, C (drop D and E since Y-axis is the only metric group and the max count is 3)
X-axis: F
Split series: G, H, I
Split chart: J

If we switch to Line:
Y-axis: A, B, C (drop D and E since Y-axis is the only metric group and the max count is 3)
X-axis: F
Split series: G, H, I
Split chart: J
Dot size: K

Here we assume that users have entered their aggregations in the order of the schema, which means they finish entering aggregations for the Y-axis and then move on to the next field, the X-axis. Maybe we also need to tell users that, for the best persistence experience, they should enter their aggregations in the order of the schema fields? @ashwin-pc
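The Table to Area example above reads as if each group's aggregations are treated as one ordered list that fills the new fields in schema order. A sketch under that reading follows; `fillInOrder` and its types are hypothetical, and apart from the Y-axis max of 3 the max counts are inferred from the example rather than taken from the actual schemas.

```typescript
interface FillTarget {
  name: string;
  max: number;
}

interface FilledField {
  name: string;
  aggs: string[];
}

// Walk the new fields in schema order, giving each one the next `max`
// aggregations from the flattened list. Anything left at the end is dropped.
function fillInOrder(
  aggs: string[],
  targets: FillTarget[]
): { fields: FilledField[]; dropped: string[] } {
  let cursor = 0;
  const fields = targets.map((t) => {
    const taken = aggs.slice(cursor, cursor + t.max);
    cursor += taken.length;
    return { name: t.name, aggs: taken };
  });
  return { fields, dropped: aggs.slice(cursor) };
}

// Table aggregations flattened per group, in schema order.
const tableMetrics = ['A', 'B', 'C', 'D', 'E'];
const tableBuckets = ['F', 'G', 'H', 'I', 'J', 'K', 'L', 'M', 'N', 'O', 'P'];

const areaMetricFill = fillInOrder(tableMetrics, [{ name: 'Y-axis', max: 3 }]);
const areaBucketFill = fillInOrder(tableBuckets, [
  { name: 'X-axis', max: 1 },
  { name: 'Split series', max: 3 }, // inferred max
  { name: 'Split chart', max: 1 },
]);
```

Under these assumptions the Y-axis gets A, B, C with D and E dropped, the X-axis gets F, Split series gets G, H, I, and Split chart gets J, which matches the Area result listed above.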

ashwin-pc · 2023-03-03T22:17:29Z

@abbyhu2000 I'd suggest just documenting this in the code as a comment since this is important only to the visualization type authors

ashwin-pc · 2023-03-14T07:47:46Z

@abbyhu2000 can we close this issue now?

abbyhu2000 added discuss proposal vis builder labels Feb 22, 2023

abbyhu2000 self-assigned this Feb 22, 2023

github-actions bot added the untriaged label Feb 22, 2023

ashwin-pc removed the untriaged label Feb 22, 2023

abbyhu2000 mentioned this issue Feb 23, 2023

[Vis builder] Persist data when switching visualization types #2763

Closed

4 tasks

abbyhu2000 mentioned this issue Feb 24, 2023

[Vis Builder] Add metric to metric, bucket to bucket aggregation persistence #3495

Merged

8 tasks

abbyhu2000 mentioned this issue Mar 3, 2023

[Vis Builder] Agg data persistence when switching among area, line and and histogram vis type #3157

Closed

abbyhu2000 closed this as completed Mar 31, 2023

abbyhu2000 mentioned this issue Mar 31, 2023

[Fix] VisType switching persistence and selectively show warning #3715

Merged

8 tasks

ashwin-pc mentioned this issue Apr 6, 2023

[Vis Builder] Vis Builder meta issue #1157

Open
