Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

reduce: proportion for counts #399

Closed
Fil opened this issue May 14, 2021 · 2 comments
Closed

reduce: proportion for counts #399

Fil opened this issue May 14, 2021 · 2 comments
Labels
enhancement New feature or request

Comments

@Fil
Copy link
Contributor

Fil commented May 14, 2021

As @yurivish remarks in https://observablehq.com/d/0d9e55233e524500 the "proportion" reducer doesn't get an automatic label if we don't pass x, and crashes if we pass it a string channel x (since the sum of strings is 0).

@Fil Fil added the enhancement New feature or request label May 14, 2021
@yurivish
Copy link
Contributor

yurivish commented May 14, 2021

With the "proportion-count" approach we would also need "proportion-facet-count" and things start feeling a bit combinatorial. Though maybe we don't need combinations other than those two?

Implementation-wise I was imagining a switch on the type of the first non-null nonundefined element, then computing either the sum or the number of elements in x, similarly to how type inference is talked about in the documentation:

For simplicity’s sake, Plot assumes that data is consistently typed; type inference is based solely on the first non-null, non-undefined value.

Fil added a commit that referenced this issue May 18, 2021
mbostock pushed a commit that referenced this issue Aug 1, 2021
@mbostock
Copy link
Member

mbostock commented Aug 1, 2021

In the same way that the sum reducer “crashes” (i.e. produces NaN output) when given non-numeric input values to sum, I think it’s desirable that the proportion reducer behave the same way when given input values that are non-numeric. In other words, just as the sum reducer does not (and should not) treat all the input values as unity if the input values are non-numeric, nor should the proportion reducer accept non-numeric values when an input is specified. My recommendation in the motivating example is simply to remove the x: 'key' input to the proportion reducer:

Plot.stackX(
  Plot.groupZ(
    { 
      x: 'proportion', 
      text: 'first'
    },
    {
      fill: 'key',
      text: 'key',
      order: 'value',
      insetRight: 1
    }
  )
)

Regarding the automatic label, you shouldn’t need to change the semantics of the operation (i.e., switch from an unweighted to a weighted proportion) in order to pass through the label. There’s competition for the implicit label here: there’s the reducer (here proportion, which has the implicit label “Frequency”) and the group definition (here using the “key” column from the fill channel). I don’t think it’s appropriate to label the x-axis “key” here, since it represents frequency (i.e., the proportion), not the key. The data is grouped by key, but that encoding only exists in the fill and labels for the bars.

So, my vote is to close this issue (but please continue the discussion if you’re not convinced) and #410. But I’m going to open a separate issue to throw an error when you try to pass a non-numeric channel to something that requires numeric input, such as the proportion and sum reducer, so that Plot makes clearer what’s going wrong.

Ref. #410 (comment)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants