Support different dimension permutations #5

sethaxen · 2021-06-13T17:38:36Z

The functions currently assume draws are in a single array with shape (draws, chains, params), like MCMCChains.Chains stores. We should consider how to support different permutations (this could be as simple as recommending that users wrap their arrays in a PermutedDimsArray). For example, ArviZ defaults to (chains, draws, params). I'm not certain about Soss's SampleChains.

devmotion · 2021-06-13T18:05:17Z

> The functions currently assume draws are in a single array with shape (draws, chains, params)

They either work with vectors of draws or arrays of shape (draws, params, chains).

sethaxen · 2021-06-13T20:50:00Z

How would one pass a vector of draws to e.g. ess_rhat?

devmotion · 2021-06-13T20:52:51Z

No, it's not supported by ess_rhat. All functions either work with vectors or arrays of shape (draws, chains, params), but not both.

ParadaCarleton · 2021-06-19T18:27:45Z

> The functions currently assume draws are in a single array with shape (draws, chains, params), like MCMCChains.Chains stores. We should consider how to support different permutations (this could be as simple as recommending that users wrap their arrays in a PermutedDimsArray). For example, ArviZ defaults to (chains, draws, params). I'm not certain about Soss's SampleChains.

And I think Stan defaults to (params, chains, draws), giving us a head start on our apparent task of going through every possible permutation of indices.

devmotion · 2021-06-19T18:43:10Z

I don't think this package should support different permutations. IMO we need a clearly documented and consistent choice for such situations but then users have to permute the arrays if their data is in a different format.
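A minimal sketch of what "users have to permute the arrays" could look like, written in dependency-free Python with nested lists so the index bookkeeping is explicit. The helper `permute_3d` and the sample values are hypothetical and not part of this package; in Julia, `permutedims` or `PermutedDimsArray` from Base would do the same job on real arrays.

```python
def permute_3d(data, perm):
    """Return a new nested list whose axes are those of `data` reordered by `perm`.

    `perm[k]` names the source axis that becomes axis k of the result.
    """
    shape = (len(data), len(data[0]), len(data[0][0]))
    n0, n1, n2 = (shape[p] for p in perm)
    out = [[[None] * n2 for _ in range(n1)] for _ in range(n0)]
    for i in range(n0):
        for j in range(n1):
            for k in range(n2):
                # Map the output index (i, j, k) back to the source index.
                src = [0, 0, 0]
                src[perm[0]], src[perm[1]], src[perm[2]] = i, j, k
                out[i][j][k] = data[src[0]][src[1]][src[2]]
    return out


# Hypothetical data: 2 chains, 3 draws, 2 params in (chains, draws, params) order.
chains_first = [
    [[c * 100 + d * 10 + p for p in range(2)] for d in range(3)]
    for c in range(2)
]

# (chains, draws, params) -> (draws, params, chains): source axes 1, 2, 0.
draws_first = permute_3d(chains_first, (1, 2, 0))
```

After the call, `draws_first[d][p][c]` holds the same value as `chains_first[c][d][p]`.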

And just to reiterate, not all statistics work with 3d arrays of samples. Some work just with a vector of scalar-valued samples (one parameter, one chain) and I don't think this should be changed. Also the Rstar statistic works with a matrix of samples of size (draws, params) and a corresponding vector of chain indices (this is more general than a 3d array).

devmotion · 2021-06-19T18:55:32Z

I should add that I don't think we have to stick with the current convention but it would be good to be consistent for 3d arrays. I guess the choice should be motivated by what is the most convenient and efficient layout. Since Julia uses column-major order and Python row-major, probably it differs from what one would choose in Python.
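To make the column-major versus row-major point concrete, here is a small sketch that computes where an element lands in a dense block of memory under each convention. The function `linear_offset` and the shape are illustrative assumptions; indices are 0-based here, unlike Julia's.

```python
def linear_offset(index, shape, order):
    """Offset of a 0-based `index` in a dense array of the given `shape`.

    order="F": column-major (first axis varies fastest, as in Julia).
    order="C": row-major (last axis varies fastest, NumPy's default).
    """
    axes = range(len(shape)) if order == "F" else reversed(range(len(shape)))
    offset, stride = 0, 1
    for ax in axes:
        offset += index[ax] * stride
        stride *= shape[ax]
    return offset


shape = (1000, 4, 5)  # hypothetical (draws, params, chains)

# Distance in memory between consecutive draws of the same parameter and chain.
step_f = linear_offset((1, 0, 0), shape, "F") - linear_offset((0, 0, 0), shape, "F")
step_c = linear_offset((1, 0, 0), shape, "C") - linear_offset((0, 0, 0), shape, "C")
```

With a (draws, params, chains) shape, consecutive draws are adjacent in column-major order (`step_f` is 1) but 4 * 5 = 20 elements apart in row-major order (`step_c` is 20), which is why the most cache-friendly axis order can differ between Julia and Python.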

ParadaCarleton · 2021-06-21T16:57:44Z

> I should add that I don't think we have to stick with the current convention but it would be good to be consistent for 3d arrays. I guess the choice should be motivated by what is the most convenient and efficient layout. Since Julia uses column-major order and Python row-major, probably it differs from what one would choose in Python.

I mean, being able to work with different permutations of indices comes as a free side effect of using something like AxisKeys.jl, which I think we should be using anyways to avoid bugs from messing up the order we're indexing in. We can always provide a free "Fallback" that assumes dimensions are ordered in some clearly specified way.

devmotion · 2021-06-21T17:00:16Z

Please no, let's just stick with generic AbstractArrays - one main motivation here is to get rid of the Chains/AxisArrays mess and just be as generic as possible.

ParadaCarleton · 2021-06-23T01:59:48Z

I think the best arrangement would be (parameters, draws, chains). Operations on chains are usually done in parallel, e.g. sampling a different chain on each core, so it's not necessary to have chains located close to each other in memory. Parameters are sampled together, so they should be located close in memory so that every time one parameter is written to memory, the next parameter can be written to memory pretty easily. Iterations should be located somewhat close to each other, but aren't always accessed together the same way that parameters from a single iteration usually are. Please correct me if I'm wrong, I'm not a computer scientist.

ParadaCarleton · 2021-06-23T02:01:33Z

> No, it's not supported by ess_rhat. All functions either work with vectors or arrays of shape (draws, chains, params), but not both.

I believe ess_rhat currently uses (draws, params, chains). (An arrangement I find extremely counterintuitive, since chains and draws aren't together.)

devmotion · 2021-06-23T03:21:34Z

> > No, it's not supported by ess_rhat. All functions either work with vectors or arrays of shape (draws, chains, params), but not both.
>
> I believe ess_rhat currently uses (draws, params, chains). (An arrangement I find extremely counterintuitive, since chains and draws aren't together.)

Ah yes, this is what I wanted to write and what is mentioned in the documentation. The reason for this layout is that it is the one used in MCMCChains.Chains. However, I don't know what motivated the design choice there, this was decided before I got involved in MCMCChains. Maybe @cpfiffer knows why it was preferred over (parameters, draws, chains)?

cpfiffer · 2021-06-23T14:27:21Z

I think this was just a holdover from Mamba -- it was really something I hadn't considered.

sethaxen · 2021-06-23T14:47:34Z

There are two different ways to think about this. One is to reason about what permutation users are likely to pass (which leads to reasoning about what a sensible ordering in memory would be). However, a given PPL may not deliver that ordering (e.g. MCMCChains and IIRC SampleChains). The other is to think about the permutation that would be most efficient for a given function. However, different functions might prefer different permutations, and we should be consistent.

Ideally these two would converge, but I don't know if they do.

ParadaCarleton · 2021-06-23T16:28:32Z

I think any differences in terms of actual speed/efficiency are probably pretty small -- writing MCMC samples to memory is not going to be the bottleneck for something like HMC under any reasonable set of circumstances. Because of that, I think choices of index orderings should probably be based on what users are most likely to consider intuitive/reasonable orderings. In that case, I think (params, draws, chains) and (chains, draws, params) are the most intuitive, since they go from more specific to more general, or from more general to more specific. (Multiple parameters are contained in a single draw, and several draws are contained in a single chain.) The former has the advantage that it's easier to leave off indices for chains when users are only sampling from a single chain, so I propose we go with that unless anyone has any strong objections.

ParadaCarleton · 2021-06-26T01:27:45Z

Does anybody object, or should I make a pull request reordering axes like this? I've written some PSIS code that assumes this ordering for ess_rhat, and would like to know whether I should rearrange the PSIS code or the ordering for ess_rhat.

ParadaCarleton · 2021-06-30T18:40:05Z

@sethaxen @devmotion I can create a PR reordering these axes, and implementing a consistent interface that works with arrays following this standard.

In cases where the interface isn't consistent, there's usually a fix that will make the function easier to work with from a user perspective. For instance, if a function only accepts one chain, then calling it on an array of multiple chains should return a vector with the results of applying it to each individual chain. (We can, of course, keep the original function for users who want to only call it on a single chain.) The goal should be to provide a polished, ArviZ-like interface that "Just works," and lets the user pass a single object to each function, rather than having to figure out how they need to permute, cut up, or use eachslice on their arrays to get the diagnostic they're looking for. This has already been done for a handful of functions, but not all of them.

devmotion · 2021-06-30T19:22:17Z

In my opinion, we should not "unify" functions that operate on single chains by moving them to an interface that works on 3d arrays of multiple chains with multiple parameters. It just makes it more complicated for downstream packages that use a different layout to use these diagnostics (e.g. vectors of chains of possibly different length, chains based on StructArrays etc. - in particular the StructArray approach is a longstanding issue/idea that we want to explore as an alternative to MCMCChains.Chains) and I don't see any immediate advantages even for the 3d case - you can always apply the function on the different slices of the array and e.g. even in this case sometimes you might want to pool different chains.
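As a rough illustration of applying a single-chain function to slices of a 3d array, and of pooling chains instead, consider the sketch below. It uses plain Python lists in (draws, params, chains) order, with `fmean` standing in for any statistic that takes a vector of scalar draws. The helpers `chain_slice`, `per_chain`, and `pooled` are hypothetical names, not functions of this package.

```python
from statistics import fmean


def chain_slice(samples, param, chain):
    """Draws of one parameter in one chain from a (draws, params, chains) nested list."""
    return [draw[param][chain] for draw in samples]


def per_chain(stat, samples, param):
    """Apply a single-chain statistic to each chain separately."""
    n_chains = len(samples[0][param])
    return [stat(chain_slice(samples, param, c)) for c in range(n_chains)]


def pooled(stat, samples, param):
    """Apply the same statistic to the draws of all chains concatenated."""
    n_chains = len(samples[0][param])
    return stat([x for c in range(n_chains) for x in chain_slice(samples, param, c)])


# Hypothetical data: 4 draws, 2 params, 2 chains.
samples = [
    [[float(d + 10 * c + 100 * p) for c in range(2)] for p in range(2)]
    for d in range(4)
]

means_by_chain = per_chain(fmean, samples, 0)
pooled_mean = pooled(fmean, samples, 0)
```

The caller decides whether to slice or pool, so the statistic itself never needs to know about the 3d layout.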

In general, I think any changes of the input layouts or dimensions should be left for a 0.2 or even more distant release but not included in the initial 0.1 version since otherwise it will just be more difficult to replace the diagnostics in MCMCChains with this package and the transition will be less clean (it would require many additional changes in TuringLang/MCMCChains.jl#310). So unfortunately currently I would not approve any such PR that changes the input layout of any diagnostics but I think we should consider if we want to change the default permutation of 3d arrays at a later stage.

ParadaCarleton · 2021-06-30T22:08:57Z

Sure, we can handle this later if you want.

As for the other thing, I'm not suggesting that we replace the existing methods or get rid of them, just that we provide additional methods that handle everything for users who use the "default" arrangement, which I expect would be a three-dimensional array. Not every method needs to be more complicated, but every method should accept a 3d array as input (assuming it makes any kind of sense for it to accept that array). If someone wants to work with a matrix and a vector of chain indices, they can use the methods we already have without being bothered by the fact that we have another one that works on arrays. On the other hand, users who already have their data stored in an array shouldn't have to spend even more time cleaning their data than they already do. Figuring out how to e.g. disassemble an array and convert it into a matrix representation with a bunch of chain indices is going to be a pretty big waste of time for users; why not just have it work out of the box for them?
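For reference, the conversion being discussed (a 3d array into a matrix of size (draws, params) plus a vector of chain indices) can be sketched in a few lines. This assumes a (draws, params, chains) nested list and 1-based chain labels; the function name `to_matrix_and_chain_indices` is made up for illustration and is not part of this package.

```python
def to_matrix_and_chain_indices(samples):
    """Flatten a (draws, params, chains) nested list.

    Returns (rows, chain_indices), where each row holds all parameters of one
    draw and chain_indices[i] is the chain that rows[i] came from.
    """
    n_chains = len(samples[0][0])
    rows, chain_indices = [], []
    for c in range(n_chains):
        for draw in samples:
            rows.append([values[c] for values in draw])
            chain_indices.append(c + 1)  # 1-based labels, an arbitrary choice
    return rows, chain_indices


# Hypothetical data: 3 draws, 2 params, 2 chains.
samples = [
    [[d + 10 * p + 100 * c for c in range(2)] for p in range(2)]
    for d in range(3)
]

rows, chain_indices = to_matrix_and_chain_indices(samples)
```

A wrapper method that accepts a 3d array could do this conversion internally and then call the existing matrix-based method, which would leave the more general interface untouched.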

ParadaCarleton mentioned this issue Jul 3, 2021

Comparing Turing Performance with Different Permutations of MCMCChains Indices TuringLang/MCMCChains.jl#314

Closed

sethaxen mentioned this issue Aug 8, 2022

deprecating MCMCDiagnostics.jl in favor of this package #41

Closed

sethaxen mentioned this issue Aug 22, 2022

Changing the default dimension order arviz-devs/InferenceObjects.jl#8

Closed

sethaxen mentioned this issue Nov 18, 2022

Changes to dimension ordering #49

Closed
