weights in multinomial #142

oml/src/lib/stats/sampling.ml, line 64:

a nitpick: the weights vector is summed up and tested for summing to 1. if you're gonna sum it up anyway, why not allow arbitrary positive weights and normalize the weight vector?
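For concreteness, here is a rough sketch of the alternative being proposed (hypothetical code, not the actual sampling.ml implementation, and `draw_index` is an invented name): since the weight sum is computed anyway, the sampler can draw a uniform value in [0, sum) and walk the cumulative weights, rather than requiring the sum to equal 1 up front.

```ocaml
(* Hypothetical sketch, not the actual sampling.ml code: the sum is
   computed anyway, so draw a uniform in [0, sum) and walk the
   cumulative weights, instead of rejecting inputs whose sum <> 1. *)
let draw_index weights =
  let sum = Array.fold_left (+.) 0.0 weights in
  if sum <= 0.0 then invalid_arg "draw_index: weights must have a positive sum";
  let u = Random.float sum in
  let n = Array.length weights in
  let rec walk i acc =
    let acc = acc +. weights.(i) in
    if u < acc || i = n - 1 then i else walk (i + 1) acc
  in
  walk 0 0.0
```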
I think that you're raising a good point, but in this case the weight array represents the probabilities of a multinomial distribution, so I think the correct 'contract' for this method to enforce would be to check that the weights are between 0 and 1 (and sum to 1). For context, I've been thinking a lot recently about making sure that the type interface is as accurate as possible to the underlying mathematical representation (without going "full" Haskell on the library). How can we make this clearer?
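One minimal sketch of what enforcing that 'contract' at the type boundary could look like, using a smart constructor over a private array (all names here are invented, not oml's actual API):

```ocaml
(* Minimal sketch of encoding the multinomial contract in the types:
   a private array that can only be built through a checking
   constructor. Hypothetical names, not the library's API. *)
module Weights : sig
  type t = private float array
  val create : float array -> (t, string) result
end = struct
  type t = float array
  let create a =
    if Array.exists (fun w -> w < 0.0 || w > 1.0) a then
      Error "each weight must lie in [0, 1]"
    else
      let sum = Array.fold_left (+.) 0.0 a in
      if abs_float (sum -. 1.0) > 1e-9 then Error "weights must sum to 1"
      else Ok a
end
```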
hmm, i do see your point about enforcing the input to be the type it naturally should be. i was thinking more about efficiency: right now the client has to generate the weights and, presumably, normalize them to 1 to comply with the interface. then the function does the same computation again. waste of effort. i guess one could invent a …
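The alternative being pointed at here could look like the following sketch: accept arbitrary non-negative weights and pay for the normalization exactly once, inside the library (hypothetical code, not any actual branch implementation).

```ocaml
(* Sketch: normalize arbitrary non-negative weights once, instead of
   making the caller pre-normalize and then re-checking the sum. *)
let normalize weights =
  let sum = Array.fold_left (+.) 0.0 weights in
  if sum <= 0.0 then invalid_arg "normalize: weights must have a positive sum";
  Array.map (fun w -> w /. sum) weights
```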
Those are good points, let me ruminate on a solution. I think that a …
@nilsbecker Check out the prob_module branch. Specifically, there is now a method … There is probably a bit more work of adding methods to the main …
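The specific method names above were lost in extraction, so the following is only a guess at the general shape such a branch might take; every identifier below is invented:

```ocaml
(* Purely illustrative guess at the shape of a probability module:
   an abstract, always-normalized vector type plus a constructor that
   does the normalizing, so samplers can trust the invariant. *)
module type PROBABILITIES = sig
  type t
  val normalize : float array -> t          (* arbitrary positive weights in *)
  val to_array : t -> float array           (* normalized view out *)
  val multinomial : t -> n:int -> int array (* n draws from the distribution *)
end
```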
@struktured Any thoughts?
I have thoughts, but only somewhat cohesive ones. Maybe you'll find some of it useful.

I like the idea of a probability module. I've implemented a few over the years in other languages, although I'd hesitate to say I've ever seen a perfect version of one in code yet. My favorite representation is typically …

I understand why making it polymorphic could be troublesome for consuming APIs, but I really do enjoy a strong coupling between datatypes and probabilities. It's also very human readable. I'd rather see a …

Sorted by likelihood is probably too expensive for large arrays, but maybe a convenience function to copy/transform the existing weight vector into a sorted structure when the user needs it would be helpful.

Also, another thing to consider as I write this: what you defined is a discrete multinomial probability vector, but a probability itself could be some sort of density function, if we want to be pedantic about it. Maybe probability should be an abstract sig for which multinomial is one such implementation?

Preserving the scale value means keeping the weights always normalized, but maintaining one extra constant to rescale the weight vector into frequency counts. The pro of this is that the vector is always normalized and thus cheap to query. The con is that updating the distribution is more expensive (e.g. rescale to frequencies, add the new frequency counts, unscale) and capable of numerical instability.

And this leads to a final thought: should we have different "views" of a distribution, with types to indicate this? You could think of one representation lending itself well to mutation: imagine a distribution being repeatedly updated every time some event is fired, but you want to take a snapshot of the distribution at some time slice and normalize it. E.g. you query a "distribution" for a (normalized?) "probability": the distribution is the unnormalized entity and the probability is a normalized view.

Regarding map2/fold2, I also really think filter and filter_map are quite useful for probability vectors (in the normalized version, filter should maintain a sum of 1.0, of course...).

Side note: @rleonid Great talk at Compose on Thursday!
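A minimal sketch of the "views" idea above, assuming a mutable, unnormalized count table that is cheap to update in place, plus a normalized snapshot taken on demand (all names hypothetical):

```ocaml
(* Sketch of the two "views": raw mutable counts vs. a normalized
   snapshot. Hypothetical names, not oml's API. *)
module Counts = struct
  type t = { counts : float array; mutable total : float }

  let create n = { counts = Array.make n 0.0; total = 0.0 }

  (* Cheap update path: bump a raw frequency count. *)
  let observe t i =
    t.counts.(i) <- t.counts.(i) +. 1.0;
    t.total <- t.total +. 1.0

  (* Normalized "probability" view: a snapshot, not the live state. *)
  let snapshot t =
    if t.total <= 0.0 then invalid_arg "snapshot: no observations yet";
    Array.map (fun c -> c /. t.total) t.counts
end
```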
We might be talking about 2 different modules. I am proposing that …

Some of the issues that this raises: …
So one of my points was: do you want probability to be (scaleValue * float array) where sum(float array) = 1, or an unnormalized float array? Or just a normalized float array? What is the value of having it unnormalized by default? What types of algorithms will benefit from this? In order to sample, you would at least need a scale factor to avoid doing a full pass on it. As I noted before, one advantage of unnormalized is that it's easier to update the distribution when getting new observations. But if that's not the case here, it makes for a stronger case to normalize it.
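Writing the three candidate representations out as OCaml types may make the trade-off easier to see (illustrative only, not proposed library types):

```ocaml
(* - [normalized]: the array itself sums to 1, cheap to query.
   - [scaled]: a normalized array plus a scale factor such that
     [scale *. probs.(i)] recovers the raw frequency count.
   - [unnormalized]: raw positive weights; normalize, or carry the
     sum, at query/sampling time. *)
type normalized = float array
type scaled = { scale : float; probs : normalized }
type unnormalized = float array
```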
I'm imagining just a normalized …

We would have an analogous type … Lastly, we should add a "non-parametric" distribution data structure to accommodate polymorphic "pdf"s.
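A hedged sketch of that "non-parametric" idea: a discrete distribution over arbitrary outcomes 'a, making the "pdf" polymorphic in the outcome type. The representation and names below are invented for illustration.

```ocaml
(* Sketch: a discrete distribution over arbitrary values 'a, so the
   "pdf" is polymorphic in the outcome type. Weights assumed to sum
   to 1. Hypothetical, not oml's API. *)
type 'a discrete = ('a * float) array

let pdf (d : 'a discrete) (x : 'a) : float =
  Array.fold_left (fun acc (y, p) -> if y = x then acc +. p else acc) 0.0 d

(* Usage: a biased coin over a plain variant type. *)
type coin = Heads | Tails
let biased : coin discrete = [| (Heads, 0.7); (Tails, 0.3) |]
let () = assert (abs_float (pdf biased Heads -. 0.7) < 1e-12)
```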