Entropy calculation with non-probability vectors #769
What about adding a …

Yes, we should probably check that the sum is one, with a …
I came across this when calculating the entropy of single values (like `entropy(8, 2) = -24.0` bit, which is nonsense). Currently my uncertainty model includes certainty and robust bounds, like `Union{Real, Interval, UnivariateDistribution}`. For me a multiple dispatch fix works. A better way is …
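(A minimal sketch, assumed for illustration rather than the commenter's original code, of what a dispatch-restricted `entropy` could look like; `xlogx` is defined locally so the snippet is self-contained:)

```julia
# x*log(x) with the convention 0*log(0) == 0, as used by the entropy formula.
xlogx(x::Real) = iszero(x) ? zero(x) : x * log(x)

# Restrict the argument to real-valued vectors, so scalar calls such as
# entropy(8, 2) raise a MethodError instead of returning a meaningless number.
entropy(p::AbstractVector{<:Real}) = -sum(xlogx, p)
entropy(p::AbstractVector{<:Real}, b::Real) = entropy(p) / log(b)
```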
For probability vectors this changes nothing. At least when existing uses are consistent with documentation, this would be non-breaking.
Ok, it's not that easy after playing with the tests.
I wouldn't normalize by default anyway, as this could hide bugs (just like the current situation). Better to require users to be explicit.
So what would people think about

```julia
isprobvec(p::AbstractVector{<:Real}) = all(x -> x ≥ zero(x), p) && isapprox(sum(p), one(eltype(p)))

function entropy(p::AbstractVector{<:Real}; check::Bool = true)
    check && (isprobvec(p) || throw(ArgumentError("Not a proper probability distribution")))
    return -sum(xlogx, p)
end

entropy(p, b::Real; check::Bool = true) = entropy(p; check) / log(b)
```

If there's a thumbs up, I'd create a PR. This would yield …
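(For illustration, since the original example output is not shown here, behaviour under this proposal would be roughly:)

```julia
entropy([0.5, 0.5], 2)           # a proper probability vector: unchanged, returns 1.0
entropy([8, 2])                  # throws ArgumentError("Not a proper probability distribution")
entropy([8, 2]; check = false)   # explicit opt-out keeps the old unchecked behaviour, ≈ -18.02
```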
As you see I've taken …
I don't know whether the `-0.0` here

```julia
julia> entropy([0,1])
-0.0
```

is usually just ignored, but I guess it would be more proper if we instead did `return copysign(sum(xlogx, p), 0)`. As `sum(xlogx, p)` has to be non-positive for a probability vector anyway, it'll only affect the `-0.0` case, where `0.0` is returned instead?
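A quick check of that sign behaviour (illustrative only; `xlogx` is defined locally to keep it self-contained):

```julia
xlogx(x) = iszero(x) ? zero(x) : x * log(x)   # x*log(x) with 0*log(0) == 0

p = [0, 1]
-sum(xlogx, p)                # -0.0: negating a positive zero
copysign(sum(xlogx, p), 0)    # 0.0: the sign of zero is dropped

q = [0.5, 0.5]
copysign(sum(xlogx, q), 0)    # ≈ 0.6931, identical to -sum(xlogx, q) for any proper q
```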
On the argument constraint:

```julia
julia> entropy([1,1+im]) # complex probabilities
0.43882457311747564 - 1.131971753677421im

julia> entropy(rand(2,2)) # matrix input
1.0956946888140768
```

Not even in quantum mechanics are complex probabilities a thing afaik, and I'd usually also not interpret a matrix as a probability distribution, but maybe people want to calculate the entropy of a joint probability mass function? So maybe relaxing it to …
There is no problem imho returning …

For matrices it's common to interpret rows or cols as probabilities (e.g. as the transition matrix of a Markov chain). There are definitions of row/right, col/left and doubly stochastic matrices, meaning the rows, the cols, or both each sum up to one (i.e. can be interpreted as probabilities).

Joint entropy, mutual information, and so on are better defined separately.
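A small illustration of those notions (hedged sketch, reusing the `isprobvec` from the proposal above):

```julia
# Row-/column-/doubly stochastic checks in terms of isprobvec from the proposal above.
is_row_stochastic(A::AbstractMatrix{<:Real})    = all(isprobvec(r) for r in eachrow(A))
is_col_stochastic(A::AbstractMatrix{<:Real})    = all(isprobvec(c) for c in eachcol(A))
is_doubly_stochastic(A::AbstractMatrix{<:Real}) = is_row_stochastic(A) && is_col_stochastic(A)

# Transition matrix of a two-state Markov chain: each row sums to one.
P = [0.9 0.1;
     0.5 0.5]
is_row_stochastic(P)      # true, so each row can be read as a probability vector
is_doubly_stochastic(P)   # false, the columns sum to 1.4 and 0.6
```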
Complex probabilities may be a thing: "complex probability and information theory". Entropy extensions are in Sec. 8, but I don't get it.
I agree, the entropy calculation here easily returns something other than what people would expect if we allowed matrices, because, as you say, you may interpret each row as a probability vector, or all entries together. To avoid this, and to force the user to be more explicit, I'd then suggest constraining the types here to vectors, so that you'd either need to do …
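For illustration (an assumed example, not necessarily the two calls the commenter had in mind), a vector-only `entropy` would force the caller to spell out the intended reading of a matrix:

```julia
# A 2×2 joint probability mass function: all four entries together sum to one.
Pxy = [0.4 0.1;
       0.2 0.3]
entropy(vec(Pxy))                   # explicit: all entries as one joint distribution

# A row-stochastic matrix: each row is its own distribution.
P = [0.7 0.3;
     0.5 0.5]
[entropy(r) for r in eachrow(P)]    # explicit: one entropy per row

# entropy(Pxy) or entropy(P) alone would be a MethodError under a vector-only signature.
```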
The current (v0.33.16) implementation of `entropy(p)` (StatsBase.jl/src/scalarstats.jl, lines 760 to 768 at 071d10a) returns weird results if the argument isn't a probability-like vector (i.e. `sum(p) != 1`).
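A hedged reproduction of the kind of surprising result meant here (current behaviour as described in the discussion above):

```julia
using StatsBase

entropy([0.5, 0.5], 2)   # 1.0, sensible for a probability vector
entropy(8, 2)            # reported above as -24.0 bit: a scalar is silently accepted
entropy([8, 2])          # ≈ -18.02: sum(p) != 1, so the value is not an entropy either
```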