Unexpected results for the mean of a DataFrame of ufloat from the uncertainties package. #14162
Comments
this is very much like #13446. Since pandas doesn't know that an Without a custom dtype, or special support baked into If someone wanted to contribute this functionality then that would be great. Conceptually this is very easy, but there are lots of implementation details. |
@jreback Do I understand correctly that there is nothing that the uncertainties module can do to solve this issue? |
I have no idea |
A useful first step would be to see if you can reproduce the issue with |
@shoyer No issue with
import pandas as pd
import numpy as np
from uncertainties import unumpy

# Build a 3x4 object array of ufloat entries (nominal value, standard deviation)
value = np.arange(12).reshape(3, 4)
err = 0.01 * np.arange(12).reshape(3, 4) + 0.005
data = unumpy.uarray(value, err)
df = pd.DataFrame(data, index=['r1', 'r2', 'r3'], columns=['c1', 'c2', 'c3', 'c4'])

# Column means: sum / size through pandas, then NumPy's mean on the raw array
print(df.apply(lambda x: x.sum() / x.size).values, "\n")
print(data.mean(axis=0), "\n")
# Row means, computed the same two ways
print(df.T.apply(lambda x: x.sum() / x.size).values, "\n")
print(data.mean(axis=1)) |
@bgatessucks what is the type/dtype of |
|
And |
I just wanted to be sure that you're not using subclassing or something else like that. In any case, I think this is probably a pandas bug (but would need someone to work through/figure out). We should have a fallback implementation of |
@shoyer Sorry I had missed that:
|
For what it's worth, the same example as above works with a DataFrame initialized with a numpy array of
so I guess that pandas is able to correctly infer in this case that an array of |
Seen from the outside, it looks like in both cases Pandas decrees that the result of In contrast, as the original post shows, Pandas is more open on the data type of I think that it is desirable that since Pandas is able to |
Is there any news on this subject? Same problem here, with pandas version 1.0.1. |
I have the same issue with pandas version 1.0.3 |
Was this removed from the Someday milestone because it's more definitive than that now? I've just done a bunch of work to make |
We stopped using the "Someday" label entirely. I'm getting the same behavior on main as in the OP. Looks like the data is an object-type np.ndarray. As jreback said in 2016, this would need some special handling (probably in core.nanops). A PR would be welcome. Something like pint-pandas would probably be a better user experience than an object-dtype. |
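Until such handling exists, a user-side fallback is the sum-over-size approach already used earlier in this thread. The following is a minimal sketch under stated assumptions: `object_mean` is a hypothetical helper, not part of pandas, and `fractions.Fraction` stands in for `ufloat` only so that the snippet runs without the uncertainties package. It assumes the elements support `+` and division by an integer, and it does not skip missing values.

```python
from fractions import Fraction

import pandas as pd


def object_mean(df: pd.DataFrame, axis: int = 0) -> pd.Series:
    """Hypothetical helper (not pandas API): mean computed as sum / size.

    Only requires that the stored objects support addition and
    division by an int. Missing values are not handled.
    """
    totals = df.sum(axis=axis)   # element-wise Python addition on object dtype
    n = df.shape[axis]           # number of entries reduced along this axis
    return totals / n


# Fraction is a stand-in for any object-dtype numeric type such as ufloat.
df = pd.DataFrame(
    [[Fraction(1, 2), Fraction(1, 3)],
     [Fraction(3, 2), Fraction(2, 3)]],
    index=["r1", "r2"],
    columns=["c1", "c2"],
)

col_means = object_mean(df, axis=0)  # one mean per column
row_means = object_mean(df, axis=1)  # one mean per row
print(col_means)
print(row_means)
```

Because the helper never converts the entries to floats, the result keeps the element type of the input, which is the property the thread is asking `mean` to have.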
@topper-123 this might be closed by your reduce_wrap PR? |
Sorry for the slow reply, I had a big project before going on a family vacation (which will last until the end of this week). But yes, #52788 will allow extension arrays like pint-pandas to use |
Closed by #52788 |
Related to #6898.
I find it very convenient to use a DataFrame of ufloat from the uncertainties package. Each entry consists of (value, error) and could represent the result of Monte Carlo simulations or an experiment. At present, taking sums along both axes gives the expected result, but taking the mean does not.
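For reference, the mean one would expect for such (value, error) entries can be worked out by hand. The sketch below assumes the errors are independent standard deviations, so they add in quadrature. `mean_with_error` is a hypothetical helper written for illustration, and the sample values are made up. The uncertainties package does this bookkeeping itself, including correlations.

```python
import math


def mean_with_error(entries):
    """Mean of (value, error) pairs, assuming independent errors.

    Hypothetical helper for illustration only.
    """
    n = len(entries)
    mean_value = sum(v for v, _ in entries) / n
    # Independent errors add in quadrature; dividing the sum by n
    # scales the combined error by 1 / n.
    mean_error = math.sqrt(sum(e ** 2 for _, e in entries)) / n
    return mean_value, mean_error


# Illustrative column of three (value, error) entries
column = [(0.0, 0.005), (4.0, 0.045), (8.0, 0.085)]
mean_value, mean_error = mean_with_error(column)
print(mean_value, mean_error)
```

A correct `mean` on a column of independent ufloat entries should agree with this calculation.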
Examples: