ENH Warn about panel.to_frame() discarding NaN GH7879 #7970

Closed


m-novikov

Also, the warnings module was being called without having been imported; see pandas/core/panel.py lines 718 and 748, for example.

closes #7879

@@ -858,6 +859,10 @@ def to_frame(self, filter_observations=True):
mask = com.notnull(self.values).all(axis=0)
# size = mask.sum()
selector = mask.ravel()
if not all(selector):
Member


Use not selector.all(). There's a lot of overhead with the Python builtin all for ndarrays, because the builtin is very generic and works on anything with a next method, pulling out one element at a time. For an ndarray of size 10,000,000, ndarray.all() is about 60x faster:

In [1]: n = 1e7

In [2]: x = randn(n)

In [3]: all(x)
Out[3]: True

In [4]: x.all()
Out[4]: True

In [5]: timeit all(x)
1 loops, best of 3: 658 ms per loop

In [6]: timeit x.all()
100 loops, best of 3: 11.3 ms per loop

In [7]: 658/11.3
Out[7]: 58.23008849557522
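The comparison above can be reproduced as a standalone script. This is a minimal sketch, assuming NumPy is installed; the array size is deliberately smaller than in the session above so it runs quickly, and the names `selector`, `builtin_time`, and `method_time` are illustrative only. Absolute timings will vary by machine, so no ratio is asserted.

```python
import timeit

import numpy as np

# A boolean mask like the `selector` in the diff (all True here).
selector = np.ones(100_000, dtype=bool)

# Both spellings give the same answer...
builtin_result = all(selector)           # iterates element by element in Python
method_result = bool(selector.all())     # single vectorised reduction in NumPy

# ...but they do very different amounts of work.
builtin_time = timeit.timeit(lambda: all(selector), number=5)
method_time = timeit.timeit(lambda: selector.all(), number=5)

print("builtin all():", builtin_time)
print("ndarray.all():", method_time)
```

Both forms also short-circuit to False as soon as a False element is present, so the change affects speed only, not the result.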

@cpcloud
Member

cpcloud commented Aug 9, 2014

@m-novikov couple of minor comments here, otherwise looks good

@m-novikov
Author

Fixed according to your comments. I also added a warning-suppression statement to pandas/io/tests/test_pytables.py, since the warning shows up during the test run.

result = wp.to_frame()

with tm.assert_produces_warning(UserWarning):
setattr(panelm, '__warningregistry__', {})
Contributor


what does this setattr line do?

Author


It resets the module's warning registry in case this warning has already been raised somewhere else, so that I can be sure to catch it.

@jreback
Contributor

jreback commented Aug 9, 2014

If you take those extra lines out (the filtering in setup and the setattr), what happens? I am puzzled why they are in here in the first place.

That is the point of assert_produces_warning: it does the filtering necessary to assert the warning.
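To illustrate what "does the filtering necessary" can look like, here is a minimal standard-library sketch of a helper in the same spirit. It is not pandas' implementation of assert_produces_warning; `expect_warning` and `drop_incomplete` are hypothetical names invented for this example.

```python
import warnings
from contextlib import contextmanager


@contextmanager
def expect_warning(category):
    """Hypothetical helper: fail unless the block emits a `category` warning."""
    with warnings.catch_warnings(record=True) as caught:
        # "always" bypasses the once-per-location rule inside this block,
        # so earlier emissions elsewhere cannot hide the warning.
        warnings.simplefilter("always")
        yield caught
    if not any(issubclass(w.category, category) for w in caught):
        raise AssertionError("no %s was raised" % category.__name__)


def drop_incomplete(rows):
    """Hypothetical stand-in for a conversion that silently drops data."""
    kept = [row for row in rows if None not in row]
    if len(kept) != len(rows):
        warnings.warn("rows with missing values were dropped", UserWarning)
    return kept


with expect_warning(UserWarning) as caught:
    result = drop_incomplete([(1, 2), (3, None)])

print(result, len(caught))
```

Because the filter is installed inside the context manager and restored on exit, the test body needs no setup-level filterwarnings call of its own.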

@jreback
Contributor

jreback commented Aug 9, 2014

More to the point is the rationale behind this PR:
you are warning on a valid option.

That seems a little odd, no?

@cpcloud
Member

cpcloud commented Aug 9, 2014

i think this warning is ok, it's a bit of a surprise to throw away data by default.

@jreback
Contributor

jreback commented Aug 9, 2014

why not just change the default then?

@cpcloud
Member

cpcloud commented Aug 9, 2014

well that would be nice but would break existing code

@m-novikov
Author

If I remove
warnings.filterwarnings(action='ignore', category=UserWarning)
then this warning will be shown during the test run.
If I remove
setattr, then the test will depend on the order in which tests are run: by default a warning is raised only once, so if it was already raised in some other test module I will not catch it.
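The once-per-location behaviour described here can be seen with the standard library alone. The sketch below uses a hypothetical function `lossy_convert` and compares the "default" filter action (one report per call site) with "always" (every emission reported); it illustrates the mechanism only and is not taken from the pandas test suite.

```python
import warnings


def lossy_convert():
    # Hypothetical stand-in for a function that warns when it drops data.
    warnings.warn("rows containing NaN were dropped", UserWarning)


# With the "default" action, a given warning from a given line is shown once.
with warnings.catch_warnings(record=True) as first:
    warnings.simplefilter("default")
    lossy_convert()
    lossy_convert()
once = len(first)

# With "always", each emission is reported, regardless of earlier ones.
with warnings.catch_warnings(record=True) as second:
    warnings.simplefilter("always")
    lossy_convert()
    lossy_convert()
always = len(second)

print(once, always)
```

A test that must observe the warning therefore needs either an "always" filter for its duration or a reset of the emitting module's registry, which is what the setattr line was attempting.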

@jreback
Contributor

jreback commented Aug 9, 2014

I say just change the default as an API change

or do nothing

@cpcloud
Member

cpcloud commented Aug 9, 2014

i'd be ok with that, maybe a nice warning in the docs that says "you have to call dropna now"

@m-novikov
Author

Removed warnings suppression

@jreback
Contributor

jreback commented Aug 9, 2014

@m-novikov if u change the result does anything break in the test suite (aside from the actual test for this)

@m-novikov
Author

@jreback If I leave the warning in, nothing breaks; an occasional message is just displayed in some tests. The same holds without the suppression, but as was stated, that's not the expected behaviour.
setattr(panelm, '__warningregistry__', {}) breaks nothing, but warnings from this module may be displayed twice during a test suite run. Possibly I should store the state of the registry and restore it after the test.

@jreback
Contributor

jreback commented Aug 9, 2014

No, I mean if you change the default to False
instead of warning.

@m-novikov
Author

@jreback There is SparsePanel.to_frame(), which doesn't support filter_observations=False; if the signature is changed only for the Panel class, the two will be inconsistent.
It also breaks a few methods, Panel.count for example.

@jreback jreback added the API Design, Reshaping, Prio-high, and Missing-data labels and removed the Reshaping label Jan 25, 2015
@jreback jreback added this to the 0.16.0 milestone Jan 25, 2015
@jreback
Contributor

jreback commented Jan 25, 2015

@m-novikov can you put in a short doc note? Also, let's change this to a FutureWarning, then change the default in 0.17.0.
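One common way to stage such a change is to warn only when the caller relies on the default and data is actually dropped. The following is a rough standard-library sketch of that pattern; `to_rows`, `_UNSET`, and the list-of-tuples data model are hypothetical and stand in for Panel.to_frame, whose real code is not reproduced here.

```python
import math
import warnings

_UNSET = object()  # sentinel: lets us tell "not passed" apart from True/False


def to_rows(table, filter_observations=_UNSET):
    """Hypothetical to_frame-like conversion over a list of float tuples."""
    relied_on_default = filter_observations is _UNSET
    if relied_on_default:
        filter_observations = True  # current default, kept for now

    if not filter_observations:
        return list(table)

    kept = [row for row in table if not any(math.isnan(v) for v in row)]
    if relied_on_default and len(kept) != len(table):
        warnings.warn(
            "rows containing NaN were dropped; the default for "
            "filter_observations will change to False in a future release",
            FutureWarning,
            stacklevel=2,  # attribute the warning to the caller
        )
    return kept


data = [(1.0, 2.0), (3.0, float("nan"))]
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    implicit = to_rows(data)                               # warns
    explicit = to_rows(data, filter_observations=True)     # silent
    unfiltered = to_rows(data, filter_observations=False)  # silent, keeps NaN row

print(len(caught), implicit, len(unfiltered))
```

Callers who pass the argument explicitly never see the warning, so existing code can opt in to either behaviour before the default flips.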

@jreback jreback modified the milestones: 0.16.0, Next Major Release Mar 3, 2015
@jreback jreback modified the milestones: Next Major Release, 0.16.0 Mar 3, 2015
@jreback jreback modified the milestones: 0.17.0, Next Major Release Apr 8, 2015
@m-novikov m-novikov force-pushed the warn-about-to_frame-filtering branch from 6ba5539 to 7d7a805 Compare April 18, 2015 06:32
@m-novikov
Author

@jreback updated

@jreback
Contributor

jreback commented Jul 3, 2015

I would like to repurpose this and simply change the default here (no need to warn then). Can you update?

@m-novikov
Author

Yes. This means all the related tests have to be fixed too; I will get to it.

@jreback
Contributor

jreback commented Jul 6, 2015

@m-novikov gr8! lmk

@jreback
Contributor

jreback commented Aug 15, 2015

I think we can simply change this for 0.17.0.

Going to close this, but please reopen if you want to address that.

@jreback jreback closed this Aug 15, 2015
Labels
API Design, Missing-data, Reshaping
Development

Successfully merging this pull request may close these issues.

panel.to_frame() discards nan by default
3 participants