API: remove the table keyword, replaced by fmt='s|t' #4645
cc @michaelaye, cc @Meteore, cc @bluefir since you guys have given comments recently... any thoughts on this API change?
Looks good to me!
…es are ``s|t`` the same defaults as prior < 0.13.0 remain, e.g. ``put`` implies 's' (Storer) format and ``append`` implies 't' (Table) format
Sorry for my silence, I was in Yellowstone completely offline! ;) I haven't used this functionality. What I am worried about in general is that knowledge of PyTables is becoming more and more of a requirement for using pandas properly, at least for the HDF functionality. One could argue that data people need to deal with it in any case, but I love pandas so much because it integrates many other Python libraries seamlessly. In this case I wouldn't know what a 'storer' really is, apart from its usage shown in the docs. Maybe my worries could be nullified by a helpful intro paragraph, unless that already exists and my 2 weeks' absence made me miss it.
See http://pandas.pydata.org/pandas-docs/dev/io.html#storer-format (if you can think of a better name than storer, let me know). You don't need knowledge of the internals, just read the docs for the various formats that you can store. Let me know if this is still unclear.
Sorry, was offline as well. The new convention seems a bit more cryptic, but I have no other objection, and no code depending on the old one (but I am planning on using the HDF IO very soon). The name 'storer' could be replaced by something more indicative of what it is as opposed to 'table' (which is also a storer in a wide sense), though admittedly that may involve mentioning some pytables|hdf5-specific lingo. Not yet landed home, just a first impression from a quick glance. Grain of salt.
Maybe 'fixed' (format) vs 'table'?
Reading a bit further, I definitely agree with the API change, because I feel there is no natural preference for one format over the other, something that could easily be presumed when a boolean is used as the switch.
I like your suggestion. I'm going to go with format=fixed(f) | table(t), and I'll change all the storer refs to fixed. I don't think we should add additional to_hdf methods (too much clutter), and I think it makes sense to have a default of format=fixed (which is the equivalent of table=False).
Really? I would have thought that most users expect the table to be appendable? Something along the credo of 'functionality before speed', so let the hardcore user who requires speed find out about the non-default setting?
The reason fixed is the default is just back compat (HDFStore originally started with just a fixed type). What about an option setting, e.g. io.hdf_format = fixed (but you can change the default to table)? Then to_hdf will respect a passed format but default to the option setting.
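The lookup order proposed here (an explicitly passed format wins, otherwise the option setting applies) can be sketched in plain Python. This is only an illustration, not pandas code: the `options` dict stands in for the proposed `io.hdf_format` setting, and `resolve_format` and `_ALIASES` are hypothetical names.

```python
# Hypothetical stand-in for the proposed io.hdf_format option (not pandas itself).
options = {"io.hdf_format": "fixed"}

# Short and long spellings discussed in the thread: fixed(f) | table(t).
_ALIASES = {"f": "fixed", "fixed": "fixed", "t": "table", "table": "table"}


def resolve_format(fmt=None):
    """Return 'fixed' or 'table': a passed format wins, else the option default."""
    if fmt is None:
        fmt = options["io.hdf_format"]
    try:
        return _ALIASES[fmt.lower()]
    except (AttributeError, KeyError):
        raise ValueError(f"invalid HDF format: {fmt!r}") from None
```

With the option left at `fixed`, `resolve_format()` gives `'fixed'` and `resolve_format("t")` gives `'table'`. After setting `options["io.hdf_format"] = "table"`, a bare call gives `'table'` while an explicit `resolve_format("fixed")` is still respected. pandas later shipped an option along these lines under the name `io.hdf.default_format`.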
I like that!
I also find the behavior of HDFStore confusing to understand. What happens
Tables are fundamentally different from fixed: they can be appended to and queried (via an expression); see put vs append. The default is for back compat. They are two different storage back ends: think hard disk vs tape (not a great analogy, because fixed is much faster). PyTables supports many different types of storage formats (because HDF5 does). The impetus for the format parameter in general is really to support a new table type at some point: ctable, or column-oriented tables. The user has to select a backend at creation time, and each has fundamentally different access patterns and perf characteristics, so they can/should be used in diverse situations. You basically pick the format depending on the problem.
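A short sketch of the difference described above, using the `format="fixed"` / `format="table"` spelling that the thread settled on. It assumes a pandas release that accepts that keyword and that the optional PyTables package is installed; `demo_formats` and the key names are made up for illustration.

```python
import pandas as pd  # HDF5 I/O additionally needs the optional PyTables package


def demo_formats(path):
    """Write one frame in each format and report what each node allows."""
    df = pd.DataFrame({"A": [1.0, 2.0, 3.0, 4.0]})

    with pd.HDFStore(path, mode="w") as store:
        store.put("fixed_df", df, format="fixed")     # same as put's default
        store.append("table_df", df, format="table")  # append writes a table

        # Tables can be appended to and queried with an expression.
        store.append("table_df", df)
        subset = store.select("table_df", where="index >= 2")

        # A fixed node cannot be appended to; pandas raises instead.
        try:
            store.append("fixed_df", df)
            fixed_appendable = True
        except (TypeError, ValueError):
            fixed_appendable = False

        return {
            "fixed_is_table": store.get_storer("fixed_df").is_table,
            "table_is_table": store.get_storer("table_df").is_table,
            "rows_selected": len(subset),
            "fixed_appendable": fixed_appendable,
        }
```

Calling `demo_formats("demo.h5")` writes both nodes to one file. The returned dict records that only the table node accepted the second append and the `where` query, which is the trade-off the comment describes: fixed is the faster write-once format, table is the appendable, queryable one.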
The ``fmt`` keyword now replaces the ``table`` keyword; allowed values are ``s|t``.
The same defaults as prior < 0.13.0 remain, e.g. ``put`` implies 's' (Storer) format and ``append`` implies 't' (Table) format.
Closes #4584 as well.