ENH: add gzip/bz2 compression to relevant read_* methods #15644
Comments
most important is json
Stata is not compressed but is just a fairly plain binary file format. That said, I don't think there is much of a reason to add compression methods, since the output file wouldn't be usable in Stata (presumably the reason to output in this format) without manual decompression.
IIRC
In this case, it looks like
@gfairchild want to take a stab at this? Should be fairly straightforward, as you can pretty much reuse the existing infrastructure (mainly just passing the compression arg through). This is really just a couple of tests as well.
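For context, a minimal sketch of what the user-facing behavior could look like once the keyword is passed through. The `compression` argument for `read_json` is the change proposed here (mirroring `read_pickle`/`read_csv`), not something guaranteed to exist yet, and the file name is hypothetical:

```python
import gzip
import pandas as pd

# Proposed usage once read_json accepts a compression keyword
# (mirroring read_csv / read_pickle). This keyword is the change
# requested in this issue, not an existing guarantee.
df = pd.read_json("records.json.gz", compression="gzip")

# Current workaround: decompress explicitly and hand read_json a
# file-like object, which it already accepts.
with gzip.open("records.json.gz", "rt") as fh:
    df = pd.read_json(fh)
```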
I'd be happy to. Just got to find the time. Maybe I can do it this weekend. |
This issue is a branch off of #11666, which implemented compression support for `read_pickle`. There are still a few other `read_*` methods that could possibly benefit from compression support. Looking at the I/O API reference, these jump out at me:

- `read_json` - This can definitely benefit from compression. I've stored very large gzipped JSON files before. As a general rule, any `read_*` method that supports any kind of plaintext format should support compression.
- `read_stata` - I don't use Stata, but it looks like a .dta file is not a plaintext file. Is it naturally compressed, or can it be compressed significantly like pickles?
- `read_sas` - I've also never used SAS, and like Stata's .dta files, it looks like .xpt and .sas7bdat files are both in some binary format. Can they be compressed well? (A quick empirical check is sketched below.)