Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

API: Relax is-file-like conditions #16150

Merged
merged 1 commit into from
Apr 27, 2017

Conversation

gfyoung
Copy link
Member

@gfyoung gfyoung commented Apr 26, 2017

Previously, we were requiring that all file-like objects had "read," "write," "seek," and "tell" methods,
but that was too strict (e.g. read-only buffers). This commit relaxes those requirements to having EITHER "read" or "write" as attributes.

Closes #16135.

@codecov
Copy link

codecov bot commented Apr 26, 2017

Codecov Report

Merging #16150 into master will increase coverage by <.01%.
The diff coverage is 100%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master   #16150      +/-   ##
==========================================
+ Coverage   90.83%   90.84%   +<.01%     
==========================================
  Files         159      159              
  Lines       50796    50794       -2     
==========================================
- Hits        46143    46142       -1     
+ Misses       4653     4652       -1
Flag Coverage Δ
#multiple 88.62% <100%> (ø) ⬆️
#single 40.31% <50%> (-0.01%) ⬇️
Impacted Files Coverage Δ
pandas/core/dtypes/inference.py 98.33% <100%> (-0.06%) ⬇️
pandas/core/common.py 91.03% <0%> (+0.34%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 3b80ed3...57bc69d. Read the comment docs.

@codecov
Copy link

codecov bot commented Apr 26, 2017

Codecov Report

Merging #16150 into master will increase coverage by <.01%.
The diff coverage is 100%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master   #16150      +/-   ##
==========================================
+ Coverage   90.83%   90.84%   +<.01%     
==========================================
  Files         159      159              
  Lines       50796    50794       -2     
==========================================
- Hits        46143    46142       -1     
+ Misses       4653     4652       -1
Flag Coverage Δ
#multiple 88.62% <100%> (ø) ⬆️
#single 40.31% <50%> (-0.01%) ⬇️
Impacted Files Coverage Δ
pandas/core/dtypes/inference.py 98.33% <100%> (-0.06%) ⬇️
pandas/core/common.py 91.03% <0%> (+0.34%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 3b80ed3...57bc69d. Read the comment docs.

@codecov
Copy link

codecov bot commented Apr 26, 2017

Codecov Report

Merging #16150 into master will increase coverage by 0.02%.
The diff coverage is 100%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master   #16150      +/-   ##
==========================================
+ Coverage   90.85%   90.88%   +0.02%     
==========================================
  Files         159      159              
  Lines       50785    50783       -2     
==========================================
+ Hits        46141    46152      +11     
+ Misses       4644     4631      -13
Flag Coverage Δ
#multiple 88.66% <100%> (+0.02%) ⬆️
#single 40.32% <50%> (-0.01%) ⬇️
Impacted Files Coverage Δ
pandas/core/dtypes/inference.py 98.33% <100%> (-0.06%) ⬇️
pandas/core/indexes/datetimes.py 95.43% <0%> (+0.09%) ⬆️
pandas/plotting/_converter.py 65.35% <0%> (+1.81%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 2d9909c...d2efe18. Read the comment docs.

@jreback
Copy link
Contributor

jreback commented Apr 27, 2017

my_s3_object = boto3.client('s3').get_object(Bucket=<S3 bucket>, Key=<S3 path>) pd.read_csv(my_s3_object['Body'])

should work with one of the s3 buckets (.e.g. s3://pandas-test/tips.csv), can you add a test as well (should fail prior to this PR, and work with).

@jreback jreback added the IO CSV read_csv, to_csv label Apr 27, 2017
@jreback jreback added this to the 0.20.0 milestone Apr 27, 2017
@gfyoung
Copy link
Member Author

gfyoung commented Apr 27, 2017

@jreback : Just to make sure I got it right, it's:

Bucket = "pandas-test"
Key = "pandas-test/tips.csv" (or is it just "/tips.csv", don't remember)

Also, do we install boto3 during testing? I don't see it explicitly in the codebase.

@jreback
Copy link
Contributor

jreback commented Apr 27, 2017

we already have s3 tests just not streaming ones.

(pandas) bash-3.2$ grep -R s3 pandas/tests/io/parser/
Binary file pandas/tests/io/parser//__pycache__/test_network.cpython-36-PYTEST.pyc matches
pandas/tests/io/parser//test_network.py:            import s3fs  # noqa
pandas/tests/io/parser//test_network.py:            pytest.skip("s3fs not installed")
pandas/tests/io/parser//test_network.py:    def test_parse_public_s3_bucket(self):
pandas/tests/io/parser//test_network.py:            df = read_csv('s3://pandas-test/tips.csv' +
pandas/tests/io/parser//test_network.py:        df = read_csv('s3://cant_get_it/tips.csv')
pandas/tests/io/parser//test_network.py:    def test_parse_public_s3n_bucket(self):
pandas/tests/io/parser//test_network.py:        # Read from AWS s3 as "s3n" URL
pandas/tests/io/parser//test_network.py:        df = read_csv('s3n://pandas-test/tips.csv', nrows=10)
pandas/tests/io/parser//test_network.py:    def test_parse_public_s3a_bucket(self):
pandas/tests/io/parser//test_network.py:        # Read from AWS s3 as "s3a" URL
pandas/tests/io/parser//test_network.py:        df = read_csv('s3a://pandas-test/tips.csv', nrows=10)
pandas/tests/io/parser//test_network.py:    def test_parse_public_s3_bucket_nrows(self):
pandas/tests/io/parser//test_network.py:            df = read_csv('s3://pandas-test/tips.csv' +
pandas/tests/io/parser//test_network.py:    def test_parse_public_s3_bucket_chunked(self):
pandas/tests/io/parser//test_network.py:            df_reader = read_csv('s3://pandas-test/tips.csv' + ext,
pandas/tests/io/parser//test_network.py:    def test_parse_public_s3_bucket_chunked_python(self):
pandas/tests/io/parser//test_network.py:            df_reader = read_csv('s3://pandas-test/tips.csv' + ext,
pandas/tests/io/parser//test_network.py:    def test_parse_public_s3_bucket_python(self):
pandas/tests/io/parser//test_network.py:            df = read_csv('s3://pandas-test/tips.csv' + ext, engine='python',
pandas/tests/io/parser//test_network.py:    def test_infer_s3_compression(self):
pandas/tests/io/parser//test_network.py:            df = read_csv('s3://pandas-test/tips.csv' + ext,
pandas/tests/io/parser//test_network.py:    def test_parse_public_s3_bucket_nrows_python(self):
pandas/tests/io/parser//test_network.py:            df = read_csv('s3://pandas-test/tips.csv' + ext, engine='python',
pandas/tests/io/parser//test_network.py:    def test_s3_fails(self):
pandas/tests/io/parser//test_network.py:            read_csv('s3://nyqpug/asdf.csv')
pandas/tests/io/parser//test_network.py:            read_csv('s3://cant_get_it/')

@jreback
Copy link
Contributor

jreback commented Apr 27, 2017

you can import boto3 its a dependency of s3fs which makes a nicer layer on top.

@gfyoung
Copy link
Member Author

gfyoung commented Apr 27, 2017

I was asking about the proper syntax for calling get_object (i.e. what are the bucket and key names exactly just to make sure).

@jreback
Copy link
Contributor

jreback commented Apr 27, 2017

bucket is pandas-test, path is \tip.csv I think.

@gfyoung
Copy link
Member Author

gfyoung commented Apr 27, 2017

@jreback : boto test has been added, and everything is green.

Previously, we were requiring that
all file-like objects had "read,"
"write," "seek," and "tell" methods,
but that was too strict (e.g. read-only
buffers). This commit relaxes those
requirements to having EITHER "read"
or "write" as attributes.

Closes pandas-devgh-16135.
# see gh-16135

# boto3 is a dependency of s3fs
import boto3
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we have a skip if s3fs is not installed?

Copy link
Member Author

@gfyoung gfyoung Apr 27, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Absolutely. There's a setUp method in this test class which skips on that exact condition.

@jreback jreback merged commit a16fc8d into pandas-dev:master Apr 27, 2017
@jreback
Copy link
Contributor

jreback commented Apr 27, 2017

ok then, thanks!

@gfyoung gfyoung deleted the is-file-like-relax branch April 27, 2017 13:41
pcluo pushed a commit to pcluo/pandas that referenced this pull request May 22, 2017
Previously, we were requiring that
all file-like objects had "read,"
"write," "seek," and "tell" methods,
but that was too strict (e.g. read-only
buffers). This commit relaxes those
requirements to having EITHER "read"
or "write" as attributes.

Closes pandas-devgh-16135.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
IO CSV read_csv, to_csv
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants