COMPAT: reading json with lines=True from s3, xref #17200 #17201
Merged
22 commits
8255cd0 Fix for #17200
caa3c80 Revert "Fix for #17200"
4709251 Wrap BytesIO based streams when using
a042e3c Added and deps for mocking s3 testing calls.
0c164e7 Removed boto3 per code review
ac99133 Skip if imports don't exist. Create fixture for test setup.
125f049 compatibility for Python 2.7
533d404 Add object handling in the read_json function instead of handling in…
c7f13b8 Merge branch 'master' into patch-1
17973a1 add s3fs to requirements_all. Will be needed to run read_json tests a…
2ae5a9d PEP-8 fixes. Ignore tests without s3fs available. Use ensure_clean
38f043b PEP-8 fixes. Ignore tests without s3fs available. Use ensure_clean
1d03d7a remove bad dev deps. fix tempfile context in tests
78ee720 Fixing link errors
9d7e75b Code review formatting and compliance. Also use for testing
b21401b Merge remote-tracking branch 'upstream/master' into alph486-patch-1 (TomAugspurger)
6979fb8 REF: Move s3 mocking to conftest (TomAugspurger)
bb600cb Update requirements-2.7.run (TomAugspurger)
5b036ec Update requirements-2.7_SLOW.run (TomAugspurger)
c5c4d07 End in newlines (TomAugspurger)
f1122ca Merge branch 'master' into PR_TOOL_MERGE_PR_17201 (jreback)
b913972 linting & fix (jreback)
@@ -0,0 +1,74 @@

```python
import os

import moto
import pytest
from pandas.io.parsers import read_table

HERE = os.path.dirname(__file__)


@pytest.fixture(scope='module')
def tips_file():
    """Path to the tips dataset"""
    return os.path.join(HERE, 'parser', 'data', 'tips.csv')


@pytest.fixture(scope='module')
def jsonl_file():
    """Path a JSONL dataset"""
    return os.path.join(HERE, 'parser', 'data', 'items.jsonl')


@pytest.fixture(scope='module')
def salaries_table():
    """DataFrame with the salaries dataset"""
    path = os.path.join(HERE, 'parser', 'data', 'salaries.csv')
    return read_table(path)


@pytest.fixture(scope='module')
def s3_resource(tips_file, jsonl_file):
    """Fixture for mocking S3 interaction.

    The primary bucket name is "pandas-test". The following datasets
    are loaded.

    - tips.csv
    - tips.csv.gz
    - tips.csv.bz2
    - items.jsonl

    A private bucket "cant_get_it" is also created. The boto3 s3 resource
    is yielded by the fixture.
    """
    pytest.importorskip('s3fs')
    moto.mock_s3().start()

    test_s3_files = [
        ('tips.csv', tips_file),
        ('tips.csv.gz', tips_file + '.gz'),
        ('tips.csv.bz2', tips_file + '.bz2'),
        ('items.jsonl', jsonl_file),
    ]

    def add_tips_files(bucket_name):
        for s3_key, file_name in test_s3_files:
            with open(file_name, 'rb') as f:
                conn.Bucket(bucket_name).put_object(
                    Key=s3_key,
                    Body=f)

    boto3 = pytest.importorskip('boto3')
    # see gh-16135
    bucket = 'pandas-test'

    conn = boto3.resource("s3", region_name="us-east-1")
    conn.create_bucket(Bucket=bucket)
    add_tips_files(bucket)

    conn.create_bucket(Bucket='cant_get_it', ACL='private')
    add_tips_files('cant_get_it')

    yield conn

    moto.mock_s3().stop()
```
@@ -0,0 +1,2 @@

```json
{"a": 1, "b": 2}
{"b":2, "a" :1}
```
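The two records in the JSONL test file differ only in key order and whitespace, so a line-by-line reader should decode them to equal objects. The following is a minimal sketch using only the standard library, not pandas or S3. It assumes the payload arrives as a bytes stream (as an object fetched from S3 would) and wraps it in a text layer before decoding each line. The variable names are illustrative, and this is not the code path the PR adds to `read_json`.

```python
import io
import json

# Bytes payload mirroring the two-line JSONL test file.
raw = b'{"a": 1, "b": 2}\n{"b":2, "a" :1}\n'

# Wrap the binary stream so that iteration yields decoded text lines.
text_stream = io.TextIOWrapper(io.BytesIO(raw), encoding="utf-8")

# Decode one JSON document per non-empty line.
records = [json.loads(line) for line in text_stream if line.strip()]

print(records)
```

Because each line is parsed independently, the differing key order and spacing in the second record do not affect the decoded result.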
@@ -4,62 +4,14 @@

```python
Tests parsers ability to read and parse non-local files
and hence require a network connection to be read.
"""
import os

import pytest
import moto

import pandas.util.testing as tm
from pandas import DataFrame
from pandas.io.parsers import read_csv, read_table
from pandas.compat import BytesIO


@pytest.fixture(scope='module')
def tips_file():
    return os.path.join(tm.get_data_path(), 'tips.csv')


@pytest.fixture(scope='module')
def salaries_table():
    path = os.path.join(tm.get_data_path(), 'salaries.csv')
    return read_table(path)


@pytest.fixture(scope='module')
def s3_resource(tips_file):
    pytest.importorskip('s3fs')
    moto.mock_s3().start()

    test_s3_files = [
        ('tips.csv', tips_file),
        ('tips.csv.gz', tips_file + '.gz'),
        ('tips.csv.bz2', tips_file + '.bz2'),
    ]

    def add_tips_files(bucket_name):
        for s3_key, file_name in test_s3_files:
            with open(file_name, 'rb') as f:
                conn.Bucket(bucket_name).put_object(
                    Key=s3_key,
                    Body=f)

    boto3 = pytest.importorskip('boto3')
    # see gh-16135
    bucket = 'pandas-test'

    conn = boto3.resource("s3", region_name="us-east-1")
    conn.create_bucket(Bucket=bucket)
    add_tips_files(bucket)

    conn.create_bucket(Bucket='cant_get_it', ACL='private')
    add_tips_files('cant_get_it')

    yield conn

    moto.mock_s3().stop()


@pytest.mark.network
@pytest.mark.parametrize(
    "compression,extension",
```

Review comment (on the `tips_file` fixture): cool
Review comment: what is the purpose of this file?

Review comment: I see, ok you have to have this named `.json` otherwise it won't be picked up by `setup.py` (IOW the install test will fail).
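The reviewer's point about the file extension can be illustrated with a glob check. This is only a sketch: the pattern `*.json` is an assumed stand-in for whatever data-file patterns the project's `setup.py` lists, which this page does not show.

```python
from fnmatch import fnmatch

# Hypothetical packaging pattern; the real setup.py is not shown in this PR view.
pattern = "*.json"

# Compare the extension used in the diff with the one the reviewer asks for.
for name in ("items.jsonl", "items.json"):
    status = "included" if fnmatch(name, pattern) else "skipped"
    print(name, status)
```

Under that assumption, a data file ending in `.jsonl` would be left out of the installed package, which is consistent with the reviewer's remark that the install test would fail.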