BUG: Set index when reading stata file #17328

Merged
merged 1 commit into from
Sep 16, 2017

Conversation

bashtage
Contributor

Ensures the index is set when requested while reading a Stata dta file

closes #16342

@gfyoung added the Bug and IO Stata (read_stata, to_stata) labels on Aug 24, 2017
@@ -1486,6 +1486,8 @@ def read(self, nrows=None, convert_dates=None,
             columns = self._columns
         if order_categoricals is None:
             order_categoricals = self._order_categoricals
+        if index is None:
+            index = self._index
Member

Do we have tests that hit this path?

Contributor Author

Added in PR
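
The two added lines in the hunk above follow a common pattern: a per-call argument left as `None` falls back to the value stored when the reader was constructed. A minimal sketch of that pattern, using a hypothetical `RowReader` class over plain dicts (an illustration only, not pandas' actual Stata reader):

```python
class RowReader:
    """Hypothetical reader used only to illustrate the fallback pattern."""

    def __init__(self, rows, index_col=None):
        self._rows = rows
        # Default chosen at construction time.
        self._index_col = index_col

    def read(self, index_col=None):
        # A per-call value wins; otherwise fall back to the stored default.
        if index_col is None:
            index_col = self._index_col
        if index_col is None:
            # No index requested anywhere: key rows by position.
            return dict(enumerate(self._rows))
        # Key each row by the requested column and drop it from the payload.
        return {
            row[index_col]: {k: v for k, v in row.items() if k != index_col}
            for row in self._rows
        }
```

Without the fallback step, a value passed only to the constructor would be silently ignored by `read()`, which matches the symptom described in the linked issue.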

@codecov

codecov bot commented Aug 24, 2017

Codecov Report

Merging #17328 into master will increase coverage by <.01%.
The diff coverage is 100%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master   #17328      +/-   ##
==========================================
+ Coverage   91.01%   91.02%   +<.01%     
==========================================
  Files         162      162              
  Lines       49567    49571       +4     
==========================================
+ Hits        45113    45120       +7     
+ Misses       4454     4451       -3
Flag Coverage Δ
#multiple 88.8% <100%> (+0.02%) ⬆️
#single 40.24% <0%> (-0.07%) ⬇️
Impacted Files Coverage Δ
pandas/io/stata.py 93.69% <100%> (+0.02%) ⬆️
pandas/io/gbq.py 25% <0%> (-58.34%) ⬇️
pandas/core/frame.py 97.72% <0%> (-0.1%) ⬇️
pandas/plotting/_converter.py 65.05% <0%> (+1.81%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 96f92eb...b8e36ac. Read the comment docs.

@codecov

codecov bot commented Aug 24, 2017

Codecov Report

Merging #17328 into master will decrease coverage by 0.01%.
The diff coverage is 95.83%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master   #17328      +/-   ##
==========================================
- Coverage   91.01%   90.99%   -0.02%     
==========================================
  Files         163      163              
  Lines       49567    49575       +8     
==========================================
- Hits        45113    45112       -1     
- Misses       4454     4463       +9
Flag Coverage Δ
#multiple 88.78% <95.83%> (ø) ⬆️
#single 40.26% <62.5%> (-0.06%) ⬇️
Impacted Files Coverage Δ
pandas/io/stata.py 93.71% <95.83%> (+0.04%) ⬆️
pandas/io/gbq.py 25% <0%> (-58.34%) ⬇️
pandas/core/frame.py 97.72% <0%> (-0.1%) ⬇️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 0d676a3...4e30df8. Read the comment docs.

@TomAugspurger
Contributor

Did we want to change the keyword from index to index_col to match read_csv?

@bashtage
Contributor Author

Probably a good idea.

@bashtage
Contributor Author

Does this need deprecation?

@bashtage
Contributor Author

Last one: is there a pattern to follow for renaming (e.g. a decorator, **kwargs, etc.)?

@TomAugspurger
Contributor

Yeah, there's an @deprecate_kwarg decorator somewhere. Honestly not sure about whether it needs a deprecation, since the old one didn't do anything, but we may as well.

@TomAugspurger
Contributor

TomAugspurger commented Aug 25, 2017

It's in pandas/util/_decorators.py btw.

@bashtage
Contributor Author

I went ahead and did the insta-deprecate.
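
For readers unfamiliar with the pattern being discussed, here is a rough sketch of how a keyword-renaming decorator can work. The names `rename_kwarg` and `read_table` are hypothetical and written for this illustration; this is not the implementation of pandas' `deprecate_kwarg`, whose exact signature may differ between versions.

```python
import functools
import warnings


def rename_kwarg(old_name, new_name):
    """Hypothetical helper: accept ``old_name`` as a deprecated alias of ``new_name``."""

    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            if old_name in kwargs:
                if new_name in kwargs:
                    raise TypeError(
                        f"Pass only one of {old_name!r} and {new_name!r}"
                    )
                warnings.warn(
                    f"{old_name!r} is deprecated, use {new_name!r} instead",
                    FutureWarning,
                    stacklevel=2,
                )
                # Forward the value under the new keyword name.
                kwargs[new_name] = kwargs.pop(old_name)
            return func(*args, **kwargs)

        return wrapper

    return decorator


@rename_kwarg("index", "index_col")
def read_table(path, index_col=None):
    # Stand-in reader for the example; it only echoes its arguments.
    return {"path": path, "index_col": index_col}
```

Calling `read_table("data.dta", index="id")` still works but emits a `FutureWarning`, while `read_table("data.dta", index_col="id")` is silent. This keeps old call sites running for a deprecation cycle even though, as noted above, the old keyword never had any effect.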

Contributor

@jreback left a comment

minor comments. can you deprecate index (and direct to index_col) as well.

@@ -293,6 +293,7 @@ Other API Changes
- :func:`Series.argmin` and :func:`Series.argmax` will now raise a ``TypeError`` when used with ``object`` dtypes, instead of a ``ValueError`` (:issue:`13595`)
- :class:`Period` is now immutable, and will now raise an ``AttributeError`` when a user tries to assign a new value to the ``ordinal`` or ``freq`` attributes (:issue:`17116`).
- :func:`to_datetime` when passed a tz-aware ``origin=`` kwarg will now raise a more informative ``ValueError`` rather than a ``TypeError`` (:issue:`16842`)
- Renamed non-functional `index` to `index_col` in :func:`read_stata` to improve API consistency (:issue:`16342`)
Contributor

can you use double backticks here (around index and index_col)

@@ -1309,3 +1309,11 @@ def test_value_labels_iterator(self, write_index):
             dta_iter = pd.read_stata(path, iterator=True)
             value_labels = dta_iter.value_labels()
         assert value_labels == {'A': {0: 'A', 1: 'B', 2: 'C', 3: 'E'}}
+
+    def test_set_index(self):
+        df = tm.makeDataFrame()
Contributor

can you add the issue number
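
Only the first lines of the new test are visible in the hunk above. As a rough sketch of what a round-trip check of this behaviour could look like with a current pandas release (the helper name `roundtrip_with_index` and the sample data are assumptions for illustration, not the test added in this PR):

```python
import os
import tempfile

import pandas as pd


def roundtrip_with_index(df, index_col):
    """Write ``df`` to a temporary .dta file and read it back with ``index_col`` set."""
    with tempfile.TemporaryDirectory() as tmpdir:
        path = os.path.join(tmpdir, "roundtrip.dta")
        # By default to_stata writes the index as an ordinary column.
        df.to_stata(path)
        return pd.read_stata(path, index_col=index_col)


df = pd.DataFrame(
    {"A": [1.0, 2.0, 3.0], "B": [0.5, 1.5, 2.5]},
    index=pd.Index(["r1", "r2", "r3"], name="index"),
)
reread = roundtrip_with_index(df, "index")
```

If the keyword is honoured, `reread` should be indexed by the `index` column rather than carrying it as a regular data column.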

Ensures index is set when requested during reading of a Stata dta file
Deprecates and renames index to index_col for API consistence

closes pandas-dev#16342
@TomAugspurger
Contributor

> minor comments. can you deprecate index (and direct to index_col) as well.

Do we need a deprecation, or can we just "break" API here. index=... didn't work at all earlier, so I'm OK with just changing without a deprecation cycle.

@TomAugspurger
Contributor

Ahh, I see 4e30df8 now, so never mind, that's fine.

@bashtage
Contributor Author

@jreback @TomAugspurger I think this is GTG

@TomAugspurger TomAugspurger merged commit f5cfdbb into pandas-dev:master Sep 16, 2017
@TomAugspurger
Contributor

Thanks!

alanbato pushed a commit to alanbato/pandas that referenced this pull request Nov 10, 2017
Ensures index is set when requested during reading of a Stata dta file
Deprecates and renames index to index_col for API consistence

closes pandas-dev#16342
No-Stream pushed a commit to No-Stream/pandas that referenced this pull request Nov 28, 2017
Ensures index is set when requested during reading of a Stata dta file
Deprecates and renames index to index_col for API consistence

closes pandas-dev#16342
@bashtage bashtage deleted the stats-index-fix branch April 22, 2018 21:12
Development

Successfully merging this pull request may close these issues.

read_stata ignores index parameter
4 participants