DOC iteritems docstring update and examples #22658

Merged
merged 12 commits into from
Sep 27, 2018
24 changes: 24 additions & 0 deletions pandas/core/frame.py
@@ -728,11 +728,35 @@ def iteritems(self):
"""
Iterator over (column name, Series) pairs.
Member

Technically this returns an Iterator (or more specifically a Generator). So I would leave this as Iterator or change this to Generator.
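
To make the Iterator/Generator distinction concrete, here is a minimal pure-Python sketch. `iter_columns` is a hypothetical stand-in for a method that yields (label, content) pairs; it is not the pandas implementation.

```python
import inspect
from collections.abc import Generator, Iterator


def iter_columns(table):
    """Yield (label, values) pairs from a dict of columns (hypothetical helper)."""
    for label, values in table.items():
        yield label, values


pairs = iter_columns({"col1": [1, 2], "col2": [0.1, 0.2]})

# Calling a function that contains `yield` returns a generator object...
is_generator = inspect.isgenerator(pairs)
# ...and every generator is also an iterator, so both names are accurate.
is_iterator = isinstance(pairs, Iterator)

# Nothing is computed until a value is requested.
first = next(pairs)
```

Since Generator is the more specific of the two terms, either wording is correct in a docstring summary.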

Contributor Author

I made the change to 'Iterate over DataFrame ... as (..., Series) pairs' to stay consistent with the style of the iterrows and itertuples functions, but I can revert to 'Iterator over (column name, Series) pairs'.


Iterates over columns as key, value dict-like pairs with columns name as keys and Series as values.
Member

This description is a bit complex, and can give the impression that it's returning a dictionary.

Something like "Iterates over the DataFrame columns, returning a tuple with the label and the content as a Series" would be clearer IMHO.
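
A short check of what each iteration actually produces may help here. This is a sketch with made-up data; it calls `items` rather than `iteritems`, because `iteritems` was removed in pandas 2.0 and `items` is its replacement.

```python
import pandas as pd

# Hypothetical sample data, mirroring the col1/col2 frame in the docstring.
df = pd.DataFrame({"col1": [1, 2], "col2": [0.1, 0.2]}, index=["a", "b"])

# Take the first item produced by the iteration.
first = next(iter(df.items()))

# Each item is a plain 2-tuple, not a dict or a dict entry.
label, content = first
```

Here `label` is the column name and `content` is that column as a Series, which is why the "tuple with the label and the content" wording is less likely to mislead.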


Returns
-------
it : generator
A generator that iterates over the columns of the frame.
Member

For a generator, instead of documenting that it returns a generator, we use a Yields section. In this case, something like:

Yields
------
label : object
    Description of label
content : Series
    Description of content
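
As a rough illustration of how that section sits in a complete docstring, here is a sketch on a hypothetical pure-Python generator. The function name and descriptions are placeholders, not the wording that ended up in pandas.

```python
def iter_columns(table):
    """
    Iterate over (label, content) pairs of a column mapping.

    Yields
    ------
    label : object
        The name of the column being iterated over.
    content : list
        The values belonging to that column.
    """
    # Hypothetical helper: `table` is assumed to be a dict of column lists.
    for label, content in table.items():
        yield label, content


pairs = list(iter_columns({"col1": [1, 2]}))
```

Each yielded element gets its own entry, the same way parameters do, and the underline matches the length of the section title.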


See also
--------
iterrows : Iterate over DataFrame rows as (index, Series) pairs.
itertuples : Iterate over DataFrame rows as namedtuples of the values.
Member

The "A" in "See Also" should be capitalized.

Use DataFrame.iterrows instead of iterrows, and likewise DataFrame.itertuples instead of itertuples.


Examples
--------
>>> df = pd.DataFrame({'col1': [1, 2], 'col2': [0.1, 0.2]},
... index=['a', 'b'])
Member

Can you use something that looks more like a real world example? I think it makes things easier to read.

See for example the example DataFrame in https://pandas-docs.github.io/pandas-docs-travis/generated/pandas.DataFrame.reset_index.html

>>> df
   col1  col2
a     1   0.1
b     2   0.2
>>> for col in df.iteritems():
... print(col)
Member

I think something like the following would make things easier to understand:

for label, content in df.iteritems():
    print('label:', label)
    print('content:', content)
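
A self-contained version of that loop might look like the sketch below. The inventory data is invented for illustration, and `items` is used in place of `iteritems`, which was removed in pandas 2.0.

```python
import pandas as pd

# Hypothetical data, chosen only to look less abstract than col1/col2.
df = pd.DataFrame(
    {"item": ["apple", "pear"], "price": [0.5, 0.75]},
    index=["sku1", "sku2"],
)

seen = []
# Unpacking each pair into two names keeps the loop body readable.
for label, content in df.items():
    print("label:", label)
    print("content:", content, sep="\n")
    seen.append((label, len(content)))
```

Unpacking into `label` and `content` shows the reader what each half of the pair is, which printing the whole tuple does not.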

Contributor Author

Thanks for the edits! I rephrased the long description, changed the Yields section to cover label and content, prepended DataFrame. to the See Also examples, and reworked the Examples section to hopefully be more engaging.

Member

Did you push the changes?

Contributor Author

Sheepishly, I got my git commands mixed up, but it's pushed now.

...
('col1', a    1
b    2
Name: col1, dtype: int64)
('col2', a    0.1
b    0.2
Name: col2, dtype: float64)
"""
if self.columns.is_unique and hasattr(self, '_item_cache'):
for k in self.columns: