Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Cross off a few tslibs-TODOs #18443

Merged
merged 7 commits into from
Nov 25, 2017
Merged

Conversation

jbrockmendel
Copy link
Member

This is a hodge-podge addressing a few of the items in #17652. Individual commits should be item-specific(ish).

  • remove unused lib.arrmap
  • de-privatize _checknull_with_nat
  • fix a couple of C warnings caused by timedelta_struct
  • remove Period._comparables

pandas_datetime_metadata meta;

meta.base = fr;
meta.num - 1;
meta.num = 1;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for fixing this typo. Do we even need the assignment at all though? Worth considering removal?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm advocating moving the timedelta_struct stuff out of src/ entirely. The src/ directory is a PITA and we can put it in cython directly without taking a performance hit.

@WillAyd
Copy link
Member

WillAyd commented Nov 23, 2017

While you are touching this file I just noticed a typo in the below line:

https://github.com/jbrockmendel/pandas/blob/6ef9227fda1ee595eecf298e65ba99d6f9180db2/pandas/_libs/src/datetime/np_datetime.c#L1087

The word datetime should read timedelta (copy / paste issue). Figured it’s worth fixing while you are touching the file but if you wanted me to fix in a separate issue and PR let me know

@jbrockmendel
Copy link
Member Author

Good call. There’s also some indentation in timedeltastruct I’ll get to, could use a double-checking to make sure I don’t screw it up.

@codecov
Copy link

codecov bot commented Nov 23, 2017

Codecov Report

Merging #18443 into master will decrease coverage by 0.01%.
The diff coverage is n/a.

Impacted file tree graph

@@            Coverage Diff             @@
##           master   #18443      +/-   ##
==========================================
- Coverage   91.35%   91.34%   -0.02%     
==========================================
  Files         163      163              
  Lines       49691    49691              
==========================================
- Hits        45397    45388       -9     
- Misses       4294     4303       +9
Flag Coverage Δ
#multiple 89.14% <ø> (ø) ⬆️
#single 39.67% <ø> (-0.07%) ⬇️
Impacted Files Coverage Δ
pandas/io/gbq.py 25% <0%> (-58.34%) ⬇️
pandas/core/frame.py 97.8% <0%> (-0.1%) ⬇️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update fedc503...c08e4ce. Read the comment docs.

@codecov
Copy link

codecov bot commented Nov 23, 2017

Codecov Report

Merging #18443 into master will decrease coverage by 0.04%.
The diff coverage is n/a.

Impacted file tree graph

@@            Coverage Diff             @@
##           master   #18443      +/-   ##
==========================================
- Coverage   91.36%   91.31%   -0.05%     
==========================================
  Files         163      163              
  Lines       49700    49700              
==========================================
- Hits        45407    45386      -21     
- Misses       4293     4314      +21
Flag Coverage Δ
#multiple 89.12% <ø> (-0.03%) ⬇️
#single 39.66% <ø> (-0.07%) ⬇️
Impacted Files Coverage Δ
pandas/io/gbq.py 25% <0%> (-58.34%) ⬇️
pandas/plotting/_converter.py 63.44% <0%> (-1.82%) ⬇️
pandas/core/frame.py 97.8% <0%> (-0.1%) ⬇️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update aaee541...5dd4469. Read the comment docs.

@WillAyd
Copy link
Member

WillAyd commented Nov 23, 2017

ping me when complete and I’d be glad to double check

@jbrockmendel
Copy link
Member Author

@WillAyd ping. Should there be an extra indent after case PANDAS_FR_ns:? I'm not that familiar with C syntax.

If we do move the timedelta_struct stuff into cython, let's not bother with the conditioning on meta.base since only one is ever used.

@jreback jreback added Clean Datetime Datetime data dtype labels Nov 23, 2017
@@ -1016,11 +1012,11 @@ int convert_timedelta_to_timedeltastruct(pandas_timedelta_metadata *meta,

// put frac in seconds
if (td < 0 && td % (1000LL * 1000LL * 1000LL) != 0)
frac = td / (1000LL * 1000LL * 1000LL) - 1;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we already lint this file (cpplint). is this not picked up? rather have the proper config settings than manually changing things.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No argument here, but cpplint is well outside my domain of competence. I only noticed this indentation when trying to move this func into cython.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@WillAyd any experience with cpplint? want to give this a shot, e.g. making sure we are linting the *.c (more/better)?

Copy link
Member

@WillAyd WillAyd Nov 23, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can't claim experience with cpplint but looking at the bash script it appears to already hit all of the files in the datetime folder. I ran both versions of this file through cpplint locally and neither caused any issue with the highest level of verbosity. Surprisingly I couldn't get an error when changing the line to a 1 or 3 space indent either, so I think there might be some bug with cpplint itself.

In looking at Google's style guide they suggest a 2 space indent. To match that and keep consistent with the rest of the file I would suggest removing the two spaces added here and remove the two from the indented block after the else statement instead

https://google.github.io/styleguide/cppguide.html#Conditionals

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

see https://github.com/apache/arrow?files=1

for some hints in how to lint / enforce this as well

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you saying that within arrow cpplint is calling out those whitespace issues? I tried replicating the filters I saw in that project via:

cpplint --verbose=2 --filter=-whitespace/comments,-readability/todo,-build/header_guard,-build/c++11,-runtime/references,-build/include_order pandas/_libs/src/datetime/np_datetime.c

And then tried an even more verbose version without any filter

cpplint --verbose=1 pandas/_libs/src/datetime/np_datetime.c

And neither made any mention of those lines, whether I set them at 2, 4 or even 3 space indents.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no just a guess that they have more options set for linting

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking at the cpplint source, I don't see anything that checks for proper 2-space indentation of conditional statements, nor was there any documentation hinting that it could. The whitespace/indentat check offered only seems to make sure that you don't start with 1 or 3 spaces, but doesn't do much beyond that. Unfortunately I don't see any programmatic options to continually validate this with that tool as is

@@ -409,7 +409,7 @@ cpdef array_with_unit_to_datetime(ndarray values, unit, errors='coerce'):
for i in range(n):
val = values[i]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

these array_* should be moved to conversion I think (obviously new PR), then tslib.pyx is moot?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's hold off on that. I'd like to do auditing before moving, and the array_* funcs may need some work/thought (#17697)

@jbrockmendel
Copy link
Member Author

Travis error in test_geopandas

>   from fiona.ogrext import Iterator, ItemsIterator, KeysIterator
E   ImportError: libpoppler.so.66: cannot open shared object file: No such file or directory
../../../miniconda3/envs/pandas/lib/python3.6/site-packages/fiona/collection.py:9: ImportError

@jreback
Copy link
Contributor

jreback commented Nov 23, 2017

no idea, maybe something is in process of being updated

@WillAyd
Copy link
Member

WillAyd commented Nov 23, 2017

Referring to the Google style guide again we should have a 2 space indent after the case PANDAS_FR_ns:.

As I mentioned on a previous comment, I have a feeling that there's either a bug or missing configuration for cpplint to detect those whitespace issues. I'm not overly familiar with the tool so need to research further and advise back.

@jbrockmendel
Copy link
Member Author

@WillAyd I'm way out of my element w/r/t C linting, will have to trust you that the 2-spaces is correct. Are my eye playing tricks on me when the indentation looks inconsistent? i.e. should I just revert all the indentation edits?

Isn't the switch/case usage here unnecessary anyway? Or are there plans to implement and use other units?

@WillAyd
Copy link
Member

WillAyd commented Nov 24, 2017

You are correct that the switch/case usage is at the moment unnecessary, because the method is only being called with nanosecond frequency. The reason I kept it in there was to mirror the implementation that was done for datetime.

As far as the indentation is concerned, that's up to @jreback. I would suggest going forward with your 4 space indentation because it's consistent with the rest of the file. From what I can tell cpplint isn't going to care either way.

@jbrockmendel
Copy link
Member Author

The reason I kept it in there was to mirror the implementation that was done for datetime.

That makes sense. In the interest of keeping this already-too-wide-ranging PR from expanding scope, I'd like to leave it as is for now, i.e. use 4-space consistently and keep the unnecessary switch/case. Later I think we should implement the pandas_timedeltastruct and td64_to_tdstruct in tslibs.np_timedeltas directly in cython. If there isn't a consensus on the latter, achieving it on the former can be good enough for now.

@jreback jreback added this to the 0.22.0 milestone Nov 25, 2017
@jreback jreback merged commit be66ef8 into pandas-dev:master Nov 25, 2017
@jreback
Copy link
Contributor

jreback commented Nov 25, 2017

thanks @jbrockmendel as always, keeping scope narrow is better per PR

@jreback
Copy link
Contributor

jreback commented Nov 25, 2017

I am not ticking off boxes in #17652 but pls do (and link to the issues)

@jbrockmendel jbrockmendel mentioned this pull request Nov 25, 2017
59 tasks
@jbrockmendel jbrockmendel deleted the tslibs-todos1 branch December 8, 2017 19:40
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Clean Datetime Datetime data dtype
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants