RLS: 0.24.0 #24060

Closed
jreback opened this issue Dec 3, 2018 · 61 comments

@jreback
Contributor

jreback commented Dec 3, 2018

Tracking issue

Open PRs
Open Issues

Let's whittle down. I just looked at the latest whatsnew and it's huge. Let's get this out sooner rather than later. I know there are some blocking issues: DatetimeArray and what to do with CalendarDay.

@jreback jreback added the Release label Dec 3, 2018
@jreback jreback added this to the 0.24.0 milestone Dec 3, 2018
@jbrockmendel
Member

The idea is for 0.24.0 to be the last py2-supporting release, right?

#24021 fixes a corner case behavior in Timestamp comparisons, but also introduces an inconsistency between py2/py3 behavior. #21394 made the analogous change to Timedelta comparisons.

The least consistent thing we could do is to keep the status quo, changing Timedelta behavior but not Timestamp behavior. The question is whether to a) merge #24021 and have mismatched py2/py3 behavior in 0.24.0, or b) revert #21394 until after 0.24.0 and wait to change both in the first py3-only release.

I lean slightly towards option b.

@TomAugspurger
Contributor

TomAugspurger commented Dec 4, 2018

The idea is for 0.24.0 to be the last py2-supporting release, right?

Basically, though I assume we'll do some backporting and do a 0.24.1 or 2 that supports py2 as well.

I'm not super-familiar with this section, but your option b sounds reasonable.

Although... I could live with inconsistent py2 / py3 behavior. That wouldn't be the only one.

Question: with #24021, would we be consistent with the builtin timedelta of each version?

@h-vetinari
Contributor

h-vetinari commented Dec 6, 2018

This release is indeed big, but v.0.24 is also quite special because it will effectively define the 1.0 API (in the sense of the no-deprecation policy between 0.24 and 1.0), and also because of the magnitude of the whole EA effort obviously.

But - despite a lot of hard work - the current state still feels a bit half-baked:

Realistically, a release before the end of the year would mean a cut-off in ~10 days max, which seems unrealistic from my POV, even if cutting corners.

Given that the following statement from @TomAugspurger above

Basically, though I assume we'll do some backporting and do a 0.24.1 or 2 that supports py2 as well.

effectively means more PY2 support at the beginning of 2019 anyway, I think one should consider not trying to force the release before year end.

If there is a release before the end of the year (resp. before the most important issues are sorted out), then I especially agree with Tom that there will need to be a 0.24.1 for PY2, as 0.24.0 will have too many issues (that hopefully surface in the RC, but well...) to be the last release, IMO.

Alternatively (which is simultaneously more in line with https://python3statement.org/, but also more controversial), one could consider having a last PY2-supporting 0.23.5 this year, and then doing 0.24.0 as PY3-only next year...?

@jreback
Contributor Author

jreback commented Dec 6, 2018

@h-vetinari pandas is a volunteer project almost completely. Thus project priorities are set by community consensus and worked towards. We have regular releases in time; 0.24.0 is actually overdue by a few months. Trying to add additional things which themselves need discussion is not going to happen.

Whatever exists in the 0.24.x series is the last release for Python 2; this has long been announced. This is just how it is.

@TomAugspurger
Contributor

I don't follow the point of #24060 (comment). Could you try restating / summarizing it?

I think Py2 vs. Py3 is irrelevant for 0.24.0.

The EA-related issues you linked are, I think, all going in 0.24 (I didn't check them all). That's basically the blocker at this point, but I haven't reviewed the backlog recently.

I haven't had time to look into unique.

@h-vetinari
Contributor

@jreback

@h-vetinari pandas is a volunteer project almost completely. Thus project priorities are set by community consensus and worked towards. We have regular releases in time; 0.24.0 is actually overdue by a few months. Trying to add additional things which themselves need discussion is not going to happen.

I know all that. I'm just saying that being overdue is a bad reason to rush a release, if some of the core changes have not been fully developed yet (talking mostly about EA + regressions).

Whatever exists in the 0.24.x series is the last release for Python 2; this has long been announced. This is just how it is.

I don't know what's been discussed in other channels, but what I saw on GH was that the main decision was 0.24->0.25->1.0. Re: PY2, it has equally been said (and there's a warning to that effect in the whatsnew) that there won't be PY2 releases after Dec. 31st 2018. Supporting the 0.24 series for PY2 means another ~6-8 months of supporting PY2 (as backporting to the 0.24 branch would otherwise be very cumbersome). Of course that's a valid choice, but I just wanted to suggest the possibility of leaving PY2 at the very stable 0.23.5 instead.

@h-vetinari
Contributor

@TomAugspurger

I don't follow the point of #24060 (comment). Could you try restating / summarizing it?
I think Py2 vs. Py3 is irrelevant for 0.24.0.

Sorry this was not clear. The main (interrelated) two points are:

  • one should not rush 0.24 due to the stated deadline (for supporting PY2) of Dec 31st.
    • listed some issues for that - some of these (and many more related ones I didn't tag) have been pushed to "Contributions Welcome" instead of v.0.24
    • brought up that 0.24 is special due to the API-lock policy until 1.0, and that I'd think this would be a pity for .unique (but I get that I'm a lone voice in the wilderness about that one)
  • point out the discrepancy between the warning box ("Starting January 1, 2019, pandas feature releases will support Python 3 only") and supporting the 0.24-series for PY2, along with the suggestion to have a 0.23.5 for PY2

The EA-related issues you linked are, I think, all going in 0.24 (I didn't check them all). That's basically the blocker at this point, but I haven't reviewed the backlog recently.

The backlog has been thinned substantially recently, not least by issues being pushed to "Contributions Welcome" (also for EA issues). This goes towards my point of cutting corners to get this out soon.

That being said, I'm pretty new at this game, and I don't doubt that the core devs got the big picture in view and under control - but pointing out an observation can't hurt, I hope.

I haven't had time to look into unique.

Fair enough, dev time is a very limited resource. I guess I'll have to try to convince everyone of this in post-1.0 land. ;-)

@jreback
Contributor Author

jreback commented Dec 9, 2018

@h-vetinari

I know all that. I'm just saying that being overdue is a bad reason to rush a release, if some of the core changes have not been fully developed yet (talking mostly about EA + regressions).

If we keep adding and adding, then the release just keeps getting delayed forever. I have drawn a line in the sand. This is how you get product out the door. Once the DTA lands fully, we will be in a position to release. So this is not that far away. Sure, we could do extra work and just say 0.23.5 is the last PY2 release (and of course release it). But it's going to be easier to backport to a stable branch, which means the 0.24.x series.

There are always things to add in a release, but this one is already the biggest we have ever had. There are inevitable bugs and so better to do sooner rather than later. Thanks for your contributions. It is not possible to get every major API change here. You are exactly right in that dev time is a truly limited resource.

@h-vetinari
Contributor

@jreback
Thanks for the response. I understand you want to get this out the door ASAP, which is fair enough, of course.

But it's going to be easier to backport to a stable branch, which means the 0.24.x series.

Seems I misunderstood that pandas would stop supporting PY2 as of Jan 1st 2019... Maybe we should adapt that warning box in the whatsnew then ("v.0.24.x will be the last series that supports PY2; starting with v.0.25.0, pandas will be python3-only"...?)

@mroeschke
Member

RE: CalendarDay progress #22867 (comment)

TODO
Add all the deprecation warnings (I anticipate there being quite a few of these). This needs to be included in the following:

  • Day tick arithmetic with other Ticks, Timedeltas, and DatetimeTZ
  • DatetimeIndex.shift (tz-aware only)

Migration
The plan is for _Day (formerly CalendarDay) to just replace Day once the prior Day behavior is replaced.

Concern
During the last dev chat, there was interest in 'D' being a compatible freq argument/offset for both Timedelta and Datetime. I don't see a clear way to make this possible without adding a lot of monkeypatching.

Example: timedelta_range(..., freq='D'); to_offset('D') will return _Day in the future and this offset will need to increment a Timedelta, but _Day + Timedelta is an invalid operation.
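To make the concern concrete, here is a minimal standard-library sketch of the two notions of "one day", assuming the distinction at stake is a calendar day (same wall-clock time on the next date) versus a fixed 24-hour tick. `ToyZone` is a hypothetical time zone with a single DST switch, invented for illustration; it is not pandas code and does not reproduce the `_Day` implementation.

```python
from datetime import datetime, timedelta, timezone, tzinfo


class ToyZone(tzinfo):
    """Hypothetical zone: UTC-5 until 2018-03-11 03:00 wall time, UTC-4 after."""

    _switch = datetime(2018, 3, 11, 3, 0)

    def dst(self, dt):
        # Decide on the naive wall-clock value which side of the switch we are on.
        naive = dt.replace(tzinfo=None)
        return timedelta(hours=1) if naive >= self._switch else timedelta(0)

    def utcoffset(self, dt):
        return timedelta(hours=-5) + self.dst(dt)

    def tzname(self, dt):
        return "TOY"


def add_calendar_day(ts):
    # Aware datetime + timedelta is wall-clock arithmetic: same time, next date.
    return ts + timedelta(days=1)


def add_24_hours(ts):
    # Go through UTC so that exactly 24 elapsed hours are added.
    zone = ts.tzinfo
    return (ts.astimezone(timezone.utc) + timedelta(hours=24)).astimezone(zone)


start = datetime(2018, 3, 10, 12, 0, tzinfo=ToyZone())
calendar_result = add_calendar_day(start)  # wall clock stays at noon
tick_result = add_24_hours(start)          # wall clock moves by the DST shift
```

Across the DST switch the two results differ by one hour on the wall clock, which is why an offset with calendar-day semantics cannot simply be treated as a fixed-length Timedelta.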

@jbrockmendel
Member

Anyone have opinions on the timestamp/timedelta py2/py3 consistency issue?

@jbrockmendel
Member

A bunch of deprecations are listed as To Be Removed in 1.0; should 0.25.0 take the place of 1.0 for some of those?

@TomAugspurger
Contributor

Anyone have opinions on the timestamp/timedelta py2/py3 consistency issue?

Can you summarize that issue? Ideally we would just follow python (whatever version is running) here I think. But I don't think I fully understand the issue.

A bunch of deprecations are listed as To Be Removed in 1.0; should 0.25.0 take the place of 1.0 for some of those?

I think all of them... Need to discuss that though; some may need to be pushed.

@jbrockmendel
Member

#24060 (comment)

Timedelta was recently changed to return NotImplemented in cases where it previously raised. As a result, its py2 behavior matches python but differs from pandas py3 behavior.

Timestamp has an open PR to make the analogous change.

Once py2 is dropped, the change is definitely right. Until then, there are conflicting consistency arguments.

We should either get the Timestamp PR in for 0.24.0 or revert the Timedelta PR until after 0.24.0

(Typing with thumbs; LMK if unclear)
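For readers unfamiliar with the mechanism, the following is a minimal sketch of what returning `NotImplemented` from a comparison means under Python 3. `Duration` and `Wrapper` are hypothetical classes made up for illustration; they are not the pandas `Timedelta` or `Timestamp` implementations. Under Python 2, an ordering comparison that every operand declines falls back to an arbitrary but consistent ordering rather than raising `TypeError`, which is presumably the source of the py2/py3 mismatch described above.

```python
class Duration:
    """Hypothetical Timedelta-like scalar (illustration only)."""

    def __init__(self, seconds):
        self.seconds = seconds

    def __eq__(self, other):
        if isinstance(other, Duration):
            return self.seconds == other.seconds
        # Defer instead of raising: Python tries the reflected method on
        # `other` and, for ==, finally falls back to an identity check.
        return NotImplemented

    def __lt__(self, other):
        if isinstance(other, Duration):
            return self.seconds < other.seconds
        return NotImplemented


class Wrapper:
    """Hypothetical third-party type that knows how to compare with Duration."""

    def __init__(self, duration):
        self.duration = duration

    def __gt__(self, other):
        # Reflected counterpart of Duration.__lt__.
        if isinstance(other, Duration):
            return self.duration.seconds > other.seconds
        return NotImplemented


def ordering_is_supported(left, right):
    """Return False if `left < right` raises TypeError under Python 3."""
    try:
        left < right
    except TypeError:
        return False
    return True
```

Because `Duration.__lt__` defers rather than raises, `Duration(1) < Wrapper(Duration(2))` is answered by `Wrapper.__gt__`, while `Duration(1) < "a"` raises `TypeError` only after both sides have declined.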

@jreback
Contributor Author

jreback commented Dec 11, 2018

i think let’s revert the Timedelta one, then push them both in for 0.25/1.0 (py3 only)

@jorisvandenbossche
Member

Moving this comment #24227 (comment) here:

(for which IMO we will also need at least a couple of weeks in master)

[Tom] Just to verify, we should do a release candidate with DatetimeArray ASAP, right? And then 1-2 weeks on master while the RC is out?

Personally, no, I wouldn't do that (if you mean with ASAP like a few days after). I would also keep it at least 2 weeks in master before doing an RC. Now in practice it will maybe be like that anyway ..

@TomAugspurger
Contributor

Personally, I'll have to get back to dask / other things after this push on datetimearray. I was hoping we could have an RC out while I do that.

Are there other major issues I could pick up while we're going through this round of reviews? My plate currently has

@jreback
Contributor Author

jreback commented Dec 13, 2018

yeah i think we should merge things (that tom mentioned) then sit in master for a week or 2 at the least

@jorisvandenbossche
Member

To be clear, I also want to see this released as soon as possible, but we also need to be realistic (eg I don't think we will have a final release before the end of the year as you mentioned on fastparquet? even if all the blocking PRs are merged in a week that seems too quick IMO)

If we have a longer RC period, and still do some further clean-up after doing the RC (and possibly do a second RC), I am fine with doing a quick RC after merging.
But if we see an RC as "ready to be released from our part, if no major issue is reported by people trying out the RC, we can do a final release from that", then we should have those major changes in master for a bit IMO.

@TomAugspurger
Contributor

I think I have the permissions to do a fastparquet release. There's a backwards / forwards compatible change that could be released today.

But if we see an RC as "ready to be released from our part, if no major issue is reported by people trying out the RC, we can do a final release from that", then we should have those major changes in master for a bit IMO.

If we have a longer RC period, and still do some further clean-up after doing the RC (and possibly do a second RC), I am fine with doing a quick RC after merging.

That's basically where I'm at. Assuming that outstanding big PRs are merged this week (just assuming, not an actual deadline), then we will turn up issues with them in the next couple weeks. My hope is that by doing an RC (or two) we'll be more likely to turn up more issues, so that we can do a higher-quality final release sooner.

The main cost of doing the RC sooner is that we don't get any more scope creep, which may be a good thing :)

@TomAugspurger
Contributor

That said, I don't think doing an RC any time in, say, the 20th - 27th is a good idea because of the holidays. So I'm also fine with doing one shortly after the New Years.

@jreback
Contributor Author

jreback commented Dec 13, 2018

that all sounds good ; RC first of the year;

@h-vetinari
Contributor

@jreback @TomAugspurger @jorisvandenbossche
Would you accept a PR for #22724 before the cutoff? I know you want to avoid scope creep and get this out of the door soon, but I'm coming from the consistency side of things, where this is a change I think could be beneficial to have sooner rather than later. Thought I'd ask before I invest the time.

Speaking of which - do you already have an idea what will be the policy for breaking changes between v0.24 and v0.25? Will they be blocked completely, or will master move to 1.0.0.dev immediately, with v0.25 using backports?

@h-vetinari
Contributor

@jreback @TomAugspurger @jorisvandenbossche

Speaking of which - do you already have an idea what will be the policy for breaking changes between v0.24 and v0.25? Will they be blocked completely, or will master move to 1.0.0.dev immediately, with v0.25 using backports?

Re-asking this, since - in case breaking PRs would be blocked until v.0.25 is released - I would suspend all work on breaking PRs.

@jreback
Contributor Author

jreback commented Jan 4, 2019

clearing the decks, pls don't tag for 0.24.0 unless an immediate merge. excluding cleanups still being done by @jbrockmendel and @TomAugspurger

ideally could do rc1 say next week, @TomAugspurger ?

@TomAugspurger
Contributor

TomAugspurger commented Jan 4, 2019 via email

@jreback
Contributor Author

jreback commented Jan 4, 2019

yep me too

@TomAugspurger
Contributor

Re-asking this, since - in case breaking PRs would be blocked until v.0.25 is released - I would suspend all work on breaking PRs.

My preference is that the only API-breaking change from 0.25 -> 1.0 is the removal of previously deprecated features. Then users can

  1. Ensure things run smoothly on 0.25.x
  2. Fix any FutureWarnings from pandas
  3. Confidently upgrade to 1.0

IIRC there was loose agreement to this at the last dev meeting.
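Step 2 of the list above can be sketched with the standard `warnings` module. This is a minimal illustration, not a pandas-provided tool: `old_api` is a hypothetical stand-in for any call that emits a `FutureWarning`, and both helper names are made up for this example.

```python
import warnings


def old_api():
    """Hypothetical deprecated function standing in for a deprecated call."""
    warnings.warn(
        "old_api is deprecated; use new_api instead", FutureWarning, stacklevel=2
    )
    return 42


def collect_future_warnings(func):
    """Run `func` and return the messages of all FutureWarnings it emitted."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")  # do not let earlier filters hide repeats
        func()
    return [str(w.message) for w in caught if issubclass(w.category, FutureWarning)]


def fails_when_escalated(func):
    """Return True if `func` raises once FutureWarning is turned into an error."""
    with warnings.catch_warnings():
        warnings.simplefilter("error", FutureWarning)
        try:
            func()
        except FutureWarning:
            return True
    return False
```

Escalating `FutureWarning` to an error in a test suite is one way to confirm that code is clean before moving to a release that removes the deprecated features.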

@jreback
Contributor Author

jreback commented Jan 10, 2019

ok the RC, yeah small things are ok. If we have big then maybe need RC2

@TomAugspurger
Contributor

Tagging now, and going through the local tests before pushing the tag. Ping me if you find a last-minute blocker.

@jorisvandenbossche
Member

@TomAugspurger you mentioned you were going to write a blog post for the release, but I suppose only for the final release?
In that case, it might be good to already have some highlights (I started drafting some); the whatsnew file can use some clean-up in general.

@TomAugspurger
Contributor

Any idea how long it'll take? I was just about to push the tag :)

Though, I can rebuild the docs that'll be pushed to the web server from a different commit.

@TomAugspurger
Contributor

You've got a bit more time, since I think I need to do some work on our conda-forge recipe to ensure numpy >= 1.12 :)

@jorisvandenbossche
Member

Yeah, don't wait for me. I have some other work to finish first.

@TomAugspurger
Contributor

TomAugspurger commented Jan 11, 2019 via email

@jorisvandenbossche
Member

I started going through the whatsnew file to extract highlights, and already thought of:

  • Refactor of the internal handling of custom data types:
    • Better integration of the ExtensionArray interface
    • Period and Interval can now be stored in Series / DataFrame columns (before only in Index)
  • New .array attribute on Series and Index to access the underlying values, and to_numpy method to convert to numpy arrays.
  • Optional integer NA
  • sparse changes

Is there any highlight to mention for all the datetime-like refactoring? (apart from "refactor of the internal handling of custom data types")

Any other new features or changes worth mentioning?
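A short sketch of how the highlights listed above look in user code, assuming pandas 0.24 or later and NumPy are installed. The variable names are illustrative only, and the sketch does not cover the sparse changes.

```python
import numpy as np
import pandas as pd

# Optional integer NA: the nullable "Int64" dtype keeps integers as integers
# even when a value is missing.
ints = pd.Series([1, 2, None], dtype="Int64")

# Period and Interval data stored directly in a Series, not only in an Index.
periods = pd.Series(pd.period_range("2018-01", periods=3, freq="M"))
intervals = pd.Series(pd.interval_range(start=0, end=3))

# .array exposes the underlying (extension) array ...
period_values = periods.array

# ... while to_numpy() converts to a NumPy ndarray.
plain = pd.Series([1, 2, 3]).to_numpy()
```

`.array` is the accessor to reach for when the pandas-native container is wanted, and `to_numpy()` when an actual `numpy.ndarray` is required.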

@TomAugspurger
Contributor

TomAugspurger commented Jan 11, 2019 via email

@TomAugspurger
Contributor

Binaries are building and HTML docs are up at http://pandas.pydata.org.

Will send out an announcement later today, once the binaries are done.

@TomAugspurger
Contributor

Mac and Linux wheels are on PyPI. Conda packages are trickling into conda-forge, and I sent out the announce email.

https://github.com/pandas-dev/pandas-release has a few things that need fixing up. Some are RC specific, so I'm not too worried about them. I'll try to iron out all the final issues while doing the final release, and then hopefully someone else can try things out, to see what machine-specific stuff I've accidentally encoded in there.

@stonebig
Contributor

stonebig commented Jan 12, 2019

no windows wheel for 0.24.0rc1 ?

@TomAugspurger
Contributor

I’m not sure if @cgohlke typically builds and uploads release candidates.

@cgohlke
Contributor

cgohlke commented Jan 12, 2019

I'm on it. The pypy build was failing...

@TomAugspurger
Contributor

Thanks @cgohlke, windows wheels are up on PyPI now.

@TomAugspurger
Contributor

We should probably do 0.24.0 this week. Any objections? Any blockers?

I don't know if I'll get #24674 done. Won't have too much time this week.

@jreback
Contributor Author

jreback commented Jan 20, 2019

no objections - like to get what is marked currently for 0.24.0 in but if in a couple of days they r not then ok to defer

@jreback
Contributor Author

jreback commented Jan 23, 2019

@TomAugspurger all issues & PR's are clean for 0.24.0

@TomAugspurger
Contributor

TomAugspurger commented Jan 23, 2019 via email

@TomAugspurger
Contributor

TomAugspurger commented Jan 24, 2019 via email

@TomAugspurger
Contributor

Down to

If people have quick thoughts on #24926 (removing IntervalArray from the top-level, just using pd.arrays.IntervalArray) then a quick +/-1 over there would be useful.
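For context, this is roughly what access through the `pd.arrays` namespace looks like, assuming pandas 0.24 or later is installed. The break points are arbitrary illustration values.

```python
import pandas as pd

# Build three right-closed intervals (0, 1], (1, 2], (2, 3] from break points,
# reaching the class through the pd.arrays namespace.
interval_values = pd.arrays.IntervalArray.from_breaks([0, 1, 2, 3])

# The array can be stored directly in a Series.
interval_series = pd.Series(interval_values)
```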

@jorisvandenbossche
Member

@TomAugspurger Before tagging, can you do a last setting of the date in the whatsnew docs? (now still January XX) Or in the release commit

@jorisvandenbossche
Member

All merged I think!

@TomAugspurger
Contributor

Thanks, tagging.

@jreback
Contributor Author

jreback commented Jan 25, 2019

nice @TomAugspurger

@zaneselvans

Wooo! Congratulations. Thank you for all the hard work. Really looking forward to the release.

@TomAugspurger
Contributor

sdist and binaries are up on PyPI and conda-forge. Anaconda is building for defaults now.

Thanks everyone.

@jorisvandenbossche
Member

And thank you!


9 participants