Release 0.22 tracking #2484

bkamins · 2020-10-13T21:23:14Z

nalimilan · 2020-10-15T13:01:27Z

Just a note that we should decide whether we change the way missing values are handled by default. Not saying that we should, but we must be aware that if we release 0.22/1.0 without it then it won't happen until 2.0 (if at all).

bkamins · 2020-10-15T13:52:14Z

Just a note that we should decide whether we change the way missing values are handled by default.

My understanding was the conclusion was to add where as I described above and this is the only change we need. What additional changes do you see required? (in particular I understand that we concluded that adding a functionality that conditionally updates/creates a column only on a subset of rows in a data frame is something that we decided is not crucial to have)

bkamins · 2020-10-18T09:20:32Z

What additional changes do you see required?

@nalimilan + @pdeffebach : do you see any more changes required in 0.22 (except for the described above) in the area of handling missing values?

nalimilan · 2020-10-19T08:54:08Z

My understanding was the conclusion was to add where as I described above and this is the only change we need. What additional changes do you see required? (in particular I understand that we concluded that adding a functionality that conditionally updates/creates a column only on a subset of rows in a data frame is something that we decided is not crucial to have)

One potential issue is that having where skip missing values but other functions like combine, select and transform not skip them would be a bit inconsistent. And while we can add where without breaking anything, changing the behavior of other functions would be breaking. So at least we should decide whether that inconsistency would be a problem before 0.22. Probably better keep discussing this at #2314.

I'm not sure the conclusion regarding what we should do was so clear (you're referring to #2314, right?). One simple and consistent solution that wouldn't break anything would be to say that by default missing values should never be skipped in DataFrames (not even with where), but that they should be skipped by all functions in DataFramesMeta.

bkamins · 2020-10-22T11:00:02Z

OK - all required PRs for 0.22 release are now open

bkamins · 2020-10-24T20:56:48Z

We should also decide on #2499 before 0.22 release.

bkamins · 2020-11-07T21:32:19Z

Today we made a major step towards 0.22. I assume @nalimilan will now work on #2459 and I rewrite manual and docstrings using the new backend. After these two PRs are merged I will open a PR bumping DataFrames.jl to 0.22 to check again if all works as expected.

From this moment, except #2459 we do feature freeze for 0.22 release so testing is welcome!

bkamins · 2020-11-13T15:55:57Z

There is a proposal (which I like) that for 0.22 release all deprecated functionality should have force=true set (so that we make sure users see deprecation warnings). @deprecate does not support force=true, but I can make a substitute for this case. The question is if we agree with this approach.

@oxinabox if I recall correctly, you recommended to drop printing depwarns by default in Julia Base, so your opinion here would be welcome.

oxinabox · 2020-11-13T22:13:35Z

I don't love deprecation warnings being turned off.
It's not my preferred solution.
But the problem was that deprecation warnings can cause immense slow down.
It was effectively breaking for users of DataStructures.jl (LightGraphs and JuMP) because it was so much slower.
I actually ended up having to undeprecate it for them.

If you do force there is no way for the user to save themselves from it. Even if they know it isn't a problem.

They also are transitive to the user.
It makes sense to see the deprecation warnings for your direct dependencies.
It doesn't make sense for your indirect dependencies.
(I would really like to fix that using a LoggingExtras filter. Though I think Base needs to be changed to make sure to set the group on deprecation warnings so they can be identified. But it's not something I have had time to do yet).
A particularly bad case is any kind of long running production system with logs that people need to read.
Deprecation warnings are turned off for a production system, since it's a frozen Manifest in a docker container -- no chance of anything being updated by mistake.
And being flooded by deprecation warnings when you are trying to deal with a production crash is noone's idea of a good time.
(Not too hard to work around since such a logging system has its own filters hopefully. I know SumoLogic does)

Also: it's not a problem to use deprecated methods.
They still work.
We follow semver.
Users should check for deprecations before updating to a new breaking change.
Plus we have depwarn turn on in tests -- which is the place you go to check.

Also allows users to turn off deprecations for tests if they don't want to be spammed so they focus on their actual errors not on the deprecation spam.
Using force is going to be really confusing for folks at Invenia as we run two seperate nightly CI jobs for everything.
One with deprecations turned off, so we can focus on the package itself, and one with it set to error so we can know if it's to update anything.

bkamins · 2020-11-13T22:59:02Z

@oxinabox - this is a valid point (and much appreciated insight from folks using DataFrames.jl in production). I agree that the ideal solution would be to show deprecation for direct dependencies and not show them for indirect ones.

Hopefully we will ship out 1.0 release rather soon (3 to 6 months) after 0.22 release, and all deprecations will be pruned in 1.0.

Having thought of this more note that the original problem with push! was old (actually very old) bad design that we fixed some time ago already. The deprecations in 0.22 are:

DataFrame! (I do not think people are going to be confused - actually users were confused that it exists)
categorical and categorical! (rarely used in practice, will not lead to errors - we simply stop depending on CategoricalArrays.jl in 1.0 release)
bunch of DataFrame constructors (again - not a "bad design", but we will simply stop supporting some, so this should be clean)
convert method from Matrix to DataFrame (same as above)

So all in all:

for 0.22 release it is not a big deal to leave things as is given this evidence
in general a solution in Julia Base for more fine-grained deprecation handling would be nice, but it is an issue for Julia Base

nalimilan · 2020-11-14T14:35:54Z

Yes I agree it's not a big deal if people don't see the deprecations in 0.22. What we can do if we want is to keep some deprecations as plain errors in 1.0 to point users to the new syntax (that won't work for categorical though since we'd need to keep the dependency on CategoricalArrays).

bkamins · 2020-11-14T15:42:22Z

What we can do if we want is to keep some deprecations as plain errors in 1.0

This is what we do in 0.22 with some of the deprecations where it made sense. I think we will do it also for 1.0 release (actually one of the first PRs after 0.22 release will be probably removing the deprecations to have a clean master).

bkamins added the decision label Oct 13, 2020

bkamins added this to the 1.0 milestone Oct 13, 2020

This was referenced Oct 15, 2020

[BREAKING] Add PrettyTables.jl backend for printing DataFrames #2429

Merged

Style fixes #2488

Closed

bkamins mentioned this issue Oct 19, 2020

Skipping missing values more easily #2314

Open

bkamins pinned this issue Nov 1, 2020

bkamins closed this as completed Nov 14, 2020

bkamins unpinned this issue Nov 14, 2020

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Release 0.22 tracking #2484

Release 0.22 tracking #2484

bkamins commented Oct 13, 2020 •

edited

Loading

nalimilan commented Oct 15, 2020

bkamins commented Oct 15, 2020

bkamins commented Oct 18, 2020

nalimilan commented Oct 19, 2020

bkamins commented Oct 22, 2020

bkamins commented Oct 24, 2020

bkamins commented Nov 7, 2020

bkamins commented Nov 13, 2020

oxinabox commented Nov 13, 2020 •

edited

Loading

bkamins commented Nov 13, 2020

nalimilan commented Nov 14, 2020

bkamins commented Nov 14, 2020

Release 0.22 tracking #2484

Release 0.22 tracking #2484

Comments

bkamins commented Oct 13, 2020 • edited Loading

nalimilan commented Oct 15, 2020

bkamins commented Oct 15, 2020

bkamins commented Oct 18, 2020

nalimilan commented Oct 19, 2020

bkamins commented Oct 22, 2020

bkamins commented Oct 24, 2020

bkamins commented Nov 7, 2020

bkamins commented Nov 13, 2020

oxinabox commented Nov 13, 2020 • edited Loading

bkamins commented Nov 13, 2020

nalimilan commented Nov 14, 2020

bkamins commented Nov 14, 2020

bkamins commented Oct 13, 2020 •

edited

Loading

oxinabox commented Nov 13, 2020 •

edited

Loading