Releases: JuliaData/DataFrames.jl

v0.22.1

20 Nov 20:23
8386d0a

DataFrames v0.22.1

Diff since v0.22.0

Closed issues:

  • eltype width taken into account in display even if it is not shown (#2540)
  • Final ellipsis appears on next row (#2544)
  • clarify the interface for crossjoin when makeunique=true (#2545)
  • Two small typos in docs (#2550)

Merged pull requests:

v0.22.0

15 Nov 08:23
ff9577e

DataFrames v0.22.0

Diff since v0.21.8

DataFrames v0.22 Release Notes

Breaking changes

  • the rules for transformations passed to select/select!, transform/transform!,
    and combine have been made more flexible; in particular now it is allowed to
    return multiple columns from a transformation function
    (#2461 and
    #2481)
  • CategoricalArrays.jl is no longer reexported: call using CategoricalArrays
    to use it (#2404).
    In the same vein, the categorical and categorical! functions
    have been deprecated in favor of
    transform(df, cols .=> categorical .=> cols) and similar syntaxes
    (#2394).
    stack now creates a PooledVector{String} variable column rather than
    a CategoricalVector{String} column by default;
    pass variable_eltype=CategoricalValue{String} to get the previous behavior
    (#2391)
  • isless for DataFrameRows now checks column names
    (#2292)
  • DataFrameColumns is no longer a subtype of AbstractVector
    (#2291)
  • nunique is no longer reported by describe by default
    (#2339)
  • stop reordering columns of the parent in transform and transform!;
    always generate columns that were specified to be computed even for
    GroupedDataFrame with zero rows
    (#2324)
  • improve the rule for automatically generated column names in
    combine/select(!)/transform(!) with composed functions
    (#2274)
  • :nmissing in describe now produces 0 if the column does not allow
    missing values; earlier nothing was produced in this case
    (#2360)
  • fast aggregation functions for GroupedDataFrame now correctly
    choose the fast path only when it is safe; this resolves inconsistencies
    with what the same functions not using fast path produce
    (#2357)
  • joins now return PooledVector not CategoricalVector in indicator column
    (#2505)
  • GroupKeys now supports in for GroupKey, Tuple, NamedTuple and dictionaries
    (#2392)
  • in describe the specification of custom aggregation is now function => name;
    old name => function order is now deprecated
    (#2401)
  • in joins, passing NaN or real or imaginary -0.0 in the on column now throws an
    error; passing missing throws an error unless the matchmissing=:equal keyword
    argument is passed (#2504)
  • unstack now produces row and column keys in the order of their first appearance
    and has two new keyword arguments allowmissing and allowduplicates
    (#2494)
  • PrettyTables.jl is now the
    default back-end to print DataFrames to text/plain; the print option
    splitcols was removed and the output format was changed
    (#2429)
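
A few of the breaking changes above can be sketched as follows. This is a minimal illustration, not taken from the release notes: the data frame df, its columns g and x, and the output names lo, hi, and n_distinct are made up for the example.

```julia
using DataFrames
using CategoricalArrays  # must now be loaded explicitly

# Hypothetical example data
df = DataFrame(g = ["a", "b", "a"], x = [1, 2, 3])

# One transformation returning two columns (lo and hi) per group
ranges = combine(groupby(df, :g),
                 :x => (x -> (lo = minimum(x), hi = maximum(x))) => AsTable)

# Replacement for the deprecated categorical(df, :g)
df_cat = transform(df, :g => categorical => :g)

# Custom aggregations in describe are now written as function => name
stats = describe(df, :mean, (x -> length(unique(x))) => :n_distinct)
```

The AsTable destination makes the multi-column result explicit; the other two calls mirror the replacement syntaxes named in the list.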

New functionalities

  • add filter to GroupedDataFrame (#2279)
  • add empty and empty! functions for DataFrame that remove all rows from it,
    but keep columns (#2262)
  • make indicator keyword argument in joins allow passing a string
    (#2284,
    #2296)
  • add new functions to GroupKey API to make it more consistent with DataFrameRow
    (#2308)
  • allow column renaming in joins
    (#2313 and
    #2398)
  • add rownumber to DataFrameRow (#2356)
  • allow passing a column name to specify the position where a new column should be
    inserted in insertcols! (#2365)
  • allow GroupedDataFrames to be indexed using a dictionary, which can use Symbol or string keys and
    does not depend on the order of keys (#2281)
  • add isapprox method to check for approximate equality between two dataframes
    (#2373)
  • add columnindex for DataFrameRow
    (#2380)
  • names now accepts Type as a column selector
    (#2400)
  • select, select!, transform, transform! and combine now allow renamecols
    keyword argument that makes it possible to avoid adding transformation function name
    as a suffix in automatically generated column names
    (#2397)
  • filter, sort, dropmissing, and unique now support a view keyword argument
    which, if set to true, makes them return a SubDataFrame view into the passed
    data frame.
  • add only method for AbstractDataFrame (#2449)
  • passing empty sets of columns in filter/filter! and in select/transform/combine
    with ByRow is now accepted (#2476)
  • add permutedims method for AbstractDataFrame (#2447)
  • add support for Cols from DataAPI.jl (#2495)
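
Some of the additions above might be used as in the sketch below. The data frame df and its columns id, x, and s are hypothetical and exist only to give the calls something to work on.

```julia
using DataFrames

# Hypothetical example data
df = DataFrame(id = [1, 2, 3, 4],
               x = [10.0, 20.0, 30.0, 40.0],
               s = ["a", "b", "c", "d"])

# view=true returns a SubDataFrame into df instead of a copy
sdf = filter(:x => v -> v > 15, df, view = true)

# renamecols=false keeps the name x rather than generating x_sum
total = combine(df, :x => sum, renamecols = false)

# A Type used as a column selector picks columns by element type
numeric = names(df, Number)

# Cols combines several selectors into one
picked = select(df, Cols(:s, r"^i"))
```

Because sdf is a view, it shares storage with df, so it is cheap to create but reflects later changes to the parent.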

Deprecated

  • DataFrame! is now deprecated (#2338)
  • several non-standard DataFrame constructors are now deprecated
    (#2464)
  • all old deprecations now throw an error
    (#2350)
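
As a rough sketch of migrating away from DataFrame!, assuming the copycols=false keyword of the DataFrame constructor is the intended replacement (the notes above do not say so explicitly), a non-copying construction could look like this. The vector v and column name a are hypothetical.

```julia
using DataFrames

# Hypothetical source vector
v = [1, 2, 3]

# Previously: DataFrame!(a = v)
# Assumption: copycols=false asks the constructor to reuse v rather than copy it
df = DataFrame(a = v, copycols = false)

# The column and the source vector are the same object
shares_memory = df.a === v
```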

Dependency changes

  • Tables.jl version 1.2 is now required.
  • DataAPI.jl version 1.4 is now required. It implies that All(args...) is
    deprecated and Cols(args...) is recommended instead. All() is still supported.

Other relevant changes

  • Documentation is now also available in dark mode
    (#2315)
  • add rich display support for Markdown cell entries in HTML and LaTeX
    (#2346)
  • limit the maximal display width the output can use in text/plain before
    being truncated (in the textwidth sense, excluding ) to 32 per column
    by default and fix a corner case when no columns are printed in situations when
    they are too wide (#2403)
  • Common methods are now precompiled to improve responsiveness the first time a method
    is called in a Julia session. Precompilation takes up to 30 seconds
    after installing the package
    (#2456).

Closed issues:

  • Allow to hide row numbers (#592)
  • Stop printing row numbers in show(io, df)? (#864)
  • Show a (kind of) transposed DataFrame (#2065)
  • Improve text/plain show for AbstractDataFrame (#2146)
  • Showing of very wide data frames (#2302)
  • Add PrettyTables.jl as an alternative backend for display in DataFrames.jl (#2337)
  • add transpose(df, src_namescol, dst_namescol) (#2420)
  • Deprecate DataFrame(::AbstractMatrix) (#2433)
  • Always use ? for Union{T, Missing} (#2480)
  • Stop supporting broadcasting + against whole DataFrames (#2483)
  • clean-up unstack (#2485)
  • Join on index with compatible Unitful types (#2486)
  • ERROR: UndefVarError: ByRow not defined (#2493)
  • Explicitly handling missingness in join columns (#2499)
  • sort with by accepts tuples still (#2500)
  • innerjoin not working if one df is a SubDataFrame or item of GroupedDataFrame (#2502)
  • remaining dependencies on CategoricalArrays (#2506)
  • Immutable DataFrames (#2507)
  • general principles of data manipulation for discussion (#2509)
  • create maprow to be complementary with mapcol (#2510)
  • insertcols!(df, values => :name ) (#2512)
  • [Feature request] Support for converting single-column dataframes to Vectors (#2526)
  • Sync tests with Tables 1.2 (#2529)
  • select does not have method to handle Pair? (#2531)
  • Warning: getindex(df::DataFrame, col_ind::ColumnIndex) is deprecated (#2532)
  • ERROR: The following package names could not be resolved: (#2534)

Merged pull requests:

v0.21.8

12 Oct 17:14

DataFrames v0.21.8

Fix a bug in select/select!/transform/transform! that occurred when processing a GroupedDataFrame containing reordered groups.

v0.21.7

25 Aug 00:29

DataFrames v0.21.7

Merged pull requests:

v0.21.6

05 Aug 20:10

DataFrames v0.21.6

Diff since v0.21.5

Merged pull requests:

v0.21.5

27 Jul 11:08

DataFrames v0.21.5

Diff since v0.21.4

Closed issues:

  • How to save dataframe to CSV file (#2312)
  • Single row manipulation leads to wrong results (#2332)

Merged pull requests:

v0.21.4

30 Jun 15:06

DataFrames v0.21.4

Diff since v0.21.3

Closed issues:

  • findfirst/findlast/nextind/prevind not working for eachcol in v0.21.0 (#2229)
  • What happened to by()? (#2306)
  • Error in showing DataFrame with Distribution column (#2310)

Merged pull requests:

v0.21.3

23 Jun 22:06

DataFrames v0.21.3

Diff since v0.21.2

Closed issues:

  • When join(..., validate=(true,true)) fails, it should include a list of non-unique joinkeyrows in the error (#1732)
  • Unify error messages for setting index of subdataframe (#2277)
  • indicator column in joins should allow Strings (#2283)
  • no method matching iterate(::InvertedIndex{BitArray{1}}), trying to any() a BitArray (#2285)
  • Fatal error: ERROR: UndefVarError: identifier not defined (#2286)
  • show all columns at HTML in Jupyter notebook (#2293)
  • unable to touch doc website (#2295)
  • select/transform: old_column => fun => new_column_name syntax (#2301)

Merged pull requests:

v0.21.2

01 Jun 15:06

DataFrames v0.21.2

Diff since v0.21.1

Closed issues:

  • diag(::DataFrame)? (#2268)

Merged pull requests:

v0.21.1

22 May 13:04

DataFrames v0.21.1

Diff since v0.21.0

Closed issues:

  • Standardizing working with multiple columns (#2016)
  • In docs, note subsets are copies (unless of columns)? (#2224)
  • first/last/etc. documentation problem (#2232)
  • Make DataFrame's BoundsError message more informative and similar to that of Base.Matrix (#2234)
  • Problems in groupreduce_init (#2241)
  • Tables.columns should return a DataFrameColumns object (#2244)
  • map on DataFrameColumns should return DataFrameColumns (#2245)
  • rename(uppercase, df) doesn't work anymore (#2252)
  • update docs at pkg.julialang.org (#2255)
  • allow [:a,:b,:c,:d] => fun => new_column_name syntax (#2256)
  • ENV["COLUMNS"] not working as expected in Jupyter Lab (#2266)

Merged pull requests: