Releases: JuliaData/DataFrames.jl

v0.21.0

05 May 16:06
d9bd90a

DataFrames v0.21.0

Diff since v0.20.2

Closed issues:

  • Output format of reshape functions (#645)
  • Grouping API consistency and improvements (#1256)
  • by with arrays of inputs and broadcasting (#1615)
  • Group Indices function (#1704)
  • push! which promotes type (#1716)
  • API for groupwise column transformation (#1727)
  • aggregate function: Add option to NOT re-name column (#1756)
  • Two regex-related items on a wish list (#1849)
  • How about DimensionMismatch rather than ArgumentError? (#1879)
  • Handling of strings for column indexing (#1926)
  • How to perform by on two variables? Should we auto-splat? (#1935)
  • Row-wise vs. whole vector functions (#1952)
  • API for aggregate (#1953)
  • Unify push!, append! and vcat implementation. (#2032)
  • Add an easy way to get a number of rows in by (#2035)
  • Column naming in combine (#2071)
  • Implicit broadcasting rules (#2086)
  • Add "begin" tests for Julia 1.4 (#2089)
  • redesign of eachcol (#2090)
  • Reconsider overloading Base.join? (#2092)
  • Using the public API should be safe (#2094)
  • Speed up key lookup in GroupedDataFrame (#2095)
  • Add Tables.namedtupleiterator implementation (#2100)
  • Decide if we want to copy levels of CategoricalValue if we do Tables.allocatecolumn (#2104)
  • Add transform function (#2110)
  • Should we export Tables (#2114)
  • Optionally remove type from heading (#2116)
  • Disallow passing zero columns to aggregate functions in combine/by (#2118)
  • DataFrame constructor from Dict (#2119)
  • Add a wrapper type for passing named tuples to functions when transforming (#2121)
  • Constructor behavior on nested array vs array of tuple (#2124)
  • Unexpected BoundsError message (#2125)
  • map(DataFrame, groups) does not return a collection of DataFrames (#2126)
  • Bad deprecation warning for df.C = "c" when C is a new column (#2129)
  • Data Manipulation - Map categorical value (#2130)
  • ERROR: BoundsError: attempt to access String (#2134)
  • Circular reference in DataFrame bug (#2135)
  • Groupby + count append 0 if not exists (#2136)
  • Fix bounds in registry (#2137)
  • Pandas like MultiIndex function (#2138)
  • ⍰ character in header output (#2139)
  • review consistency of @view semantics (#2143)
  • Precompilation error with latest release on julia v1.4-rc2 (#2149)
  • add missing columns when push! ing? (#2150)
  • Performance of allunique (#2153)
  • Redesign of combine (#2156)
  • Emulating Stata's rowtotal (#2161)
  • Automatically fill in select and combine for scalars? (#2162)
  • [BREAKING] Making combine more flexible (#2166)
  • Sync API of append! with push! (#2173)
  • Add a method to insert a column to last index (#2175)
  • sort and sort! API (#2178)
  • push! with cols=:subset is not allowed if there is no missing in previous data (#2179)
  • unrelated error message when trying to access 0 index (#2182)
  • Base.depwarn stopped printing warnings on Julia 1.5 (#2184)
  • New name for names and rename! (#2185)
  • Shall mapcols be deprecated? (#2186)
  • Why does categorical!(df::DataFrame, ...) exist? (#2192)
  • Extend categoric values by category data (#2198)
  • Tag a new release to reflect [compat] CategoricalArrays = "0.8" update (#2204)
  • Cleaner syntax (#2206)
  • by does not generate correct results (#2208)
  • possible test failure in upcoming Julia version 1.5 (#2221)

Merged pull requests:

v0.20.2

13 Feb 13:11
v0.20.2
67701cb

v0.20.2 (2020-02-13)

Diff since v0.20.1

Closed issues:

  • pkg> add DataFrames Tables installs DataFrames v0.19.4 and Tables v1.0.0 (#2112)
  • Sync with Tables.jl 1.0 release (#2109)

v0.20.1

13 Feb 09:45
v0.20.1

Add compatibility with Tables.jl v1.0.

v0.20.0

07 Dec 19:19
v0.20.0
025824f

v0.20.0 (2019-12-07)

Diff since v0.19.4

Closed issues:

  • Make describe not accept io (#2024)
  • Switch cols to kwarg from positional args (#2023)
  • Problems sorting dataFrames imported from CSV (#2019)
  • Allow rename!(df, pair::Pair{String, String}) as a signature (#2017)
  • Add an argument allowing to select columns to calculate statistics on for describe (#2014)
  • Add flatten function (#2012)
  • describe should also apply to Vector (#2010)
  • Add :equal support in append! (#2007)
  • CSV.read cannot detect "Time" type string (#2005)
  • When vcat dataframes, ordering of categorical variables is lost (#2002)
  • Allow mix of Symbol and Pair in join (#2001)
  • Documenting the difference between df[!, :col] and df[:, :col] (#1999)
  • select!(df, Not(tuple)) does not work (#1997)
  • using DataFrames in Jupyter lab and notebook hangs... (#1996)
  • [package code fancyness] Redundant code snippet (#1993)
  • Merge meanings of cols keyword arguments between push! and vcat (#1991)
  • master still on v0.19.3, though release branch already on v0.19.4 (#1990)
  • Bad performance of "by" function for random queries (#1988)
  • Warning T is deprecated, use nonmissingtype instead (#1987)
  • ERROR: ArgumentError: 'Array{UInt8,1}' iterates 'UInt8' values, which don't satisfy the Tables.jl Row-iterator interface (#1983)
  • vcat! or push!(..., columns=:union) (#1982)
  • Drop AppVeyor in favour of TravisCI (#1980)
  • 32-bit BoundsError (#1978)
  • allow by to receive keyword argument for custom output column name. (#1976)
  • dropmissing! fails on PooledStrings (#1973)
  • Fix tests to pass on Julia nightly (#1967)
  • can't join more than two dataframes? (#1962)
  • Issues using the df[!, col] syntax during broadcasts (#1959)
  • API for functions that help reduce memory usage (#1954)
  • NamedTuple backing or switchable? (#1949)
  • Sorting error using examples from docs (#1945)
  • merge names! into rename! (#1943)
  • Allow partial re-ordering for permutecols! (#1942)
  • sort! performance (#1927)
  • Add kwarg to disallowmissing that skips conversion of columns with missing values (#1922)
  • Sync the behavior of push!, vcat and append! in DataFrames.jl with Base (#1904)
  • How about raising ArgumentError rather than just calling error() in append!()? (#1869)
  • How about raising ArgumentError rather than just calling error() in insertcols!()? (#1867)
  • Improve select and select! performance with Not (#1861)
  • Make getproperty(df, col) return a full length view of the column (#1844)
  • Allow empty keys argument in by() (#1837)
  • Find a better API for stackdf and meltdf (#1736)
  • DataFrames.jl roadmap (#1678)
  • setindex!/broadcast! design (#1645)
  • Update docstrings to new conventions (#1093)

Merged pull requests:

v0.19.4

08 Sep 07:51
v0.19.4
8e79a5c

Make DataFrames.jl depend on Missings.jl version 0.4.2.
Stop using Missings.T internally and use nonmissingtype instead.

v0.19.3

26 Aug 09:06
v0.19.3
a9b83b5
  • fixed a bug in the deprecation code for the case when setindex! was passed a 1-row DataFrame as the right-hand side (note, though, that this syntax is not recommended)
  • DataFrameRows and DataFrameColumns now support getproperty and propertynames and have custom printing that shows them similarly to data frames
  • categorical and categorical! now accept a Type as the cols argument, so that the user can flexibly decide which columns are converted to categorical based on their type
  • the columnindex function from Tables.jl is now exported
  • All and Between can now be used for indexing columns of a data frame
  • passing a Tuple as the on keyword argument is deprecated (use a Pair instead)
  • minor documentation and build system improvements
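
The column-selection items above can be illustrated with a minimal sketch. It assumes a reasonably recent DataFrames.jl release, so the exact behavior in v0.19.3 may differ; the data frame and its column names are made up for illustration.

```julia
using DataFrames

# Hypothetical example data; the column names are illustrative only.
df = DataFrame(a = 1:3, b = 4:6, c = 7:9, d = 10:12)

# Between selects a contiguous block of columns, inclusive on both ends.
middle = df[:, Between(:b, :c)]

# columnindex (defined in Tables.jl, exported by DataFrames.jl)
# returns the position of a column in the data frame.
idx = columnindex(df, :c)

println(ncol(middle))
println(idx)
```

Here `middle` holds only columns b and c, and `idx` is the position of column c in `df`.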

v0.19.2

23 Aug 16:53
v0.19.2
d3412e9

New features

  • added disallowmissing, allowmissing and categorical functions
  • unstack now accepts renamecols keyword argument
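
As a rough sketch of the allowmissing and disallowmissing functions listed above (assuming a recent DataFrames.jl release; the example data is hypothetical):

```julia
using DataFrames

# Hypothetical example data.
df = DataFrame(a = [1, 2, 3], b = ["x", "y", "z"])

# allowmissing returns a new data frame whose columns may hold `missing`.
df_missing = allowmissing(df)

# disallowmissing converts back to the narrower element types.
# It throws an error if a column actually contains a missing value.
df_strict = disallowmissing(df_missing)

println(eltype(df_missing.a))
println(eltype(df_strict.a))
```

Neither call mutates `df`; each returns a new data frame with converted columns.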

Minor changes

  • documentation has been updated to reflect new indexing rules
  • setindex! deprecation warnings were improved and now take into account new rules of broadcasting into 0-row data frames
  • :eltype column in describe now contains the true element type of a data frame column (previously, if the type was a union with Missing, the Missing part was stripped, which sometimes led to user confusion)
  • broadcasting over GroupedDataFrame is now disallowed (it was never intended to work; in the future this might be allowed, but the target design is not decided upon yet)
  • append! now throws ArgumentError instead of ErrorException when the column names of the appended data frame do not match the target

Bug fixes

  • fixed a typo in append! error message
  • fixed a bug in the categorical! function when a Colon was passed as the column selector (the behavior was inconsistent with the documentation); now only categorical!(::DataFrame) restricts the conversion to columns whose eltype is <:Union{AbstractString, Missing}; any valid categorical!(::DataFrame, cols) call converts all columns selected by cols to categorical

v0.19.1

26 Jul 13:31
v0.19.1
ed25099

v0.19.1 (2019-07-25)

Changes summary

  • correctly handle broadcasting into a single cell of a data frame; now df[row, col] .= v broadcasts into the object held in df[row, col] cell
  • we now allow broadcasting into an empty data frame (a data frame df for which isempty(df) is true); in particular, we allow column creation in an empty DataFrame, in which case we always create a 0-row vector
  • push! and append! now make sure that the result of the operation did not corrupt the data frame (which mostly happens when there are column aliasing issues) and throw an error if it happens (this introduces a small overhead but greatly reduces the number of possible bugs in user code)
  • join, groupby and show-related functions now check if the data frames passed to them are consistent (have the same number of rows in each column and do not have corrupted index)
  • improved loading time by replacing StatsBase.jl by DataAPI.jl dependency
  • fixed documentation generation issues
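
The two broadcasting changes above can be sketched as follows. This is an illustration under the assumption of a recent DataFrames.jl release, with made-up data; the empty case shown here is a data frame with one column and zero rows.

```julia
using DataFrames

# Hypothetical data frame whose :v column stores vectors.
df = DataFrame(id = [1, 2], v = [[1, 2], [3, 4]])

# Broadcasting into a single cell writes into the object held in that cell,
# so the vector stored in row 1 is mutated in place.
df[1, :v] .= 0

# A data frame with zero rows counts as empty; broadcasting a scalar into
# a new column creates a column with zero rows.
empty_df = DataFrame(a = Int[])
empty_df[!, :b] .= 1

println(df.v[1])
println(size(empty_df))
```

After these calls the vector in the first row of `v` contains only zeros, and `empty_df` has two columns and no rows.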

Diff since v0.19.0

Closed issues:

  • Explanation of the deprecation of df[col] and df[cols] (#1897)
  • Basics docs are broken on master (#1891)
  • Preventing problems with aliased columns (#1885)
  • dropna (#1884)
  • Release 0.19.0 (#1883)
  • describe StatsBase vs DataAPI (#1882)
  • Issue when converting Excelfile with missing data to DataFrame (#1878)
  • select and deletecols for SubDataFrame and DataFrameRow (#1825)
  • Surprising setindex/setproperty behaviour (#1815)
  • Convenience methods for getproperty and setproperty in DataFrames with new ownership rules (#1753)
  • pairs outputs warnings (#1751)
  • Add checks of DataFrame consistency before expensive operations (#1744)
  • DataFrames should be indexable by CartesianIndex{2} (#1610)
  • allow vcat to widen columns (#1574)
  • Creating an empty DataFrame is unwieldy and has unexpected behavior (#1569)
  • Make a DataFrame not iterable (#1513)
  • broadcasted setindex not working as expected (#1507)
  • Writing to latex: omitting column with row index (#1381)
  • functional interface for deleting columns (#1378)
  • Implement a from_records constructor (#1191)
  • Markdown display (#1167)
  • implement view(::DataFrame, ...) to support broadcasted assignment (#1019)
  • JSON to dataframe input and output (#873)
  • Implement Base.cor for DataFrame (#583)
  • Remove nrow/ncol (#406)

Merged pull requests:

  • Improve documentation generation (#1892) (bkamins)
  • allow scalar broadcasting into an empty data frame (#1890) (bkamins)
  • First proposal of consistency checks (#1887) (bkamins)
  • Extend describe from DataAPI to allow removing StatsBase dependency (#1818) (quinnj)

v0.19.0

15 Jul 14:34
v0.19.0
b0d8a87

API changes:

  • allow Regex indexing of columns
  • allow Not from InvertedIndices.jl indexing of rows and columns
  • add ! indexing of rows of AbstractDataFrame
  • deprecate indexing with column or columns only (like df[:a] or df[1:2])
  • define target rules for getindex, getproperty, setindex!, and setproperty! for AbstractDataFrame and DataFrameRow (in this release the old behavior is deprecated; in the next release it will be replaced by the target functionality)
  • add indexing using CartesianIndex{2} for AbstractDataFrame
  • full support of broadcasting for AbstractDataFrame
  • support for broadcasting assignment for DataFrameRow
  • keys(::DataFrameRow) now returns a Tuple of column names
  • added get and map methods for DataFrameRow
  • categorical! now accepts columns that contain missing values
  • get and haskey for AbstractDataFrame are now deprecated
  • empty! for DataFrame is now deprecated
  • add hasproperty for AbstractDataFrame
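
Several of the indexing forms listed above can be sketched in a few lines. The example assumes a recent DataFrames.jl release, where the target indexing rules are in effect; the data and column names are hypothetical.

```julia
using DataFrames

# Hypothetical example data.
df = DataFrame(x1 = 1:3, x2 = 4:6, y = 7:9)

# Regex indexing: all columns whose names match the pattern.
xs = df[:, r"^x"]

# Not (from InvertedIndices.jl, re-exported by DataFrames.jl): all columns except :y.
no_y = df[:, Not(:y)]

# `!` as the row index returns the stored column without copying,
# while `:` returns a copy.
same_column = df[!, :x1] === df.x1
is_copy = df[:, :x1] !== df.x1

# CartesianIndex{2} addresses a single cell as (row, column).
cell = df[CartesianIndex(2, 3)]

println(ncol(xs), " ", ncol(no_y), " ", same_column, " ", is_copy, " ", cell)
println(hasproperty(df, :x1))
```

The difference between `df[!, :x1]` and `df[:, :x1]` matters when mutating: changes made through the first are visible in `df`, while changes to the second are not.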

Fixes:

  • improved showing of DataFrameRow with zero columns
  • fix combine with aggregation when skipmissing=true

Minor changes:

  • improvements in error messages and types of thrown exceptions on error
  • various documentation improvements
  • improved getindex speed for vector of Bool indexing
  • remove InteractiveUtils.jl dependency

v0.18.4

03 Jul 08:44
v0.18.4
0c1e4a6

Changes since last release:

  • Fix combine with aggregation when skipmissing=true
  • Remove InteractiveUtils load