Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Breaking changes in DataFrames master #909

Closed
amellnik opened this issue Oct 5, 2016 · 8 comments
Closed

Breaking changes in DataFrames master #909

amellnik opened this issue Oct 5, 2016 · 8 comments

Comments

@amellnik
Copy link
Contributor

amellnik commented Oct 5, 2016

Opening this issue to track everything related to the adoption of Nullables in DataFrames (JuliaData/DataFrames.jl#1092). This breaks a lot of the DataFrame-related operations in Gadfly since it doesn't know how to handle Nullables:

using DataFrames, Gadfly
df = DataFrame(Fish =["Bob", "Amir"], Mass = [1.2, 3.4])
plot(df, x=:Fish, y=:Mass, Geom.point)
MethodError: no method matching isfinite(::Nullable{Float64})
Closest candidates are:
  isfinite(!Matched::Float16) at float16.jl:119
  isfinite(!Matched::BigFloat) at mpfr.jl:799
  isfinite(!Matched::DataArrays.NAtype) at C:\Users\amellnik\.julia\v0.5\DataArrays\src\predicates.jl:9
  ...

 in apply_statistic_typed(::Nullable{Float64}, ::Nullable{Float64}, ::Array{Nullable{Float64},1}, ::Array{Void,1}, ::Array{Void,1}) at C:\Users\amellnik\.julia\v0.5\Gadfly\src\statistics.jl:956
 in apply_statistic(::Gadfly.Stat.TickStatistic, ::Dict{Symbol,Gadfly.ScaleElement}, ::Gadfly.Coord.Cartesian, ::Gadfly.Aesthetics) at C:\Users\amellnik\.julia\v0.5\Gadfly\src\statistics.jl:810
 in apply_statistics(::Array{Gadfly.StatisticElement,1}, ::Dict{Symbol,Gadfly.ScaleElement}, ::Gadfly.Coord.Cartesian, ::Gadfly.Aesthetics) at C:\Users\amellnik\.julia\v0.5\Gadfly\src\statistics.jl:38
 in render_prepare(::Gadfly.Plot) at C:\Users\amellnik\.julia\v0.5\Gadfly\src\Gadfly.jl:746
 in render(::Gadfly.Plot) at C:\Users\amellnik\.julia\v0.5\Gadfly\src\Gadfly.jl:799
 in show at C:\Users\amellnik\.julia\v0.5\Gadfly\src\Gadfly.jl:965 [inlined]
 in limitstringmime(::MIME{Symbol("image/svg+xml")}, ::Gadfly.Plot) at C:\Users\amellnik\.julia\v0.5\IJulia\src\execute_request.jl:31
 in display_dict(::Gadfly.Plot) at C:\Users\amellnik\.julia\v0.5\IJulia\src\execute_request.jl:48
 in execute_request(::ZMQ.Socket, ::IJulia.Msg) at C:\Users\amellnik\.julia\v0.5\IJulia\src\execute_request.jl:200
 in eventloop(::ZMQ.Socket) at C:\Users\amellnik\.julia\v0.5\IJulia\src\eventloop.jl:8
 in (::IJulia.##9#15)() at .\task.jl:360
@nalimilan
Copy link
Contributor

Looks like you might want to fix this using the lifting strategy supported by StructuredQueries.jl (davidagold/StructuredQueries.jl#25).

@tlnagy
Copy link
Member

tlnagy commented Oct 5, 2016

Looks like you might want to fix this using the lifting strategy supported by StructuredQueries.jl (davidagold/StructuredQueries.jl#25).

@nalimilan Is this something we're going to have to fix on Gadfly's side?

@nalimilan
Copy link
Contributor

I'm afraid yes. I don't know how that code works, but basically it will have to accept and work with Nullable{T} in addition to T values. Hopefully the new framework will make this easier.

@tlnagy
Copy link
Member

tlnagy commented Oct 5, 2016

Which new framework? Ideally we wouldn't need to change much on Gadfly's side.

@nalimilan
Copy link
Contributor

nalimilan commented Oct 6, 2016

I meant StructuredQueries.jl. Though I've just had a look at the code from the backtrace, and you already have a special case for DataArray, so with pretty minimal changes it should work.

You should be able to replace that with a special case on AbstractArray{<:Nullable}, which would change vals.na[i] to isnull(vals[i]) and vals.data[i] with get(vals[i]). Then you wouldn't even need to depend on NullableArrays. If performance really matters in that code, though, you can replace these with NullableArray-specific code, i.e. isnull(vals, i) and values(vals, i).

@tlnagy
Copy link
Member

tlnagy commented Oct 6, 2016

The addition of NullableArrays as a dependency wouldn't be a big problem because DataFrames will depend on it anyway. This will be something that we will need to fix before the next version of Gadfly is released.

@ararslan
Copy link

In case you haven't seen, @tlnagy, DataFrames has now been made to work on 0.6 and will continue to be maintained as-is, i.e. based on DataArrays. The NullableArrays-based backend that previously occupied the DataFrames master branch has been moved to a separate package: DataTables. So you don't have to worry about these changes for now.

Eventually we will have a package that provides sufficient abstraction over a table-like interface that packages can code against the abstraction, then end users can simply plug in DataFrames or a similar implementation and it will "just work." Once that lands, I would encourage Gadfly to use that approach as well, allowing you to drop the DataFrames requirement altogether.

@amellnik
Copy link
Contributor Author

amellnik commented Jan 2, 2018

Closing as per Alex's comment -- the Nullable-based DataFrames was never released.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants