When collecting an iterable into a column-based table, it is not obvious what array type to use for the various elements. For example, a CategoricalValue from CategoricalArrays should clearly be collected in a CategoricalArray, which is optimized for that type; a WeakRefString from WeakRefStrings belongs in a StringArray (which is optimized to store those); a DataValue naturally belongs in a DataValueArray; and so on.
This makes it very hard to write code that would collect an iterable into its "optimized container" without depending on all the above packages (or using Requires like here), which in my view is a design that does not scale. I feel that this could be solved by adding a defaultarray(T, sz) = Array{T}(undef, sz) function in Base that the various packages (CategoricalArrays, WeakRefStrings, DataValueArrays) could then overload. In this way one could write a collect optimized for the element type without any dependency.
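A minimal sketch of how this could look, assuming the proposed `defaultarray` exists (it is not currently in Base, so it is defined locally here). `MyValue`, `MyValueArray`, and `collect_optimized` are hypothetical stand-ins for a package's element type, its optimized container, and a generic collector:

```julia
# Proposed fallback (hypothetical, not part of Base): a plain Array for any element type.
defaultarray(::Type{T}, sz) where {T} = Array{T}(undef, sz)

# Hypothetical package-side element type and its "optimized" container,
# which stores only the wrapped integers.
struct MyValue
    x::Int
end

struct MyValueArray <: AbstractVector{MyValue}
    data::Vector{Int}
end

Base.size(a::MyValueArray) = size(a.data)
Base.getindex(a::MyValueArray, i::Int) = MyValue(a.data[i])
Base.setindex!(a::MyValueArray, v::MyValue, i::Int) = (a.data[i] = v.x; a)

# The package overloads defaultarray for its own element type.
defaultarray(::Type{MyValue}, n::Integer) = MyValueArray(Vector{Int}(undef, n))

# Generic collector that needs no dependency on the package.
# Assumes the iterable has a known eltype and length.
function collect_optimized(itr)
    dest = defaultarray(eltype(itr), length(itr))
    for (i, x) in enumerate(itr)
        dest[i] = x
    end
    return dest
end

collect_optimized(1:3)                       # falls back to a Vector{Int}
collect_optimized([MyValue(1), MyValue(2)])  # picks MyValueArray via dispatch
```

The collector only ever calls `defaultarray`, so the choice of container is driven entirely by methods that each package adds for its own types.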
I'm unfamiliar with the broadcasting machinery, but if as I imagine there is something similar to defaultarray there to determine whether to collect things as Array or as BitArray, we could just expose that interface and allow packages to overload it for their custom type (so that broadcasting a function that returns a CategoricalValue would return a CategoricalArray).
Good idea. Currently broadcast relies on similar(::Broadcasted{DefaultArrayStyle{N}}, ::Type) for this, but that's really equivalent to using a custom function, since it's completely different from similar(::AbstractArray, ::Type).
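For reference, the existing behaviour can be observed with a short sketch like the one below. It only uses `Broadcast.broadcasted` and `similar` as they behave in Julia 1.x to the best of my knowledge; the variable names are illustrative:

```julia
a = [1, 2, 3]

# Broadcasting picks the container from the result element type:
# Bool results land in a BitArray, everything else in an Array.
bits = a .> 1   # BitVector
ints = a .+ 1   # Vector{Int}

# The same decision, made explicitly through the lazy Broadcasted object.
bc = Broadcast.broadcasted(>, a, 1)
dest_bool = similar(bc, Bool)      # BitVector
dest_f64  = similar(bc, Float64)   # Vector{Float64}
```

This is the hook where an element-type-based choice of container already happens, although only for `Bool`.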