UndefInitializer -> cannot save as BSON / JLD #24

Closed
kafisatz opened this issue Aug 3, 2020 · 5 comments

Comments

kafisatz commented Aug 3, 2020

One column of my DataFrame has the type shown below.
It has four unique values (three strings and missing).
When I try to save the DataFrame as BSON or JLD, I get an UndefRefError.
Is there any way to keep working with SentinelArray and get rid of the UndefInitializer?
Or is there an option for CSV.read that avoids this?

I gather there is work underway to make undef work in BSON (JuliaIO/BSON.jl#43).
Then again, my data has no undef; only the type does.

this works for me (for the time being) :
dfraw.Instanz = convert(Array{Union{Missing,String},1},dfraw.Instanz);

df.Instanz
125845-element SentinelArrays.SentinelArray{String,1,UndefInitializer,Missing,Array{String,1}}:

#when saving to BSON or JLD file I get this error
ERROR: UndefRefError: access to undefined reference

I also mentioned this here
JuliaIO/BSON.jl#3 (comment)

kafisatz commented Aug 3, 2020

MWE

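using DataFrames
using CSV
using BSON

# Write a small table containing a missing string to a temporary CSV file.
df = DataFrame(a=[1,2], b=[missing,"mi"])
fi = mktemp()[1]
CSV.write(fi, df);

# Read it back without pooling; column b comes back as a SentinelArray.
dfr = CSV.read(fi, DataFrame, pool=false);
@assert isa(dfr[!, 2], CSV.SentinelArrays.SentinelArray{String,1,UndefInitializer,Missing,Array{String,1}})

# Reuse the temporary path for the BSON file; this call throws the UndefRefError.
isfile(fi) && rm(fi)
BSON.@save fi dfr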

quinnj commented Aug 15, 2020

By default, any "reference" type will use #undef as the sentinel value underneath, so SentinelArray([missing, "hey"]) will show that the array type is Union{Missing, String}, but it's actually using #undef under the hood to signal missing. Seems like you already found a work-around for the BSON.jl issue of not being able to handle #undef?
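The #undef mechanism described here can be seen with a plain Julia array, without SentinelArrays at all. The following is only a sketch: it assumes that a serializer reading the underlying array slot by slot would hit the same error, which has not been verified against the BSON or JLD source.

```julia
# A Vector{String} stores references, so freshly allocated slots are
# unassigned (#undef) until something is written to them.
v = Vector{String}(undef, 2)
v[2] = "hey"

# Slot 1 is still unassigned; a sentinel-based wrapper could report
# such a slot as `missing` instead of reading it.
@assert !isassigned(v, 1)
@assert isassigned(v, 2)

# Reading the unassigned slot directly throws UndefRefError,
# the same error type as in the report above.
err = try
    v[1]
catch e
    e
end
@assert err isa UndefRefError
```

This would also be consistent with the error depending on the column type rather than on whether any entry is actually missing, though that connection is an assumption.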

kafisatz commented

No. I do not have a workaround.
More interestingly, the error also happens when there are no missing entries. It seems to be related to the column type rather than to the presence of missing entries.

quinnj commented Aug 18, 2020

Yes, that is interesting.

I was confused when you said:

this works for me (for the time being) :
dfraw.Instanz = convert(Array{Union{Missing,String},1},dfraw.Instanz);

Does that not work as a workaround?

kafisatz commented Oct 1, 2020

Yes it does work.
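For a DataFrame with several affected columns, the same convert call could be applied to every column before saving. A minimal sketch, assuming a recent DataFrames version; `materialize_columns` is a hypothetical helper name, not part of DataFrames, CSV, or BSON:

```julia
using DataFrames

# Hypothetical helper: rebuild the DataFrame so that every column is a
# plain Vector with the same element type as the original column.
function materialize_columns(df::AbstractDataFrame)
    out = DataFrame()
    for name in names(df)
        col = df[!, name]
        # Same idea as convert(Array{Union{Missing,String},1}, col),
        # but with the element type taken from the column itself.
        out[!, name] = convert(Vector{eltype(col)}, col)
    end
    return out
end

df = DataFrame(a = [1, 2], b = [missing, "mi"])
plain = materialize_columns(df)
@assert plain.b isa Vector{Union{Missing,String}}
```

The result of `materialize_columns(dfr)` would then be passed to `BSON.@save` in place of `dfr`. Whether this avoids the error for every column type has not been tested here.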

kafisatz closed this as completed Oct 1, 2020