Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Cannot save array with undefined values #43

Open
DilumAluthge opened this issue Jun 6, 2019 · 8 comments
Open

Cannot save array with undefined values #43

DilumAluthge opened this issue Jun 6, 2019 · 8 comments

Comments

@DilumAluthge
Copy link
Member

DilumAluthge commented Jun 6, 2019

I am opening this issue so we can keep track of our progress in solving this bug.

Currently, BSON is unable to save an array that has one or more undefined values. Here is the minimum working example:

using BSON
x = Array{String, 1}(undef, 5)
x[1] = "a"
x[4] = "d"
println(x)
bson("test.bson", Dict(:x => x))

Expected behavior: BSON saves the data as requested.

Actual behavior: the following error is thrown:

ERROR: UndefRefError: access to undefined reference
Stacktrace:
 [1] lower at /Users/dilum/.julia/packages/BSON/XPZLD/src/extensions.jl:66 [inlined]
 [2] _lower_recursive(::Array{String,1}, ::IdDict{Any,Any}, ::Array{Any,1}) at /Users/dilum/.julia/packages/BSON/XPZLD/src/write.jl:62
 [3] (::getfield(BSON, Symbol("##7#11")){IdDict{Any,Any},Array{Any,1}})(::Array{String,1}) at /Users/dilum/.julia/packages/BSON/XPZLD/src/write.jl:62
 [4] applychildren!(::getfield(BSON, Symbol("##7#11")){IdDict{Any,Any},Array{Any,1}}, ::Dict{Symbol,Any}) at /Users/dilum/.julia/packages/BSON/XPZLD/src/BSON.jl:21
 [5] _lower_recursive(::Dict{Symbol,Array{String,1}}, ::IdDict{Any,Any}, ::Array{Any,1}) at /Users/dilum/.julia/packages/BSON/XPZLD/src/write.jl:62
 [6] lower_recursive(::Dict{Symbol,Array{String,1}}) at /Users/dilum/.julia/packages/BSON/XPZLD/src/write.jl:73
 [7] bson(::IOStream, ::Dict{Symbol,Array{String,1}}) at /Users/dilum/.julia/packages/BSON/XPZLD/src/write.jl:81
 [8] #14 at /Users/dilum/.julia/packages/BSON/XPZLD/src/write.jl:83 [inlined]
 [9] #open#310(::Base.Iterators.Pairs{Union{},Union{},Tuple{},NamedTuple{(),Tuple{}}}, ::Function, ::getfield(BSON, Symbol("##14#15")){Dict{Symbol,Array{String,1}}}, ::String, ::Vararg{String,N} where N) at ./iostream.jl:375
 [10] open at ./iostream.jl:373 [inlined]
 [11] bson(::String, ::Dict{Symbol,Array{String,1}}) at /Users/dilum/.julia/packages/BSON/XPZLD/src/write.jl:83
 [12] top-level scope at none:0
@Codyk12
Copy link

Codyk12 commented Jun 10, 2019

It seems the culprit is the ... operator. More specifically the code
Any[x...]
on line ([1] lower at /Users/.../.julia/packages/BSON/XPZLD/src/extensions.jl:66)

This is where it tries to unpack the values of the array into a new array, but this doesn't work on arrays with '#undef' in them.

This behavior works with a different type of array though:

using BSON
x = Array{Int, 1}(undef, 5)
x[1] = 1
x[4] =3
println(x)
bson("test.bson", Dict(:x => x))

BSON is able to save as expected because x gets initialized with zeros so the ... operator has something to unpack, where as Array{String, 1}(undef, 5) is initialized with #undef, so the ... operator breaks.

This seems like an even more basic issue and I see it as two solutions:

  • Change how UndefInitializer() acts with arrays of type String to generate empty string, or something concrete.
    OR
  • Change the functionality of the ... operator to be able to pass along undef somehow. (This option seems much more generalizable and the right thing to do.

@DilumAluthge
Copy link
Member Author

  • Change the functionality of the ... operator to be able to pass along undef somehow. (This option seems much more generalizable and the right thing to do.

@MikeInnes and I experimented with some code that bypassed the ... splatting step. We still ran into an issue writing and reading the #undef elements. So fixing the splatting won't solve this issue.

Change how UndefInitializer() acts with arrays of type String to generate empty string, or something concrete.

Certainly that would work, but would require a breaking change in the Julia language, and I'm not sure if there is support for that.

@Codyk12
Copy link

Codyk12 commented Jun 10, 2019

Does that mean someone just needs to find a way to read and write #undef elements?
Is that the main fix that needs to happen with this issue?

@DilumAluthge
Copy link
Member Author

Yep!

@Codyk12
Copy link

Codyk12 commented Jun 11, 2019

@DilumAluthge, So if this is fixed you and @MikeInnes can patch the splat operator issue? Because for this whole problem to be solved that will also need to be fixed.

Also have you worked on looking at the underlying issue of reading and writing #undef? I don't want to reinvent the wheel if you guys already have good leads.

@DilumAluthge
Copy link
Member Author

@DilumAluthge, So if this is fixed you and @MikeInnes can patch the splat operator issue? Because for this whole problem to be solved that will also need to be fixed.

Yep! Take a look here at an example of how we can replace the splat operator with a function collect_any. So that patch is already written.

Also have you worked on looking at the underlying issue of reading and writing #undef? I don't want to reinvent the wheel if you guys already have good leads.

You can take a look at Mike's work in #8. In PR #8, Mike wrote nothings in place of #undef. Unfortunately, this ends up giving you the wrong eltype. For example, you'd end up with an Array{Union{Nothing, String}} instead of the desired Array{Nothing}. So I don't think that writing nothings in place of #undef is a good solution.

I think the best solution is going to be to write a unique sentinel in place of the #undef values. The BSON specification actually reserves \x06 for "undefined" values, so I think we should use that. (Technically it's now deprecated, but as Mike pointed out in another thread, it's okay for us to deviate a little from the spec. We are not striving for 100% compatibility with other implementations of BSON.)

@Codyk12
Copy link

Codyk12 commented Jun 14, 2019

#47 Hopefully should take are of these issues. @MikeInnes

@AzamatB
Copy link

AzamatB commented Feb 1, 2020

Bump. Was bitten by this

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants