[Breaking] Return `missing` if the field is set but null. #238
Fixes #177.

I think this aligns the behavior a lot better across the different file formats (see the test sets in test/test_tables.jl).

This behavior is implemented through `getfield(feature, i)`, which makes use of `getfieldtype(feature, i)`. That way, we are aligned in behavior for field subtypes, even for e.g. displaying the fields.

Because a lot hinges on distinguishing between whether a field is set versus whether it is null (see Toblerity/Fiona#460 (comment)), I have also added support for `isfieldnull()`, `isfieldsetandnotnull()`, and `setfieldnull!()`.
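To make the set/unset/null distinction concrete, here is a minimal, self-contained sketch of the three states and the read rule this PR describes. It is a toy model, not ArchGDAL's implementation: `FieldSlot`, `FieldState`, `readfield`, and the constructor helpers are hypothetical names introduced only for illustration, the predicates merely borrow the names mentioned above, and the unset-reads-as-`nothing` rule is my reading of the discussion rather than something verified against the source.

```julia
# Toy model (hypothetical): a field slot is either unset, set to NULL, or set to a value.
@enum FieldState UNSET NULL VALUE

struct FieldSlot
    state::FieldState
    value::Any
end

# Hypothetical constructors for the three states.
unsetfield() = FieldSlot(UNSET, nothing)
nullfield() = FieldSlot(NULL, nothing)
valuefield(v) = FieldSlot(VALUE, v)

# Predicates named after the ones discussed in this PR, redefined here for the toy type.
isfieldset(f::FieldSlot) = f.state != UNSET
isfieldnull(f::FieldSlot) = f.state == NULL
isfieldsetandnotnull(f::FieldSlot) = f.state == VALUE

# Read rule as described in this PR: set but null -> missing, unset -> nothing.
function readfield(f::FieldSlot)
    isfieldsetandnotnull(f) && return f.value
    return isfieldnull(f) ? missing : nothing
end
```

In this model `isfieldsetandnotnull` is just the conjunction of the other two checks, which matches its description later in the thread as a convenience function.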
…tries, mixed float/int Test with `nothing` skipped until PR yeesian#238 [Breaking] Return missing if the field is set but null. is merged
As mathieu has observed, `OGRUnsetMarker` and `OGRNullMarker` are mutually exclusive. We implement that case to return `nothing` if it ever comes up, but it is not possible for us to test the corresponding code path.
Because `getdefault()` is meant to be used only when setting fields for non-nullable columns with missing values, we make it return `nothing` instead of `missing` to unset the field.
I have made some suggestions on the src part and put on hold the test part's review, waiting for your feedback
isfieldsetandnotnull() is just a convenience function.
…fieldtype when reading the field (in the future, we will switch to dispatch on the fieldtype rather than the current dictionary approach, but that is out of the scope of this commit.)
Nice work here! I wasn't aware of unset vs null in GDAL before. I believe null is much more common in practice? I made this GeoJSONSeq file with both unset and null values:
If I run this through ogr with And indeed if I export it from QGIS I get nulls back in fields that were unset before:
Probably it makes more sense to keep the distinction here just like
@visr This is all about displaying a non-tabular (sparse) dataset like your geojson in a tabular interface. In this case, QGIS doesn't have a representation of unset, just NULL. Julia can be more expressive, although I wonder whether our Tables interface likes the mixing of
Yeah exactly. I can imagine it being less ideal in the Tables interface. Though these functions are also used outside of that, in the sparse setting, where it does make sense to keep the distinction. I think the interface can technically handle both just fine; it is just not very conventional. Though I don't know how common unset is in practice, and it's relatively easy to convert
@evetion and @visr, in the draft PR #243, I'm about to try to handle it for a conversion from a table source to an In the other way, since we rely on Tables.jl/src/fallbacks.jl
julia> T = Int64; @show T
T = promote_type(T, typeof(nothing)); @show T
T = promote_type(T, typeof(missing)); @show T
T = Int64
T = Union{Nothing, Int64}
T = Union{Missing, Nothing, Int64}
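For readers who prefer a single sentinel in tabular output, the conversion mentioned earlier in the thread can be sketched in plain Julia. This is only an illustration under assumptions: `nothing_to_missing` is a hypothetical helper name, not part of ArchGDAL or Tables.jl, and the column is built by hand rather than read from a layer.

```julia
# A column that mixes values, unset fields (nothing), and null fields (missing).
col = Union{Missing,Nothing,Int64}[1, nothing, missing, 4]

# Hypothetical helper: collapse `nothing` into `missing`, leaving other entries untouched.
nothing_to_missing(column) = [x === nothing ? missing : x for x in column]

normalized = nothing_to_missing(col)
```

After this step the column no longer contains `nothing`, at the cost of losing the unset/null distinction, which is the trade-off being debated here.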
@mathieu17g thanks for showing that indeed technically it works (#243 is a great addition by the way). I think for the tables interface specifically, the concerns about
Hence for tables I'm less enthusiastic about
@evetion @visr is this PR good to go from your perspective? I'm okay with iterating further on feedback or for you to have more time to review if you'd like; otherwise I'll merge it to keep things moving for Mathieu per #238 (comment)
Yes, and if we need to convert all
This should be easily done in
function Tables.rows(x::AbstractFeatureLayer)
    cols = Tables.columns(x)
    return Tables.RowIterator(cols, Int(Tables.rowcount(cols)))
end
I didn't get what you have in mind.
I'm fine with merging this. Left two comments.
@mathieu17g we can discuss copies and such in a different issue. I still wanted to create one, because I saw that converting a layer to DataFrame for a large vector file took minutes.
I'm interested in this too! Feel free to open an issue for it or have the discussion in #243
Thorough! :-)
I'll be merging this PR after #245 is merged if I don't see any further objections by then.
Seems like #245 is turning out to be more complicated than expected, and we're still figuring it out in JuliaGeo/GDAL.jl#124. In the meantime, there are no remaining issues to resolve here, so I'll merge this PR to unblock #243.
- no difference for geometry columns. Both `nothing` and `missing` values map to an UNSET geometry field (null pointer)
- field set to NULL for `missing` values and not set for `nothing` values
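The write-side rule for ordinary (non-geometry) fields in the list above can be sketched as follows. This is a toy model under stated assumptions, not the code from #243: the feature is a plain `Dict`, an absent key stands for an unset field, the symbol `:null` stands for a field set to NULL, and `writefield!` is a hypothetical name.

```julia
# Toy feature (hypothetical): field name => stored value; an absent key means "unset".
function writefield!(feature::Dict{String,Any}, name::String, v)
    if v === nothing
        delete!(feature, name)   # `nothing` leaves the field unset
    elseif v === missing
        feature[name] = :null    # `missing` sets the field to NULL
    else
        feature[name] = v        # any other value is stored as-is
    end
    return feature
end

feature = Dict{String,Any}()
writefield!(feature, "a", 1)
writefield!(feature, "b", missing)
writefield!(feature, "c", nothing)
```

Geometry columns are deliberately left out of this sketch, since the list above says both sentinels map to the same unset geometry field there.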