Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Table to IFeatureLayer #243

Open
wants to merge 45 commits into
base: master
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from 17 commits
Commits
Show all changes
45 commits
Select commit Hold shift + click to select a range
fd7eeec
1st draft of an IFeatureLayer constructor from table
mathieu17g Oct 12, 2021
163e9b8
Fixed IFeatureLayer(table) for tables not supporting view on row (e.g…
mathieu17g Oct 12, 2021
9e0e900
Test table to `IFeatureLayer` conversion with a`missing`, mixed geome…
mathieu17g Oct 12, 2021
91e3a10
Fixed handling of type Any in a column and modified error messages
mathieu17g Oct 12, 2021
a79ec71
Fixed conversion to OGRFieldType error handing
mathieu17g Oct 12, 2021
843155d
JuliaFormatter formating
mathieu17g Oct 12, 2021
d8849e4
Handling of `Tables.schema` returning `nothing` or `Schema{names, no…
mathieu17g Oct 12, 2021
a5090a0
Added basic conversion to layer documentation in "Tables interface" s…
mathieu17g Oct 12, 2021
848fe0e
Completed doc on conversion to layer with an example of writing to a …
mathieu17g Oct 12, 2021
0bd76de
Added in doc an example of writing to the GML more capable driver
mathieu17g Oct 12, 2021
2045d55
Fixed a typo in table docs
mathieu17g Oct 12, 2021
a6277af
Added docstring with example to IFeatureLayer(table)
mathieu17g Oct 12, 2021
5035a0e
Formatted with JuliaFormatter
mathieu17g Oct 12, 2021
541ff9b
Fixed a typo on name option for IFeatureLayer(table)
mathieu17g Oct 12, 2021
1b6ab33
Added `nothing` values handling (cf. PR #238):
mathieu17g Oct 12, 2021
7fa434c
`_fromtable` function code readability enhancements:
mathieu17g Oct 12, 2021
8f8b1bd
Fixed a typo in `_fromtable` column types conversion error message
mathieu17g Oct 12, 2021
0e12703
Corrections following @yeesian review
mathieu17g Oct 13, 2021
6454e09
Fixed arg type in `ctv_toWKT` signature ("Table to layer conversion" …
mathieu17g Oct 13, 2021
8f07154
Added GeoInterface geometries handling to table to layer conversion
mathieu17g Oct 16, 2021
a4bd378
Fixed typo in GeoInterface to IGeometry conversion and added tests on it
mathieu17g Oct 16, 2021
7166097
Added tests on GeoInterface geometries types to `OGRwkbGeometryType`
mathieu17g Oct 16, 2021
33c3a76
Formatting and cleaning
mathieu17g Oct 16, 2021
0b1b9d9
Added WKT and WKB parsing options in table to layer conversion
mathieu17g Oct 21, 2021
f8da022
Enhance test coverage by pruning handling of `Vector{String}` convers…
mathieu17g Oct 22, 2021
c0e4da6
Corrected a typo in `_fromtable`
mathieu17g Oct 22, 2021
77d561d
Extended test coverage by testing individually GeoInterface, WKT, WKB…
mathieu17g Oct 22, 2021
a33bc88
Merge branch 'master' into IFeatureLayer_from_table
mathieu17g Oct 28, 2021
b08919b
Folliowing @visr and @yeesian remarks:
mathieu17g Oct 28, 2021
d41cbc2
Fixed test_tables.jl to use `layer_name` kwarg instead of `name` to c…
mathieu17g Oct 28, 2021
1ff0196
Added geometry column specification kwarg `geom_cols`, without any te…
mathieu17g Oct 29, 2021
125a076
Added `fieldtypes` kwarg option to table to layer conversion, without…
mathieu17g Oct 30, 2021
9e90dc1
Added tests on `geomcols` kwarg and corrected issues revealed while a…
mathieu17g Oct 31, 2021
f57a867
A few issue fixes while adding test to `fieldtypes` kwarg option.
mathieu17g Nov 1, 2021
8911988
Added test on `fieldtypes` kwarg and fixed issues raised by those tests
mathieu17g Nov 1, 2021
fbb4618
Added error message test in `@test_throws` tests for `fieldtypes` and…
mathieu17g Nov 2, 2021
1d39170
Merge branch 'master' into IFeatureLayer_from_table
mathieu17g Nov 3, 2021
5cd1f16
Modified conversion from `GeoInterface.AbstracGeometry` to `IGeometry`
mathieu17g Nov 3, 2021
d08a47c
Exclude columns in `fieldtypes` from WKT/WKB parsing in `_infergeomet…
mathieu17g Nov 3, 2021
ad14efd
Typo in the previous commit `spfieldtypes` intead of `fieldtypes`
mathieu17g Nov 3, 2021
0c61b85
Deleted 2 comments in `_fromtable(sch::Tables.Schema{names,types}, ..…
mathieu17g Nov 3, 2021
60df39e
Dropped `fieldtypes` kwarg in src and tests
mathieu17g Nov 27, 2021
a197973
Merge branch 'master' into IFeatureLayer_from_table
mathieu17g Nov 27, 2021
6ae870c
Refixed `ogrerr` macro
mathieu17g Nov 27, 2021
e8560e4
Dropped `geomcols` kwarg and WKT/WKB parsing in src/ and test/
mathieu17g Nov 28, 2021
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
58 changes: 50 additions & 8 deletions docs/src/tables.md
Original file line number Diff line number Diff line change
@@ -1,28 +1,70 @@
# Tabular Interface

```@setup tables
using ArchGDAL, DataFrames
using ArchGDAL; AG = ArchGDAL
using DataFrames
```

ArchGDAL now brings in greater flexibilty in terms of vector data handling via the
[Tables.jl](https://github.com/JuliaData/Tables.jl) API. In general, tables are modelled based on feature layers and support multiple geometries per layer. Namely, the layer(s) of a dataset can be converted to DataFrame(s) to perform miscellaneous spatial operations.

## Conversion to table

Here is a quick example based on the
[`data/point.geojson`](https://github.com/yeesian/ArchGDALDatasets/blob/307f8f0e584a39a050c042849004e6a2bd674f99/data/point.geojson)
dataset:

```@example tables
dataset = ArchGDAL.read("data/point.geojson")

DataFrames.DataFrame(ArchGDAL.getlayer(dataset, 0))
```@repl tables
ds = AG.read("data/point.geojson")
DataFrame(AG.getlayer(ds, 0))
```

To illustrate multiple geometries, here is a second example based on the
[`data/multi_geom.csv`](https://github.com/yeesian/ArchGDALDatasets/blob/master/data/multi_geom.csv)
dataset:

```@example tables
dataset1 = ArchGDAL.read("data/multi_geom.csv", options = ["GEOM_POSSIBLE_NAMES=point,linestring", "KEEP_GEOM_COLUMNS=NO"])
```@repl tables
ds = AG.read("data/multi_geom.csv", options = ["GEOM_POSSIBLE_NAMES=point,linestring", "KEEP_GEOM_COLUMNS=NO"])
DataFrame(AG.getlayer(ds, 0))
```
## Conversion to layer
A table-like source implementing Tables.jl interface can be converted to a layer, provided that:
- Source must contains at least one geometry column
mathieu17g marked this conversation as resolved.
Show resolved Hide resolved
- Geometry columns are recognized by their element type being a subtype of `Union{IGeometry, Nothing, Missing}`
- Non geometry columns must contain types handled by GDAL/OGR (e.g. not `Int128` nor composite type)

**Note**: As geometries and fields are stored separately in GDAL features, the backward conversion of the layer won't have the same column ordering. Geometry columns will be the first columns.

DataFrames.DataFrame(ArchGDAL.getlayer(dataset1, 0))
```@repl tables
df = DataFrame([
:point => [AG.createpoint(30, 10), missing],
:mixedgeom => [AG.createpoint(5, 10), AG.createlinestring([(30.0, 10.0), (10.0, 30.0)])],
:id => ["5.1", "5.2"],
:zoom => [1.0, 2],
:location => [missing, "New Delhi"],
])
layer = AG.IFeatureLayer(df)
```

The layer, converted from a source implementing the Tables.jl interface, will be in a memory dataset.
mathieu17g marked this conversation as resolved.
Show resolved Hide resolved
Hence you can:
- Add other layers to it
- Copy it to a dataset with another driver
- Write it to a file
### Example of writing with ESRI Shapefile driver
```@repl tables
ds = AG.write(layer.ownedby, "test.shp", driver=AG.getdriver("ESRI Shapefile"))
DataFrame(AG.getlayer(AG.read("test.shp"), 0))
rm.(["test.shp", "test.shx", "test.dbf"]) # hide
```
As OGR ESRI Shapefile driver
- [does not support multi geometries](https://gdal.org/development/rfc/rfc41_multiple_geometry_fields.html#drivers), the second geometry has been dropped
- does not support nullable fields, the `missing` location has been replaced by `""`
mathieu17g marked this conversation as resolved.
Show resolved Hide resolved
### Example of writing with GML driver
Using the GML 3.2.1 more capable driver/format, you can write more information to the file
```@repl tables
ds = AG.write(layer.ownedby, "test.gml", driver=AG.getdriver("GML"), options=["FORMAT=GML3.2"])
DataFrame(AG.getlayer(AG.read("test.gml", options=["EXPOSE_GML_ID=NO"]), 0))
rm.(["test.gml", "test.xsd"]) # hide
```
**Note:** [OGR GML driver](https://gdal.org/drivers/vector/gml.html#open-options) option **EXPOSE\_GML\_ID=NO** avoids to read the `gml_id` field, mandatory in GML 3.x format and automatically created by the OGR GML driver
215 changes: 215 additions & 0 deletions src/tables.jl
Original file line number Diff line number Diff line change
Expand Up @@ -47,3 +47,218 @@ function schema_names(featuredefn::IFeatureDefnView)
)
return (geom_names, field_names, featuredefn, fielddefns)
end

"""
_convert_cleantype_to_AGtype(T)

Converts type `T` into either:
- a `OGRwkbGeometryType` or
- a tuple of `OGRFieldType` and `OGRFieldSubType`

"""
function _convert_cleantype_to_AGtype end
_convert_cleantype_to_AGtype(::Type{IGeometry}) = wkbUnknown
@generated _convert_cleantype_to_AGtype(::Type{IGeometry{U}}) where U = :($U)
@generated _convert_cleantype_to_AGtype(T::Type{U}) where U = :(convert(OGRFieldType, T), convert(OGRFieldSubType, T))


"""
_convert_coltype_to_cleantype(T)

Convert a table column type to a "clean" type:
- Unions are flattened
- Missing and Nothing are dropped
- Resulting mixed types are approximated by their tightest common supertype

"""
function _convert_coltype_to_cleantype(T::Type)
flattened_T = Base.uniontypes(T)
clean_flattened_T = filter(t -> t ∉ [Missing, Nothing], flattened_T)
return promote_type(clean_flattened_T...)
end

"""
_fromtable(sch, rows; name)

Converts a row table `rows` with schema `sch` to a layer (optionally named `name`) within a MEMORY dataset

"""
function _fromtable end

"""
_fromtable(sch::Tables.Schema{names,types}, rows; name::String = "")

Handles the case where names and types in `sch` are different from `nothing`

# Implementation
1. convert `rows`'s column types given in `sch` to either geometry types or field types and subtypes
2. split `rows`'s columns into geometry typed columns and field typed columns
3. create layer named `name` in a MEMORY dataset geomfields and fields types inferred from `rows`'s column types
4. populate layer with `rows` values

"""
function _fromtable(
mathieu17g marked this conversation as resolved.
Show resolved Hide resolved
sch::Tables.Schema{names,types},
rows;
name::String = "",
)::IFeatureLayer where {names,types}
# TODO maybe constrain `names`
strnames = string.(sch.names)

# Convert column types to either geometry types or field types and subtypes
AG_types = Vector{Union{OGRwkbGeometryType,Tuple{OGRFieldType,OGRFieldSubType}}}(undef, length(Tables.columnnames(rows)))
for (i, (coltype, colname)) in enumerate(zip(sch.types, strnames))
AG_types[i] = try
(_convert_cleantype_to_AGtype ∘ _convert_coltype_to_cleantype)(coltype)
mathieu17g marked this conversation as resolved.
Show resolved Hide resolved
catch e
if e isa MethodError
error("Cannot convert column \"$colname\" (type $coltype) to neither IGeometry{::OGRwkbGeometryType} or OGRFieldType and OGRFieldSubType",)
else
rethrow()
end
end
end
mathieu17g marked this conversation as resolved.
Show resolved Hide resolved

# Split names and types: between geometry type columns and field type columns
geomindices = isa.(AG_types, OGRwkbGeometryType)
!any(geomindices) && error("No column convertible to geometry")
geomtypes = AG_types[geomindices] # TODO consider to use a view
geomnames = strnames[geomindices]

fieldindices = isa.(AG_types, Tuple{OGRFieldType,OGRFieldSubType})
fieldtypes = AG_types[fieldindices] # TODO consider to use a view
fieldnames = strnames[fieldindices]

# Create layer
mathieu17g marked this conversation as resolved.
Show resolved Hide resolved
layer = createlayer(name = name, geom = first(geomtypes))
# TODO: create setname! for IGeomFieldDefnView. Probably needs first to fix issue #215
# TODO: "Model and handle relationships between GDAL objects systematically"
GDAL.ogr_gfld_setname(
getgeomdefn(layerdefn(layer), 0).ptr,
first(geomnames),
)

# Create FeatureDefn
if length(geomtypes) ≥ 2
mathieu17g marked this conversation as resolved.
Show resolved Hide resolved
for (j, geomtype) in enumerate(geomtypes[2:end])
creategeomdefn(geomnames[j+1], geomtype) do geomfielddefn
return addgeomdefn!(layer, geomfielddefn) # TODO check if necessary/interesting to set approx=true
end
end
mathieu17g marked this conversation as resolved.
Show resolved Hide resolved
end
for (j, (ft, fst)) in enumerate(fieldtypes)
createfielddefn(fieldnames[j], ft) do fielddefn
setsubtype!(fielddefn, fst)
return addfielddefn!(layer, fielddefn)
end
end
mathieu17g marked this conversation as resolved.
Show resolved Hide resolved

# Populate layer
for (i, row) in enumerate(rows)
mathieu17g marked this conversation as resolved.
Show resolved Hide resolved
rowvalues =
[Tables.getcolumn(row, col) for col in Tables.columnnames(row)]
rowgeoms = view(rowvalues, geomindices)
rowfields = view(rowvalues, fieldindices)
addfeature(layer) do feature
# For geometry fields both `missing` and `nothing` map to not geometry set
# since in GDAL <= v"3.3.2", special fields as geometry field cannot be NULL
# cf. `OGRFeature::IsFieldNull( int iField )` implemetation
for (j, val) in enumerate(rowgeoms)
val !== missing &&
val !== nothing &&
setgeom!(feature, j - 1, val)
mathieu17g marked this conversation as resolved.
Show resolved Hide resolved
end
for (j, val) in enumerate(rowfields)
if val === missing
setfieldnull!(feature, j - 1)
elseif val !== nothing
setfield!(feature, j - 1, val)
end
end
end
end

return layer
end

"""
_fromtable(::Tables.Schema{names,nothing}, rows; name::String = "")

Handles the case where types in schema is `nothing`

# Implementation
Tables.Schema types are extracted from `rows`'s columns element types before calling `_fromtable(Tables.Schema(names, types), rows; name = name)`

"""
function _fromtable(
::Tables.Schema{names,nothing},
rows;
name::String = "",
)::IFeatureLayer where {names}
cols = Tables.columns(rows)
types = (eltype(collect(col)) for col in cols)
return _fromtable(Tables.Schema(names, types), rows; name = name)
end

"""
_fromtable(::Tables.Schema{names,nothing}, rows; name::String = "")

Handles the case where schema is `nothing`

# Implementation
Tables.Schema names are extracted from `rows`'s columns names before calling `_fromtable(Tables.Schema(names, types), rows; name = name)`

"""
function _fromtable(::Nothing, rows; name::String = "")::IFeatureLayer
state = iterate(rows)
state === nothing && return IFeatureLayer()
row, _ = state
names = Tables.columnnames(row)
return _fromtable(Tables.Schema(names, nothing), rows; name = name)
end

"""
IFeatureLayer(table; name="")

Construct an IFeatureLayer from a source implementing Tables.jl interface

## Restrictions
- Source must contains at least one geometry column
- Geometry columns are recognized by their element type being a subtype of `Union{IGeometry, Nothing, Missing}`
- Non geometry columns must contain types handled by GDAL/OGR (e.g. not `Int128` nor composite type)

## Returns
An IFeatureLayer within a **MEMORY** driver dataset

## Examples
```jldoctest
julia> using ArchGDAL; AG = ArchGDAL
ArchGDAL

julia> nt = NamedTuple([
:point => [AG.createpoint(30, 10), missing],
:mixedgeom => [AG.createpoint(5, 10), AG.createlinestring([(30.0, 10.0), (10.0, 30.0)])],
:id => ["5.1", "5.2"],
:zoom => [1.0, 2],
:location => [missing, "New Delhi"],
])
(point = Union{Missing, ArchGDAL.IGeometry{ArchGDAL.wkbPoint}}[Geometry: POINT (30 10), missing], mixedgeom = ArchGDAL.IGeometry[Geometry: POINT (5 10), Geometry: LINESTRING (30 10,10 30)], id = ["5.1", "5.2"], zoom = [1.0, 2.0], location = Union{Missing, String}[missing, "New Delhi"])

julia> layer = AG.IFeatureLayer(nt; name="towns")
Layer: towns
Geometry 0 (point): [wkbPoint]
Geometry 1 (mixedgeom): [wkbUnknown]
Field 0 (id): [OFTString], 5.1, 5.2
Field 1 (zoom): [OFTReal], 1.0, 2.0
Field 2 (location): [OFTString], missing, New Delhi
```
"""
function IFeatureLayer(table; name::String = "")::IFeatureLayer
# Check tables interface's conformance
!Tables.istable(table) &&
throw(DomainError(table, "$table has not a Table interface"))
# Extract table data
rows = Tables.rows(table)
schema = Tables.schema(table)
return _fromtable(schema, rows; name = name)
end
Loading