Implement a Tables interface for features #118

Merged
merged 70 commits into from
Sep 14, 2020
Changes from 69 commits
a8bda52
Initial Commit(setup)
Sov-trotter Mar 29, 2020
08c3e55
Add Tables dependency
Sov-trotter Mar 30, 2020
e7b0edc
Setup GeoTable, reading and parsing files[broken]
Sov-trotter Mar 30, 2020
5fbda99
Create function for returning a NamedTuple
Sov-trotter Mar 31, 2020
fee1404
Remove extra dependency
Sov-trotter Apr 1, 2020
0448998
Implement Tables interface
Sov-trotter Apr 3, 2020
8f33687
add struct GeoTable
Sov-trotter Apr 6, 2020
59a3f29
work towards a row based Tables interface
visr Apr 12, 2020
fb98c7b
Reset local fork and local repo made changes
Sov-trotter Apr 20, 2020
3746305
Import Tables instead of DataStreams
Sov-trotter Apr 20, 2020
31cad0b
Add Base.iterate test
Sov-trotter Apr 21, 2020
67b4336
Modify tests
Sov-trotter Apr 21, 2020
3f937a1
Merge branch 'tables' into dev-tables
Sov-trotter Aug 10, 2020
82ee8a0
Resolve conflict
Sov-trotter Aug 10, 2020
1484024
Merge pull request #2 from yeesian/master
Sov-trotter Aug 10, 2020
86f588b
Minor Refeactor
Sov-trotter Aug 10, 2020
4f5d9c5
Clean up
Sov-trotter Aug 10, 2020
e6b0e9e
Update test
Sov-trotter Aug 10, 2020
23cc9f1
Schema test for 1.5
Sov-trotter Aug 10, 2020
cf68ff1
Make schema test more concrete
Sov-trotter Aug 11, 2020
78fc8fa
Refactor test dependencies
Sov-trotter Aug 11, 2020
5c35700
Tables dependency. :(
Sov-trotter Aug 11, 2020
3101904
Improve iterate method
Sov-trotter Aug 11, 2020
649c941
Work out extra methods; clean up stuff
Sov-trotter Aug 11, 2020
0c9f982
Pass through the geotable function first
Sov-trotter Aug 11, 2020
ade232f
Minor Refactor
Sov-trotter Aug 12, 2020
1a0537c
Take Geometry Out of the table
Sov-trotter Aug 12, 2020
7774256
(no branch):
Sov-trotter Aug 15, 2020
0730dce
Accept null geometries
Sov-trotter Aug 15, 2020
4888ffe
Merge branch 'master' into dev-tables
Sov-trotter Aug 15, 2020
4bf1f0d
Support mutiple geomtries; rename to geotable to Table
Sov-trotter Aug 16, 2020
385d80e
Add tests
Sov-trotter Aug 16, 2020
edfd3e7
Merge pull request #4 from yeesian/master
Sov-trotter Aug 17, 2020
45196aa
Add docs for tables
Sov-trotter Aug 17, 2020
86a2882
CSV doc
Sov-trotter Aug 17, 2020
3ccdf96
Add sha details
Sov-trotter Aug 17, 2020
7f11a47
Merge pull request #5 from yeesian/master
Sov-trotter Aug 17, 2020
5b297d2
Merge pull request #6 from yeesian/master
Sov-trotter Aug 18, 2020
81975ee
Merge pull request #7 from Sov-trotter/dev-tables
Sov-trotter Aug 18, 2020
9b16b07
Merge branch 'docs-tables' into dev-tables
Sov-trotter Aug 18, 2020
fbdef49
Pass through the Table function
Sov-trotter Aug 18, 2020
c1b7ac4
nit's
Sov-trotter Aug 19, 2020
e9949d4
Refactor Base.iterate
Sov-trotter Aug 22, 2020
91f6489
Remove AG; refactor methods
Sov-trotter Aug 22, 2020
b9d0d6b
inline ngeom
Sov-trotter Aug 22, 2020
a4bf268
Minor refactor
Sov-trotter Aug 23, 2020
803691b
Typos fix
Sov-trotter Aug 23, 2020
1655a18
remove DataStreams from deps
visr Aug 23, 2020
a0450f7
wrap long lines in docs
visr Aug 23, 2020
2156aec
using instead of import, to force explicit override
visr Aug 23, 2020
2aa7499
extend getlayer for Table
visr Aug 23, 2020
c196b3a
support passing field name as symbol
visr Aug 23, 2020
d933f09
allow only creating tables from layers
visr Aug 23, 2020
e84491d
various other fixups
visr Aug 23, 2020
90e148f
Refactor schema to include geometry names/types
Sov-trotter Aug 23, 2020
924df47
Add getindex method
Sov-trotter Aug 23, 2020
cee6cec
Update docs
Sov-trotter Aug 23, 2020
4bd8c5a
Add/update tests
Sov-trotter Aug 23, 2020
bcec31e
Fix scope in tests
Sov-trotter Aug 23, 2020
d245b12
Clean up methods
Sov-trotter Aug 23, 2020
513fe65
Refaactors; pretty code :)
Sov-trotter Aug 23, 2020
fadaa78
nit's
Sov-trotter Aug 26, 2020
07f210c
Fix path in tests
Sov-trotter Aug 26, 2020
eff39c9
Add nextnamedtuple to docs
Sov-trotter Aug 26, 2020
7677ae7
typo fix
Sov-trotter Aug 27, 2020
87eb4f7
Test for null/missing
Sov-trotter Aug 31, 2020
d5c0ed3
Refactor tests
Sov-trotter Sep 1, 2020
be2df2a
nit's
Sov-trotter Sep 1, 2020
3247789
1-based indexing
Sov-trotter Sep 13, 2020
84380a5
32bit fix
Sov-trotter Sep 14, 2020
4 changes: 2 additions & 2 deletions Project.toml
@@ -6,19 +6,19 @@ desc = "A high level API for GDAL - Geospatial Data Abstraction Library"
version = "0.5.0"

[deps]
-DataStreams = "9a8bc11e-79be-5b39-94d7-1ccc349a1a85"
Dates = "ade2ca70-3891-5945-98fb-dc099432e06a"
DiskArrays = "3c3547ce-8d99-4f5e-a174-61eb10b00ae3"
GDAL = "add2ef01-049f-52c4-9ee2-e494f65e021a"
GeoFormatTypes = "68eda718-8dee-11e9-39e7-89f7f65f511f"
GeoInterface = "cf35fbd7-0cd7-5166-be24-54bfbe79505f"
+Tables = "bd369af6-aec1-5ad0-b16a-f7cc5008161c"

[compat]
-DataStreams = "0.4.2"
DiskArrays = "0.2.4"
GDAL = "1.1.3"
GeoFormatTypes = "0.3"
GeoInterface = "0.4, 0.5"
+Tables = "1"
julia = "1.3"

[extras]
1 change: 1 addition & 0 deletions docs/Project.toml
@@ -1,5 +1,6 @@
[deps]
ArchGDAL = "c9ce4bd3-c3d5-55b8-8973-c0e20141b8c3"
+DataFrames = "a93c6f00-e57d-5684-b7b6-d8193f3e46c0"
DiskArrays = "3c3547ce-8d99-4f5e-a174-61eb10b00ae3"
Documenter = "e30172f5-a6a5-5a46-863b-614d45cd2de4"

1 change: 1 addition & 0 deletions docs/make.jl
@@ -16,6 +16,7 @@ makedocs(
"GDAL Datasets" => "datasets.md",
"Feature Data" => "features.md",
"Raster Data" => "rasters.md",
+"Tables Interface" => "tables.md",
"Geometric Operations" => "geometries.md",
"Spatial Projections" => "projections.md",
# "Working with Spatialite" => "spatialite.md",
19 changes: 19 additions & 0 deletions docs/src/datasets.md
@@ -101,3 +101,22 @@ dataset = ArchGDAL.<copy/create/read/update>(...)
!!! note

This pattern of using `do`-blocks to manage context plays a big part in the way we handle memory in this package. For details, see the section on Memory Management.

The [`ArchGDAL.read`](@ref) method accepts keyword arguments (`kwargs`), such as the GDAL [open options](https://gdal.org/drivers/vector/csv.html#open-options) for reading `.csv` spatial datasets.

Example: in a CSV file, all data is stored as `String`.

```@example datasets
dataset1 = ArchGDAL.read("data/multi_geom.csv")
layer1 = ArchGDAL.getlayer(dataset1, 0)
```

Note that the CSV driver reads our point and linestring geometries as `String`. If a `.csvt` file of the same name declares the geometry columns as `WKT`, the types will be recognized. Otherwise, GDAL offers open options, passed as `kwargs`, to adjust how the file is read.

For the CSV above, we want the driver to detect our geometries, so following the [open options](https://gdal.org/drivers/vector/csv.html#open-options) we pass `"GEOM_POSSIBLE_NAMES=point,linestring"`. We also do not want the geometry columns kept as regular `String` columns, so we add `"KEEP_GEOM_COLUMNS=NO"` too.

```@example datasets
dataset2 = ArchGDAL.read("data/multi_geom.csv", options = ["GEOM_POSSIBLE_NAMES=point,linestring", "KEEP_GEOM_COLUMNS=NO"])

layer2 = ArchGDAL.getlayer(dataset2, 0)
```
6 changes: 6 additions & 0 deletions docs/src/reference.md
@@ -20,6 +20,12 @@ Pages = ["dataset.jl"]
Modules = [ArchGDAL]
Pages = ["feature.jl", "featuredefn.jl", "featurelayer.jl", "fielddefn.jl", "geometry.jl", "styletable.jl", "context.jl"]
```
## [Tables Interface](@id API-Tables-Interface)

```@autodocs
Modules = [ArchGDAL]
Pages = ["tables.jl"]
```

## [Raster Data](@id API-Raster-Data)

72 changes: 72 additions & 0 deletions docs/src/tables.md
@@ -0,0 +1,72 @@
# Tabular Interface

```@setup tables
using ArchGDAL
using DataFrames
```

ArchGDAL now brings greater flexibility to feature (vector) data handling via the
[Tables.jl](https://github.com/JuliaData/Tables.jl) API, which aims to provide a fast and
responsive tabular interface to data.

In this section, we revisit the
[`data/point.geojson`](https://github.com/yeesian/ArchGDALDatasets/blob/307f8f0e584a39a050c042849004e6a2bd674f99/data/point.geojson)
dataset.

```@example tables
dataset = ArchGDAL.read("data/point.geojson")
```

Each layer can be represented as a separate Table.

```@example tables
layer = ArchGDAL.getlayer(dataset, 0)
```

The [`ArchGDAL.Table`](@ref) constructor accepts an `ArchGDAL.FeatureLayer`.
```@example tables
table = ArchGDAL.Table(layer)
```

Individual rows can be retrieved using the `Base.getindex(t::ArchGDAL.Table, idx::Int)` method or simply `table[idx]`.

```@example tables
row = table[1]
```

Layers are retrievable!
One can get back the layer that a Table was built from.
```@example tables
lyr = ArchGDAL.getlayer(table)
```

The Tables interface also supports multiple geometries per layer.

Here, we visit the
[`data/multi_geom.csv`](https://github.com/yeesian/ArchGDALDatasets/blob/master/data/multi_geom.csv)
dataset.

```@example tables
dataset1 = ArchGDAL.read("data/multi_geom.csv", options = ["GEOM_POSSIBLE_NAMES=point,linestring", "KEEP_GEOM_COLUMNS=NO"])

layer = ArchGDAL.getlayer(dataset1, 0)
table = ArchGDAL.Table(layer)
```

Extracting a row from the table, we see that the row/feature is made up of two geometries,
namely `point` and `linestring`.
```@example tables
row = table[1]
```

Finally, tables can be converted to DataFrames to perform miscellaneous spatial operations.
```@example tables
df = DataFrame(table)
```
In some cases `nextfeature` can be tedious to use, and the `ArchGDAL.nextnamedtuple()` method comes in handy. It is built upon `nextfeature`, and calling it yields the next feature as a `NamedTuple`. You may need to call `ArchGDAL.resetreading!(layer)` first to reset the layer reading to the start.

```@example tables
ArchGDAL.resetreading!(layer)
feat1 = ArchGDAL.nextnamedtuple(layer)
feat2 = ArchGDAL.nextnamedtuple(layer)
```
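
The row shape that `nextnamedtuple` yields can be illustrated without GDAL. The sketch below is only a stand-in under stated assumptions: the field names, geometry names, and values are made up for illustration (WKT strings stand in for geometry objects), and only the `NamedTuple{names}(values)` construction mirrors what `src/tables.jl` in this PR does.

```julia
# Hypothetical field and geometry names; in ArchGDAL these come from the layer definition.
field_names = (Symbol(name) for name in ["id", "zoom"])
geom_names = (Symbol(name) for name in ["point", "linestring"])

# Hypothetical values for one feature; WKT strings stand in for geometry objects.
props = (5.1, 1.0)
geoms = ("POINT (30 10)", "LINESTRING (30 10, 10 30, 40 40)")

# Splat names and values into one NamedTuple: fields first, then geometries.
row = NamedTuple{(field_names..., geom_names...)}((props..., geoms...))

println(keys(row))   # all column names, fields followed by geometries
println(row.point)   # columns are accessed by name
```

Because fields and geometries share one flat namespace in the row, a field and a geometry column with the same name would collide, so distinct names are assumed here.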
11 changes: 5 additions & 6 deletions src/ArchGDAL.jl
@@ -1,11 +1,10 @@
module ArchGDAL

-import GDAL, GeoInterface, GeoFormatTypes
-import DataStreams: Data
-import GeoInterface: coordinates, geotype
-import Base: convert

using Dates
+using GDAL: GDAL
+using GeoFormatTypes: GeoFormatTypes
+using GeoInterface: GeoInterface
+using Tables: Tables

const GFT = GeoFormatTypes

@@ -30,7 +29,7 @@ module ArchGDAL
include("context.jl")
include("base/iterators.jl")
include("base/display.jl")
-include("datastreams.jl")
+include("tables.jl")
include("geointerface.jl")
include("convert.jl")

4 changes: 3 additions & 1 deletion src/dataset.jl
@@ -460,8 +460,10 @@ unsafe_getlayer(dataset::AbstractDataset, i::Integer) =

"""
getlayer(dataset::AbstractDataset, name::AbstractString)
+getlayer(table::Table)

-Fetch the feature layer corresponding to the given name.
+Fetch the feature layer corresponding to the given name. If it is called on a Table, which
+supports only one layer, a name is not needed.

The returned layer remains owned by the GDALDataset and should not be deleted by
the application.
55 changes: 0 additions & 55 deletions src/datastreams.jl

This file was deleted.

6 changes: 3 additions & 3 deletions src/ogr/feature.jl
@@ -95,7 +95,7 @@ getfielddefn(feature::Feature, i::Integer) =
IFieldDefnView(GDAL.ogr_f_getfielddefnref(feature.ptr, i))

"""
-findfieldindex(feature::Feature, name::AbstractString)
+findfieldindex(feature::Feature, name::Union{AbstractString, Symbol})

Fetch the field index given field name.

@@ -109,7 +109,7 @@ the field index, or -1 if no matching field is found.
### Remarks
This is a cover for the `OGRFeatureDefn::GetFieldIndex()` method.
"""
-findfieldindex(feature::Feature, name::AbstractString) =
+findfieldindex(feature::Feature, name::Union{AbstractString, Symbol}) =
GDAL.ogr_f_getfieldindex(feature.ptr, name)

"""
@@ -381,7 +381,7 @@ function getfield(feature::Feature, i::Integer)
end
end

-getfield(feature::Feature, name::AbstractString) =
+getfield(feature::Feature, name::Union{AbstractString, Symbol}) =
getfield(feature, findfieldindex(feature, name))

"""
4 changes: 2 additions & 2 deletions src/ogr/featuredefn.jl
@@ -91,7 +91,7 @@ getfielddefn(featuredefn::IFeatureDefnView, i::Integer) =
IFieldDefnView(GDAL.ogr_fd_getfielddefn(featuredefn.ptr, i))

"""
-findfieldindex(featuredefn::AbstractFeatureDefn, name::AbstractString)
+findfieldindex(featuredefn::AbstractFeatureDefn, name::Union{AbstractString, Symbol})

Find field by name.

@@ -101,7 +101,7 @@ the field index, or -1 if no match found.
### Remarks
This uses the OGRFeatureDefn::GetFieldIndex() method.
"""
-findfieldindex(featuredefn::AbstractFeatureDefn, name::AbstractString) =
+findfieldindex(featuredefn::AbstractFeatureDefn, name::Union{AbstractString, Symbol}) =
GDAL.ogr_fd_getfieldindex(featuredefn.ptr, name)

"""
4 changes: 2 additions & 2 deletions src/ogr/featurelayer.jl
@@ -524,7 +524,7 @@ layerdefn(layer::AbstractFeatureLayer) =
IFeatureDefnView(GDAL.ogr_l_getlayerdefn(layer.ptr))

"""
-findfieldindex(layer::AbstractFeatureLayer, field::AbstractString, exactmatch::Bool)
+findfieldindex(layer::AbstractFeatureLayer, field::Union{AbstractString, Symbol}, exactmatch::Bool)

Find the index of the field in a layer, or -1 if the field doesn't exist.

@@ -534,7 +534,7 @@ the layer was created (eg. like `LAUNDER` in the OCI driver).
"""
function findfieldindex(
layer::AbstractFeatureLayer,
-field::AbstractString,
+field::Union{AbstractString, Symbol},
exactmatch::Bool
)
return GDAL.ogr_l_findfieldindex(layer.ptr, field, exactmatch)
73 changes: 73 additions & 0 deletions src/tables.jl
@@ -0,0 +1,73 @@
"""
Constructs `Table` out of `FeatureLayer`, where every row is a `Feature` consisting of Geometry and attributes.
```
ArchGDAL.Table(T::Union{IFeatureLayer, FeatureLayer})
```
"""
struct Table{T<:Union{IFeatureLayer, FeatureLayer}}
    layer::T
end

getlayer(t::Table) = Base.getfield(t, :layer)

function Tables.schema(layer::AbstractFeatureLayer)
    field_names, geom_names, featuredefn, fielddefns = schema_names(layer)
    ngeom = ArchGDAL.ngeom(featuredefn)
    geomdefns = (ArchGDAL.getgeomdefn(featuredefn, i) for i in 0:ngeom-1)
    field_types = (_FIELDTYPE[gettype(fielddefn)] for fielddefn in fielddefns)
    geom_types = (IGeometry for i in 1:ngeom)
    Tables.Schema((field_names..., geom_names...), (field_types..., geom_types...))
end

Tables.istable(::Type{<:Table}) = true
Tables.rowaccess(::Type{<:Table}) = true
Tables.rows(t::Table) = t

function Base.iterate(t::Table, st = 0)
    layer = getlayer(t)
    st >= nfeature(layer) && return nothing
    if iszero(st)
        resetreading!(layer)
    end
    return nextnamedtuple(layer), st + 1
end

function Base.getindex(t::Table, idx::Int)
    layer = getlayer(t)
    setnextbyindex!(layer, idx-1)
    return nextnamedtuple(layer)
end

Base.IteratorSize(::Type{<:Table}) = Base.HasLength()
Base.size(t::Table) = nfeature(getlayer(t))
Base.length(t::Table) = size(t)
Base.IteratorEltype(::Type{<:Table}) = Base.HasEltype()
Base.propertynames(t::Table) = Tables.schema(getlayer(t)).names
Base.getproperty(t::Table, s::Symbol) = [getproperty(row, s) for row in t]

"""
Returns the feature row of a layer as a `NamedTuple`.

Calling it repeatedly works like `nextfeature`, i.e. it returns consecutive features as `NamedTuple`s.
"""
function nextnamedtuple(layer::IFeatureLayer)
    field_names, geom_names = schema_names(layer)
    return nextfeature(layer) do feature
        prop = (getfield(feature, name) for name in field_names)
        geom = (getgeom(feature, idx-1) for idx in 1:length(geom_names))
        NamedTuple{(field_names..., geom_names...)}((prop..., geom...))
    end
end

function schema_names(layer::AbstractFeatureLayer)
    featuredefn = layerdefn(layer)
    fielddefns = (getfielddefn(featuredefn, i) for i in 0:nfield(layer)-1)
    field_names = (Symbol(getname(fielddefn)) for fielddefn in fielddefns)
    geom_names = (Symbol(getname(getgeomdefn(featuredefn, i-1))) for i in 1:ngeom(layer))
    return (field_names, geom_names, featuredefn, fielddefns)
end

function Base.show(io::IO, t::Table)
    println(io, "Table with $(nfeature(getlayer(t))) features")
end
Base.show(io::IO, ::MIME"text/plain", t::Table) = show(io, t)
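
The iteration and indexing pattern in `src/tables.jl` (a 1-based `table[idx]` on top of a 0-based, cursor-driven layer, with the cursor reset whenever iteration starts) can be sketched in plain Julia. Everything below is a toy: `ToyLayer` and `ToyTable` are hypothetical types, and the helper functions only borrow the names `nfeature`, `resetreading!`, `setnextbyindex!`, and `nextnamedtuple`; they are not the ArchGDAL implementations.

```julia
# Hypothetical stand-in for a GDAL layer: a read cursor over 0-indexed features.
mutable struct ToyLayer
    features::Vector
    cursor::Int   # 0-based position of the next feature to read
end
ToyLayer(features) = ToyLayer(features, 0)

# Toy re-implementations that only mimic the names used in the PR.
nfeature(l::ToyLayer) = length(l.features)
resetreading!(l::ToyLayer) = (l.cursor = 0; l)
setnextbyindex!(l::ToyLayer, i::Integer) = (l.cursor = i; l)

function nextnamedtuple(l::ToyLayer)
    row = l.features[l.cursor + 1]   # translate the 0-based cursor to a 1-based vector index
    l.cursor += 1
    return row
end

struct ToyTable
    layer::ToyLayer
end

# The iteration state counts rows already returned; the cursor is reset on the first call.
function Base.iterate(t::ToyTable, st = 0)
    st >= nfeature(t.layer) && return nothing
    iszero(st) && resetreading!(t.layer)
    return nextnamedtuple(t.layer), st + 1
end

Base.length(t::ToyTable) = nfeature(t.layer)
Base.eltype(::Type{ToyTable}) = NamedTuple

# 1-based indexing for the user, 0-based positioning for the layer.
function Base.getindex(t::ToyTable, idx::Int)
    setnextbyindex!(t.layer, idx - 1)
    return nextnamedtuple(t.layer)
end

table = ToyTable(ToyLayer([(id = 1, name = "a"), (id = 2, name = "b")]))
ids = [row.id for row in table]
second = table[2]
println(ids)
println(second)
```

Since `getindex` moves the shared cursor, resetting it at the start of `iterate` is what keeps a later loop from starting mid-layer. Interleaving indexing with an in-progress loop would still disturb the cursor in this sketch.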
11 changes: 11 additions & 0 deletions test/remotefiles.jl
@@ -10,7 +10,18 @@ const testdatadir = @__DIR__
REPO_URL = "https://github.com/yeesian/ArchGDALDatasets/blob/master/"

# remote files with SHA-2 256 hash
"""
To add more files, follow the below steps to generate the SHA
```
julia> using SHA
julia> open("filepath/filename") do f
bytes2hex(sha256(f))
end
```
"""
remotefiles = [
("data/multi_geom.csv", "00520017658b66ff21e40cbf553672fa8e280cddae6e7a5d1f8bd36bcd521770"),
("data/missing_testcase.csv", "d49ba446aae9ef334350b64c876b4de652f28595fdecf78bea4e16af4033f7c6"),
("data/point.geojson", "8744593479054a67c784322e0c198bfa880c9388b39a2ddd4c56726944711bd9"),
("data/utmsmall.tif", "f40dae6e8b5e18f3648e9f095e22a0d7027014bb463418d32f732c3756d8c54f"),
("gdalworkshop/world.tif", "b376dc8af62f9894b5050a6a9273ac0763ae2990b556910d35d4a8f4753278bb"),