feat: add geometry editor functions #554

Merged (5 commits) on Sep 29, 2023
93 changes: 81 additions & 12 deletions extensions/functions_geometry.yaml
@@ -19,18 +19,18 @@ scalar_functions:
value: fp64
return: u!geometry
-
-name: "makeline"
+name: "make_line"
description: >
-Returns a linestring connecting the endpoint of geometry `x` to the begin point of
-geometry `y`. Repeated points at the beginning of input geometries are collapsed to a single point.
+Returns a linestring connecting the endpoint of geometry `geom1` to the begin point of

Contributor:

Thanks!

+geometry `geom2`. Repeated points at the beginning of input geometries are collapsed to a single point.

A linestring can be closed or simple. A closed linestring starts and ends on the same
point. A simple linestring does not cross or touch itself.
impls:
- args:
-- name: x
+- name: geom1
value: u!geometry
-- name: y
+- name: geom2
value: u!geometry
return: u!geometry
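
One possible reading of the `make_line` description, sketched in Python. Everything here is an assumption made for illustration: geometries are modeled as plain lists of `(x, y)` tuples, and "collapsed to a single point" is taken to mean that the first point of `geom2` is dropped when it repeats the last point of `geom1`. The extension itself only declares the signature; engines may behave differently.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def make_line(geom1: List[Point], geom2: List[Point]) -> List[Point]:
    """Join two point sequences into one linestring (illustrative only)."""
    line = list(geom1)
    for i, pt in enumerate(geom2):
        # Assumption: a first point of geom2 that repeats the current
        # endpoint is collapsed instead of being emitted twice.
        if i == 0 and line and line[-1] == pt:
            continue
        line.append(pt)
    return line


joined = make_line([(0.0, 0.0), (1.0, 1.0)], [(1.0, 1.0), (2.0, 2.0)])
print(joined)
```

With these inputs the shared point `(1.0, 1.0)` appears once in the result.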
-
@@ -43,7 +43,7 @@ scalar_functions:
value: u!geometry
return: fp64
-
-name: "y_coodinate"
+name: "y_coordinate"
description: >
Return the y coordinate of the point. Return null if not available.
impls:
@@ -52,7 +52,7 @@ scalar_functions:
value: u!geometry
return: fp64
-
-name: "numpoints"
+name: "num_points"
description: >
Return the number of points in the geometry. The geometry should be a linestring
or circularstring.
@@ -62,7 +62,7 @@ scalar_functions:
value: u!geometry
return: i64
-
-name: "isempty"
+name: "is_empty"
description: >
Return true if the geometry is an empty geometry.
impls:
@@ -71,7 +71,7 @@ scalar_functions:
value: u!geometry
return: boolean
-
-name: "isclosed"
+name: "is_closed"
description: >
Return true if the geometry's start and end points are the same.
impls:
@@ -80,7 +80,7 @@ scalar_functions:
value: geometry
return: boolean
-
-name: "issimple"
+name: "is_simple"
description: >
Return true if the geometry does not self intersect.
impls:
@@ -89,7 +89,7 @@ scalar_functions:
value: u!geometry
return: boolean
-
-name: "isring"
+name: "is_ring"
description: >
Return true if the geometry's start and end points are the same and it does not self
intersect.
@@ -99,7 +99,7 @@ scalar_functions:
value: u!geometry
return: boolean
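
The relationship between `is_closed`, `is_simple`, and `is_ring` described above can be sketched in Python. This is a brute-force illustration under stated assumptions, not how any engine implements these checks: a linestring is a list of `(x, y)` tuples, the helper names are hypothetical, and the simplicity test only compares non-adjacent segments (it ignores overlaps between neighbouring segments, and in a closed line it skips the first/last pair, which legitimately share the start point).

```python
from typing import List, Tuple

Point = Tuple[float, float]


def _orient(a: Point, b: Point, c: Point) -> int:
    """Sign of the cross product (b - a) x (c - a): 1, -1, or 0 if collinear."""
    cross = (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])
    return (cross > 0) - (cross < 0)


def _within_box(a: Point, b: Point, c: Point) -> bool:
    """True if c lies inside the bounding box of segment a-b."""
    return (min(a[0], b[0]) <= c[0] <= max(a[0], b[0])
            and min(a[1], b[1]) <= c[1] <= max(a[1], b[1]))


def _segments_touch(p1: Point, p2: Point, p3: Point, p4: Point) -> bool:
    """True if segment p1-p2 crosses or touches segment p3-p4."""
    o1, o2 = _orient(p1, p2, p3), _orient(p1, p2, p4)
    o3, o4 = _orient(p3, p4, p1), _orient(p3, p4, p2)
    if o1 != o2 and o3 != o4:
        return True
    # Collinear cases: an endpoint of one segment lies on the other.
    return ((o1 == 0 and _within_box(p1, p2, p3))
            or (o2 == 0 and _within_box(p1, p2, p4))
            or (o3 == 0 and _within_box(p3, p4, p1))
            or (o4 == 0 and _within_box(p3, p4, p2)))


def is_closed(line: List[Point]) -> bool:
    """Start and end points are the same."""
    return len(line) >= 2 and line[0] == line[-1]


def is_simple(line: List[Point]) -> bool:
    """No two non-adjacent segments cross or touch (simplified check)."""
    n = len(line) - 1  # number of segments
    closed = is_closed(line)
    for i in range(n):
        for j in range(i + 2, n):
            if closed and i == 0 and j == n - 1:
                continue  # first and last segment share the closing point
            if _segments_touch(line[i], line[i + 1], line[j], line[j + 1]):
                return False
    return True


def is_ring(line: List[Point]) -> bool:
    """Closed and simple at the same time."""
    return is_closed(line) and is_simple(line)
```

For example, a unit square traced back to its start is a ring, while a "bowtie" that visits the same four corners in crossing order is closed but not simple, so it is not a ring.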
-
-name: "geometrytype"
+name: "geometry_type"
description: >
Return the type of geometry as a string.
impls:
@@ -126,6 +126,7 @@ scalar_functions:
return the largest dimension from the collection. Dimensionality is determined by
the complexity of the input and not the coordinate system being used.

Type dimensions:
POINT - 0
LINE - 1
POLYGON - 2
@@ -134,3 +135,71 @@ scalar_functions:
- name: geom
value: u!geometry
return: i8
-
name: "is_valid"
description: >
Return true if the input geometry is a valid 2D geometry.

For 3 dimensional and 4 dimensional geometries, the validity is still only tested

Member:

If we're going to have this restriction it'd be nice to either have 2D in the name or an option that specifies the number of dimensions to check against.

Member Author:

That's a good point. I also noticed that even though PostGIS sets this restriction (https://postgis.net/docs/ST_IsValid.html), some other backends don't specify the dimensionality. For example: https://docs.snowflake.com/en/sql-reference/functions/st_isvalid

@paleolimbot, do you have any thoughts on this?

Reply:

Snowflake only supports 2D, so they probably don't need to specify that.

As with all of these PRs, I would urge that Substrait copies PostGIS' (or simple features access) naming + functionality identically. Many of those decisions are a result of years of negotiation and there is a long history of database engineers coming in with a new hot take on how it should be done (e.g., the built-in Postgres geometry types that aren't part of PostGIS and SQLite's geo module support, neither of which are in common use).

That's a long way of saying I agree with what you have in this PR 🙂

in 2 dimensions.
impls:
- args:
- name: geom
value: u!geometry
return: boolean
-
name: "collection_extract"
description: >
Given the input geometry collection, return a homogeneous multi-geometry. All geometries
in the multi-geometry will have the same dimension.

If type is not specified, the multi-geometry will only contain geometries of the highest
dimension. If type is specified, the multi-geometry will only contain geometries
of that type. If there are no geometries of the specified type, an empty geometry
is returned. Only points, linestrings, and polygons are supported.

Type numbers:
POINT - 0
LINE - 1
POLYGON - 2
impls:
- args:
- name: geom_collection
value: u!geometry
return: u!geometry
- args:
- name: geom_collection
value: u!geometry
- name: type
value: i8
return: u!geometry
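
The two `collection_extract` overloads can be illustrated with a small Python sketch. The representation is an assumption chosen for brevity: each geometry is a `(type_name, coordinates)` tuple, the collection is a list of such tuples, and an empty list stands in for the "empty geometry" result. Only the type numbers come from the description above.

```python
from typing import List, Optional, Tuple

# Type numbers as listed in the function description.
TYPE_NUMBERS = {"POINT": 0, "LINE": 1, "POLYGON": 2}

# Hypothetical stand-in for a geometry: (type name, coordinates).
Geometry = Tuple[str, object]


def collection_extract(geom_collection: List[Geometry],
                       type_number: Optional[int] = None) -> List[Geometry]:
    """Keep only the members of one type (illustrative only)."""
    if type_number is None:
        if not geom_collection:
            return []
        # No type given: fall back to the highest dimension present.
        type_number = max(TYPE_NUMBERS[name] for name, _ in geom_collection)
    return [g for g in geom_collection if TYPE_NUMBERS[g[0]] == type_number]


collection = [
    ("POINT", (0.0, 0.0)),
    ("LINE", [(0.0, 0.0), (1.0, 1.0)]),
    ("POINT", (2.0, 2.0)),
]
print(collection_extract(collection))      # only the LINE member
print(collection_extract(collection, 0))   # both POINT members
print(collection_extract(collection, 2))   # no polygons, so empty
```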
-
name: "flip_coordinates"
description: >
Return a version of the input geometry with the X and Y axes flipped.

Member:

This appears to be another 2D operation. We should probably state what happens to the other dimensions and/or rename the function to clarify its 2D nature.

This operation can be performed on geometries with more than 2 dimensions. However,
only the X and Y axes will be flipped.
impls:
- args:
- name: geom_collection
value: u!geometry
return: u!geometry
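
A minimal Python sketch of the `flip_coordinates` behaviour described above, assuming a geometry is just a list of coordinate tuples of length 2 or more. Only the first two ordinates are swapped; any further ordinates are passed through unchanged. The tuple representation is an assumption for illustration.

```python
from typing import List, Tuple


def flip_coordinates(points: List[Tuple[float, ...]]) -> List[Tuple[float, ...]]:
    """Swap X and Y in every point, leaving extra ordinates untouched."""
    return [(p[1], p[0]) + tuple(p[2:]) for p in points]


# A 2D point and a 3D point: the third ordinate of the latter is preserved.
print(flip_coordinates([(1.0, 2.0), (3.0, 4.0, 5.0)]))
```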
-
name: "remove_repeated_points"
description: >
Return a version of the input geometry with duplicate consecutive points removed.

If the `tolerance` argument is provided, consecutive points within the tolerance

Member:

What happens if a 2D point and a 3D point near the zero plane are nearby? Are those eligible to be considered duplicates or does the fact that they have different dimensionalities exclude them from being considered?

Reply:

This seems undocumented pretty much everywhere but it seems to be just 2D in the PostGIS implementation (https://github.com/libgeos/geos/blob/main/src/operation/valid/RepeatedPointRemover.cpp#L50-L56). This function isn't in simple features access and was only added to a relatively new version of GEOS (so pretty low priority if it needs to be dropped).

distance of one another are considered to be duplicates.
impls:
- args:
- name: geom
value: u!geometry
return: u!geometry
- args:
- name: geom
value: u!geometry
- name: tolerance
value: fp64
return: u!geometry
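
A hedged Python sketch of `remove_repeated_points` with the optional `tolerance`. Several choices here are assumptions rather than specified behaviour: points are tuples, distance is measured in X and Y only (in line with the reply in the review thread, which suggests the existing implementation is 2D), each point is compared against the last point that was kept, and "within" is read as less than or equal to the tolerance.

```python
import math
from typing import List, Tuple


def remove_repeated_points(points: List[Tuple[float, ...]],
                           tolerance: float = 0.0) -> List[Tuple[float, ...]]:
    """Drop consecutive points that lie within `tolerance` of the last kept point."""
    kept: List[Tuple[float, ...]] = []
    for p in points:
        if kept:
            last = kept[-1]
            # Assumption: only X and Y take part in the distance.
            if math.hypot(p[0] - last[0], p[1] - last[1]) <= tolerance:
                continue
        kept.append(p)
    return kept


line = [(0.0, 0.0), (0.0, 0.0), (1.0, 0.0), (1.0, 0.05), (2.0, 0.0)]
print(remove_repeated_points(line))        # exact duplicate removed
print(remove_repeated_points(line, 0.1))   # near-duplicate removed as well
```

Under the 2D-distance assumption, a 3D point directly above the previous point would also count as a duplicate, which is exactly the case the review comment asks about.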