Collection-level Assets #800

Merged
merged 4 commits into from
May 27, 2020
Changes from 3 commits
1 change: 1 addition & 0 deletions CHANGELOG.md
@@ -18,6 +18,7 @@ and this project adheres to [Semantic Versioning](http://semver.org/spec/v2.0.0.
- Several new sections to 'best practices' document.
- Added the ability to define Item properties under Assets (item-spec/item-spec.md)
- Add `proj:shape` and `proj:transform` to the projections extension.
- Collection-level assets extension
- Instructions on how to run check-markdown locally

### Changed
1 change: 1 addition & 0 deletions collection-spec/json-schema/collection.json
@@ -36,6 +36,7 @@
"title": "Reference to a core extension",
"type": "string",
"enum": [
"collection-assets",
"commons",
"checksum",
"datacube",
35 changes: 18 additions & 17 deletions extensions/README.md
@@ -44,23 +44,24 @@ stable for over a year and are used in twenty or more implementations.

An extension can add new fields to STAC entities (content extension), or can add new endpoints or behavior to the API (API extension). Below is a list of content extensions, while API extensions are published in the [STAC API repository](https://github.com/radiantearth/stac-api-spec/tree/master/extensions/).

| Extension Title | Identifier | Field Name Prefix | Scope | Maturity | Description |
| ---------------------------------------------- | ---------------- | ------------------- | ------------------------- | ---------- | ---------------------------------- |
| [Checksum](checksum/README.md) | checksum | checksum | Item, Catalog, Collection | *Proposal* | Provides a way to specify file checksums for assets and links in Items, Catalogs and Collections. |
| [Commons](commons/README.md) | commons | - | Item, Collection | *Proposal* | Provides a way to specify data fields in a collection that are common across the STAC Items in that collection, so that each does not need to repeat all the same information. |
| [Data Cube](datacube/README.md) | datacube | cube | Item, Collection | *Proposal* | Data Cube related metadata, especially to describe their dimensions. |
| [Electro-Optical](eo/README.md) | eo | eo | Item | *Pilot* | Covers electro-optical data that represents a snapshot of the earth for a single date and time. It could consist of multiple spectral bands, for example visible bands, infrared bands, red edge bands and panchromatic bands. The extension provides common fields like bands, cloud cover, gsd and more. |
| [Item Asset Definition](item-assets/README.md) | item-assets | - | Collection | *Proposal* | Provides a way to specify details about what assets may be found in Items belonging to a collection. |
| [Label](label/README.md) | label | label | Item | *Proposal* | Items that relate labeled AOIs with source imagery |
| [Point Cloud](pointcloud/README.md) | pointcloud | pc | Item | *Proposal* | Provides a way to describe point cloud datasets. The point clouds can come from either active or passive sensors, and data is frequently acquired using tools such as LiDAR or coincidence-matched imagery. |
| [Projection](projection/README.md) | projection | proj | Item | *Proposal* | Provides a way to describe items whose assets are in a geospatial projection. |
| [SAR](sar/README.md) | sar | sar | Item | *Proposal* | Covers synthetic-aperture radar data that represents a snapshot of the earth for a single date and time. |
| [Satellite](sat/README.md) | sat | sat | Item | *Proposal* | Satellite related metadata for data collected from satellites. |
| [Scientific](scientific/README.md) | scientific | sci | Item, Collection | *Proposal* | Scientific metadata is considered to be data that indicate from which publication data originates and how the data itself should be cited or referenced. |
| [Single File STAC](single-file-stac/README.md) | single-file-stac | - | ItemCollection | *Proposal* | An extension to provide a set of Collections and Items as a single file catalog. |
| [Tiled Assets](tiled-assets/README.md) | tiled-assets | tiles | Item, Catalog, Collection | *Proposal* | Allows to specify numerous assets using asset templates via tile matrices and dimensions. |
| [Versioning Indicators](version/README.md) | version | - | Item, Collection | *Proposal* | Provides fields and link relation types to provide a version and indicate deprecation. |
| [View Geometry](view/README.md) | view | view | Item | *Proposal* | View Geometry adds metadata related to angles of sensors and other radiance angles that affect the view of resulting data |
| Extension Title | Identifier | Field Name Prefix | Scope | Maturity | Description |
| ------------------------------------------------ | ----------------- | ------------------- | ------------------------- | ---------- | ----------- |
| [Checksum](checksum/README.md) | checksum | checksum | Item, Catalog, Collection | *Proposal* | Provides a way to specify file checksums for assets and links in Items, Catalogs and Collections. |
| [Collection Assets](collection-assets/README.md) | collection-assets | - | Collection | *Proposal* | Provides a way to specify assets available on the collection-level. |
| [Commons](commons/README.md) | commons | - | Item, Collection | *Proposal* | Provides a way to specify data fields in a collection that are common across the STAC Items in that collection, so that each does not need to repeat all the same information. |
| [Data Cube](datacube/README.md) | datacube | cube | Item, Collection | *Proposal* | Data Cube related metadata, especially to describe their dimensions. |
| [Electro-Optical](eo/README.md) | eo | eo | Item | *Pilot* | Covers electro-optical data that represents a snapshot of the earth for a single date and time. It could consist of multiple spectral bands, for example visible bands, infrared bands, red edge bands and panchromatic bands. The extension provides common fields like bands, cloud cover, gsd and more. |
| [Item Asset Definition](item-assets/README.md) | item-assets | - | Collection | *Proposal* | Provides a way to specify details about what assets may be found in Items belonging to a collection. |
| [Label](label/README.md) | label | label | Item | *Proposal* | Items that relate labeled AOIs with source imagery |
| [Point Cloud](pointcloud/README.md) | pointcloud | pc | Item | *Proposal* | Provides a way to describe point cloud datasets. The point clouds can come from either active or passive sensors, and data is frequently acquired using tools such as LiDAR or coincidence-matched imagery. |
| [Projection](projection/README.md) | projection | proj | Item | *Proposal* | Provides a way to describe items whose assets are in a geospatial projection. |
| [SAR](sar/README.md) | sar | sar | Item | *Proposal* | Covers synthetic-aperture radar data that represents a snapshot of the earth for a single date and time. |
| [Satellite](sat/README.md) | sat | sat | Item | *Proposal* | Satellite related metadata for data collected from satellites. |
| [Scientific](scientific/README.md) | scientific | sci | Item, Collection | *Proposal* | Scientific metadata is considered to be data that indicate from which publication data originates and how the data itself should be cited or referenced. |
| [Single File STAC](single-file-stac/README.md) | single-file-stac | - | ItemCollection | *Proposal* | An extension to provide a set of Collections and Items as a single file catalog. |
| [Tiled Assets](tiled-assets/README.md) | tiled-assets | tiles | Item, Catalog, Collection | *Proposal* | Allows to specify numerous assets using asset templates via tile matrices and dimensions. |
| [Versioning Indicators](version/README.md) | version | - | Item, Collection | *Proposal* | Provides fields and link relation types to provide a version and indicate deprecation. |
| [View Geometry](view/README.md) | view | view | Item | *Proposal* | View Geometry adds metadata related to angles of sensors and other radiance angles that affect the view of resulting data |

## Third-party / vendor extensions

38 changes: 38 additions & 0 deletions extensions/collection-assets/README.md
@@ -0,0 +1,38 @@
# Collection Assets Extension Specification

- **Title: Collection Assets**
- **Identifier: collection-assets**
- **Field Name Prefix: -**
- **Scope: Collection**
- **Extension [Maturity Classification](../README.md#extension-maturity): Proposal**

A Collection extension that provides a way to specify assets available at the collection level.

- [Example](examples/example-esm.json)
- [JSON Schema](json-schema/schema.json)

This extension introduces a single new field, `assets`, at the top level of a collection.
An Asset Object defined at the Collection level is the same as the [Asset Object in Items](../../item-spec/item-spec.md#asset-object).

Collection-level assets MUST NOT list any files also available in items.
Where possible, item-level assets are the preferred way to expose assets.
To list which assets are available in items, see the [Item Assets Definition Extension](../item-assets/README.md).
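
The rule that collection-level assets must not repeat files already exposed in items can be checked mechanically. The following is a minimal Python sketch, not part of the extension: the function name `overlapping_asset_hrefs` and the sample hrefs are hypothetical, and it compares `href` strings literally, so relative hrefs would first need to be resolved against their base URLs.

```python
from typing import Any, Dict, Iterable, Set


def overlapping_asset_hrefs(
    collection: Dict[str, Any], items: Iterable[Dict[str, Any]]
) -> Set[str]:
    """Return hrefs listed both at the collection level and in any item."""
    collection_hrefs = {
        asset["href"] for asset in collection.get("assets", {}).values()
    }
    item_hrefs = {
        asset["href"]
        for item in items
        for asset in item.get("assets", {}).values()
    }
    # Any href in both sets breaks the MUST NOT rule stated above.
    return collection_hrefs & item_hrefs


# Hypothetical sample data, trimmed to the fields the check needs.
collection = {"assets": {"thumbnail": {"href": "logo.png"}}}
items = [
    {"assets": {"data": {"href": "scene-1.tif"}}},
    {"assets": {"preview": {"href": "logo.png"}}},  # duplicates a collection asset
]
violations = overlapping_asset_hrefs(collection, items)
```

An empty result means no file is exposed at both levels.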

Collection-level assets can be useful in some scenarios, for example:
1. Exposing additional data that applies collection-wide and that you don't want to repeat in each Item. This can be collection-level metadata or a thumbnail for visualization purposes.
2. Exposing data structures for which individual items can't properly be distinguished, e.g. [Zarr](https://zarr.readthedocs.io/), which is not contained in single files.
3. Exposing assets for "[Standalone Collections](https://github.com/radiantearth/stac-spec/blob/master/collection-spec/collection-spec.md#standalone-collections)".

## Collection fields

| Field Name | Type | Description |
| ---------- | ---------------------------------------------------------------------- | ----------- |
| assets | Map<string, [Asset Object](../../item-spec/item-spec.md#asset-object)> | **REQUIRED.** Dictionary of asset objects that can be downloaded, each with a unique key. |

**assets**: In general, the keys don't have any meaning and are considered to be non-descriptive unique identifiers.
Providers may assign any meaning to the keys for their respective use cases, but must not expect that clients understand them.
To better communicate the purpose of an asset, use the `roles` field in the [Asset Object](../../item-spec/item-spec.md#asset-object).

## Implementations

- The [ESM collection spec](https://github.com/NCAR/esm-collection-spec) uses this extension to expose Zarr archives.
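
Because asset keys carry no meaning for clients, a client would typically select assets by `roles` rather than by key. A minimal Python sketch of that lookup follows; the helper name `assets_with_role` is hypothetical, and the sample collection is a shortened stand-in for the ESM example, keeping only `href` and `roles`.

```python
from typing import Any, Dict


def assets_with_role(
    collection: Dict[str, Any], role: str
) -> Dict[str, Dict[str, Any]]:
    """Select collection-level assets whose `roles` list contains `role`."""
    return {
        key: asset
        for key, asset in collection.get("assets", {}).items()
        if role in asset.get("roles", [])
    }


# Shortened stand-in for a collection that uses this extension.
collection = {
    "assets": {
        "thumbnail": {"href": "logo.png", "roles": ["thumbnail"]},
        "activity_id": {"href": "CMIP6_activity_id.json", "roles": ["esm-vocabulary"]},
        "source_id": {"href": "CMIP6_source_id.json", "roles": ["esm-vocabulary"]},
    }
}
vocabularies = assets_with_role(collection, "esm-vocabulary")
```

Here `vocabularies` holds the two vocabulary assets regardless of what their keys happen to be called.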
97 changes: 97 additions & 0 deletions extensions/collection-assets/examples/example-esm.json
@@ -0,0 +1,97 @@
{
"stac_version": "0.9.0",
"stac_extensions": [
"collection-assets",
"https://github.com/NCAR/esm-collection-spec/tree/v0.2.0/schema.json"
],
"id": "pangeo-cmip6",
"title": "Google CMIP6",
"description": "This is an ESM collection for CMIP6 Zarr data residing in Pangeo's Google Storage.",
"extent": {
"spatial": {
"bbox": [[-180, -90, 180, 90]]
},
"temporal": {
"interval": [["1850-01-15T12:00:00Z", "2014-12-15T12:00:00Z"]]
}
},
"providers": [
{
"name": "World Climate Research Programme",
"roles": ["producer","licensor"],
"url": "https://www.wcrp-climate.org/wgcm-cmip/wgcm-cmip6"
},
{
"name": "The Pangeo Project",
"roles": ["processor"],
"url": "https://console.cloud.google.com/pangeo.io"
},
{
"name": "Google",
"roles": ["host"],
"url": "https://console.cloud.google.com/marketplace/details/noaa-public/cmip6"
}
],
"license": "proprietary",
"links": [
{
"href": "https://pcmdi.llnl.gov/CMIP6/TermsOfUse/TermsOfUse6-1.html",
"type": "text/html",
"rel": "license",
"title": "CMIP6: Terms of Use"
}
],
"assets": {
"thumbnail": {
"href": "logo.png",
"title": "A preview image for visualization.",
"type": "image/png",
"roles": ["thumbnail"]
},
"catalog": {
"href": "sample-pangeo-cmip6-zarr-stores.csv",
"title": "Catalog",
"description": "Path to the CSV file with the catalog contents.",
"type": "text/csv",
"roles": ["esm-catalog"],
"esm:column_name": "path"
},
"activity_id": {
"href": "https://raw.githubusercontent.com/WCRP-CMIP/CMIP6_CVs/master/CMIP6_activity_id.json",
"type": "application/json",
"roles": ["esm-vocabulary"],
"esm:column_name": "activity_id"
},
"source_id": {
"href": "https://raw.githubusercontent.com/WCRP-CMIP/CMIP6_CVs/master/CMIP6_source_id.json",
"type": "application/json",
"roles": ["esm-vocabulary"],
"esm:column_name": "source_id"
},
"institution_id": {
"href": "https://raw.githubusercontent.com/WCRP-CMIP/CMIP6_CVs/master/CMIP6_institution_id.json",
"type": "application/json",
"roles": ["esm-vocabulary"],
"esm:column_name": "institution_id"
},
"experiment_id": {
"href": "https://raw.githubusercontent.com/WCRP-CMIP/CMIP6_CVs/master/CMIP6_experiment_id.json",
"type": "application/json",
"roles": ["esm-vocabulary"],
"esm:column_name": "experiment_id"
},
"table_id": {
"href": "https://raw.githubusercontent.com/WCRP-CMIP/CMIP6_CVs/master/CMIP6_table_id.json",
"type": "application/json",
"roles": ["esm-vocabulary"],
"esm:column_name": "table_id"
},
"grid_label": {
"href": "https://raw.githubusercontent.com/WCRP-CMIP/CMIP6_CVs/master/CMIP6_grid_label.json",
"type": "application/json",
"roles": ["esm-vocabulary"],
"esm:column_name": "grid_label"
}
},
"esm:attributes": ["activity_id", "source_id", "institution_id", "experiment_id", "member_id", "table_id", "variable_id", "grid_label"]
}
22 changes: 22 additions & 0 deletions extensions/collection-assets/json-schema/schema.json
@@ -0,0 +1,22 @@
{
"$schema": "http://json-schema.org/draft-07/schema#",
"$id": "schema.json#",
"title": "Collection Assets Extension Specification",
"description": "STAC Collection-level assets Extension to a STAC Collection",
"allOf": [
{
"$ref": "../../../collection-spec/json-schema/collection.json"
},
{
"type": "object",
"required": [
"assets"
],
"properties": {
"assets": {
"$ref": "../../../item-spec/json-schema/item.json#/definitions/assets"
}
}
}
]
}
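
What the schema adds on top of a plain Collection is small: `assets` must be present and must be an object of Asset Objects. The Python sketch below mirrors only that part and is not a replacement for a real JSON Schema validator. The function name `collection_assets_problems` is hypothetical, and the `href` check is an assumption taken from the Asset Object in the STAC 0.9.0 Item spec, since the `required` list of the `asset` definition is not visible in this diff.

```python
from typing import Any, Dict, List


def collection_assets_problems(collection: Dict[str, Any]) -> List[str]:
    """List structural problems with the collection-level `assets` field."""
    if "assets" not in collection:
        return ["missing required field 'assets'"]
    assets = collection["assets"]
    if not isinstance(assets, dict):
        return ["'assets' must be an object"]
    problems = []
    for key, asset in assets.items():
        if not isinstance(asset, dict):
            problems.append(f"asset '{key}' must be an object")
        elif "href" not in asset:
            # Assumption: `href` is the required Asset Object field.
            problems.append(f"asset '{key}' is missing 'href'")
    return problems


ok = collection_assets_problems({"assets": {"thumbnail": {"href": "logo.png"}}})
missing = collection_assets_problems({"id": "pangeo-cmip6"})
```

An empty list means the collection satisfies the checks sketched here; everything else in the Collection schema still has to be validated separately.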
15 changes: 9 additions & 6 deletions item-spec/json-schema/item.json
@@ -108,12 +108,7 @@
}
},
"assets": {
- "title": "Asset links",
- "description": "Links to assets",
- "type": "object",
- "additionalProperties": {
- "$ref": "#/definitions/asset"
- }
+ "$ref": "#/definitions/assets"
},
"properties": {
"$ref": "#/definitions/common_metadata"
@@ -152,6 +147,14 @@
}
}
},
+ "assets": {
+ "title": "Asset links",
+ "description": "Links to assets",
+ "type": "object",
+ "additionalProperties": {
+ "$ref": "#/definitions/asset"
+ }
+ },
"asset": {
"type": "object",
"required": [
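
Moving the `assets` object into `definitions` gives it a stable address, `#/definitions/assets`, which is what the collection-assets schema points at. A rough Python sketch of how such a same-document reference resolves is given below. The helper `resolve_local_ref` is hypothetical, handles only object keys (no array indices or remote files), and the `item_schema` dictionary is a trimmed stand-in for `item.json`, not its real content.

```python
from typing import Any, Dict


def resolve_local_ref(schema: Dict[str, Any], ref: str) -> Any:
    """Resolve a same-document reference such as '#/definitions/assets'."""
    if not ref.startswith("#/"):
        raise ValueError(f"only local '#/...' references are handled: {ref}")
    node: Any = schema
    for token in ref[2:].split("/"):
        # Undo JSON Pointer escaping (RFC 6901): '~1' -> '/', then '~0' -> '~'.
        token = token.replace("~1", "/").replace("~0", "~")
        node = node[token]
    return node


# Trimmed stand-in for item.json after this change.
item_schema = {
    "definitions": {
        "assets": {
            "type": "object",
            "additionalProperties": {"$ref": "#/definitions/asset"},
        },
        "asset": {"type": "object"},
    }
}
assets_def = resolve_local_ref(item_schema, "#/definitions/assets")
asset_def = resolve_local_ref(item_schema, assets_def["additionalProperties"]["$ref"])
```

Before the change, the same object lived inline under the Item's `assets` property, so other schemas had no definition-level pointer to reuse.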