Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ASTER, Landsat 8 C2 L2, Sentinel 2 L2A #83

Merged
merged 37 commits into from
Apr 2, 2021
Merged

Conversation

lossyrob
Copy link
Member

This PR is a set of changes across these three datasets. The changes included a lot of cross-project refactoring, and with more time I would have liked to separate out the changes into their own pull requests.

Changes for each dataset:

ASTER

The processing workflow for aster changed from creating as STAC item from HDF and then COGs from the item, to creating COGs from the HDF and then a STAC Item from the HDF XML and COGs.

There was a lot of noisy data being pulled from the HDF metadata, and that is removed. An attempt is made only to include the relevant information into the STAC Items. If additional metadata is needed, the original metadata XML file is an asset and can be further processed.

Reading from the XML data avoid opening and raster headers, which is a preferred approach - if you don't have to open the data, better to skip it and avoid the performance hit!

Sentinel 2

The command for creating Sentinel 2 STAC items included cogification logic that was not fully tested, and is best to later refactor into a workflow where COGification is a pre-cursor step to STAC-ification.

I decided to pull out the "extended item" production from Sentinel 2 STAC creation - there aren't that many s2 properties, and it didn't seem worth the downstream complexity of managing two items. In the future we might even want to pull out some of the s2 added properties if they are encapsulated in metadata aren't super valuable to hold onto in the Item.

Landsat 8 Collection 2 Level 2

Added a "create_item" command to Landsat 8 subpackage. This creates a STAC Item directly from product data, as opposed to converting the 0.7 STAC Items. This includes adding the projection information required by stac-vrt.

Some choices by the USGS STAC that I didn't follow:

  • I didn't separate out the SR and SP as two separate STAC Items. L2SP Scenes that have thermal bands have them represented in the Item, and the SR-only products don't have those assets or bands represented. I added a landsat:processing_level property that matches the MTL metadata with values of L2SP or L2SR that will determine whether or not the thermal assets are available.
  • landsat:wrs_path and landsat:wrs_row use strings like USGS's Item, but left pad with zeros to 3 digits (which matches it's usage in paths)

Some remaining issues:

  • As mentioned in the discussion in Implementation of Landsat STAC handling #23, getting the exact scene footprint is proving challenging. I took @matthewhanson 's existing implementation for parsing the geom out of ANG files. The footprint you get is a lot better than ones based on the MTL coordinates; however, it's still a bit bigger than the footprint in the USGS STAC and what the image shows when I pull it up in QGIS. If anyone knows how to get the footprints in a bit tighter, save vectorizing the footprint from a nodata mask (which would add more processing time I'd love to avoid), please speak up! For now, the ANG is pretty close to the real footprint, and I think serves it's purpose as the Item geometry for most use cases.

@matthewhanson @kylebarron @alexgleith if you'd like to take a look and see what you think, that'd be appreciated. This was a lot of code done quickly, but I tried to be pretty careful and test along the way, so I'm reasonably confident in merging this as an improvement to the existing codebase. I'd like to merge this in in the next week - I'll be using this branch for some ETL jobs in the near term and will update with anything I find.

lossyrob added 16 commits March 25, 2021 23:11
map_opts is a Monad map. Testing generics with python. VSCode
understands the genericized output but does not show an error when the
T in the v parameter does not match the T in the fn parameter.
This commit modifies the aster subpackage to break apart the creation
of the cogs from the creation of the stac items. It also bases the
stac item creation mostly on the XML metadata. It drops a lot of
noisey metadata an generates more pertinent STAC metadata, like
additional assets and proj information.
The extended item was created to avoid too much noisey data in
Sentinel 2 Items; however there's not that much metadata and it
complicates things to have multiple items, so reverting to a single
item for now.
Currently the cogification would attempt to refactor all assets, even
metadata assets. As this functionality needs some work and testing,
remove the option from the command for now.
Set through the "descriptions" property of rasterio
Refactored to core from sentinel2 logic
@lossyrob lossyrob force-pushed the feature/landsat-aster branch from 27990c7 to 95396f1 Compare March 28, 2021 01:24
@codecov-io
Copy link

codecov-io commented Mar 28, 2021

Codecov Report

Merging #83 (94820c8) into master (26e0200) will increase coverage by 1.60%.
The diff coverage is 93.19%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master      #83      +/-   ##
==========================================
+ Coverage   83.81%   85.41%   +1.60%     
==========================================
  Files          65       74       +9     
  Lines        1643     2139     +496     
==========================================
+ Hits         1377     1827     +450     
- Misses        266      312      +46     
Impacted Files Coverage Δ
stactools_aster/stactools/aster/__init__.py 100.00% <ø> (ø)
...actools_cgls_lc100/stactools/cgls_lc100/version.py 0.00% <0.00%> (ø)
stactools_sentinel2/stactools/sentinel2/cog.py 0.00% <ø> (-34.15%) ⬇️
stactools_sentinel2/stactools/sentinel2/utils.py 100.00% <ø> (+5.40%) ⬆️
stactools_core/stactools/core/copy.py 78.20% <75.00%> (-0.75%) ⬇️
..._sentinel2/stactools/sentinel2/granule_metadata.py 87.14% <77.77%> (-3.18%) ⬇️
..._sentinel2/stactools/sentinel2/product_metadata.py 91.01% <85.00%> (-4.99%) ⬇️
...tactools_landsat/stactools/landsat/ang_metadata.py 86.84% <86.84%> (ø)
stactools_core/stactools/core/io/xml.py 88.37% <88.37%> (ø)
stactools_aster/stactools/aster/xml_metadata.py 90.00% <90.00%> (ø)
... and 31 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 26e0200...94820c8. Read the comment docs.

@lossyrob lossyrob force-pushed the feature/landsat-aster branch from 5df1c1e to f96ada4 Compare March 28, 2021 04:11
@lossyrob lossyrob force-pushed the feature/landsat-aster branch from f96ada4 to f8533f9 Compare March 28, 2021 04:17
This bug caused SafeManifest to error when working with remote hrefs.
@lossyrob lossyrob force-pushed the feature/landsat-aster branch from b078b4e to eb4cd5a Compare March 29, 2021 09:33
These can be NaN, and can be parsed out of linked metadata if necessary.
Also, ensure no None values are returned.
@lossyrob lossyrob force-pushed the feature/landsat-aster branch from eb4cd5a to 95cb373 Compare March 29, 2021 18:20
- Remove LinkType reference
- move_assets now explicilty sets relative or absolute
- HREF (defaulting to relative, as copying in assets is useful for self contained catalogs)
- Fix landsat convert, which needed to clear hierarchical links in the cloned item
Copy link
Member

@gadomski gadomski left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A couple of documentation lints

stactools_aster/stactools/aster/cog.py Outdated Show resolved Hide resolved
stactools_aster/stactools/aster/cog.py Show resolved Hide resolved
lossyrob and others added 4 commits March 29, 2021 17:19
This symlink goes outside of the repo, so I'm guessing it was an
inadvertant add.
The 0.1.1 version tag was causing non-failing error text during cibuild.
@lossyrob
Copy link
Member Author

@gadomski thanks for taking a look - I made the docstring changes you suggested. I also cherry-picked a couple commits from your threedep branch to get tests passing.

@kylebarron
Copy link
Contributor

  • getting the exact scene footprint is proving challenging

I agree this is a hard problem. In the past I've used WRS-2 geometries because despite having I think larger errors than the ANG file, iirc they're the only method that underestimates the valid geometry? And for mosaicking it can be nice to not overestimate geometries, which could produce missing data seams in the final output. But I don't have a strong opinion on what you should use here.

@matthewhanson
Copy link
Member

See my post in Sentinel-1 for a possible approach to generate the footprints:
#84 (comment)

This would at least be a good option to have in stactools for any data that might not even have a footprint to begin with.

@lossyrob
Copy link
Member Author

lossyrob commented Apr 2, 2021

@kylebarron from what I could tell from putting footprints into qgis, the ANG was an over-representation, i.e. the ANG footprint covered the actual footprint. However I didn't test this exhaustively.

@matthewhanson I agree that stactools should have some footprint extraction logic, as that's pretty doable if not a lot slower. Thanks for adding #85; this could be added and incorporated in a follow up to this PR

@kylebarron
Copy link
Contributor

the ANG was an over-representation

Yes I believe that's correct, though I haven't looked at it extensively. I was just saying that for some use cases it's helpful to have an under-representation so that you don't have seams.

lossyrob added 4 commits April 1, 2021 21:41
This code only handles single-granule products. The
cloud_coverage_assessment removed here is handled by eo:cloud_cover,
taken from the granule. The degraded data percentage is removed at the
product level and added at the granule level.
This information seems really great, especially if we can enable
search over those properties.
Also sprinkle in some type information.
@lossyrob
Copy link
Member Author

lossyrob commented Apr 2, 2021

Ah, thanks for clarifying. I definitely agree.

@lossyrob
Copy link
Member Author

lossyrob commented Apr 2, 2021

Thanks everyone for taking a look - y'all are awesome!

@lossyrob lossyrob merged commit 8815b14 into master Apr 2, 2021
@gadomski gadomski deleted the feature/landsat-aster branch April 29, 2021 12:11
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants