Code for creating EarthNet-style minicubes.
This package creates minicubes from cloud storage using STAC catalogues. A minicube usually contains a satellite image time series of Sentinel 2 imagery alongside other complementary information, all re-gridded to a common grid. This package implements a cloud mask based on deep learning, which allows for analysis-ready Sentinel 2 imagery.
The package is currently under development, so expect bugs and please report them!
The modifications to the code now allow more flexibility in querying and downloading, specifically regarding ERA-5 climate reanalysis data. Whereas previously the package had limited spatial coverage of ERA-5 data, it is now possible to request any region globally.
ERA-5 has an hourly temporal resolution, but is often aggregated to coarser resolutions when combined with other spatio-temporal datasets (e.g. Sentinel-2 with a 5-daily frequency). The code allows the temporal aggregation of ERA-5 variables to any desired resolution, each with its own statistic (mean, minimum, maximum, etc.). The ERA-5 timestamps can also be matched automatically to the Sentinel-2 time series if the two data sources are requested together.
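To make the aggregation idea concrete, here is a minimal sketch that bins an hourly series into n-day windows and reduces each window with a chosen statistic. It uses only the Python standard library and synthetic data. The function name `aggregate_n_daily` and the `AGGREGATORS` table are assumptions made for illustration; they are not the package's internals, which operate on xarray data.

```python
from datetime import datetime, timedelta
from statistics import mean, median

# Statistic names mirror the ones accepted by agg_list; the mapping itself is illustrative.
AGGREGATORS = {"min": min, "max": max, "mean": mean, "median": median, "sum": sum}


def aggregate_n_daily(times, values, n_days, how="mean"):
    """Aggregate a time-sorted hourly series into n-day bins.

    Bins start at midnight of the first timestamp's date.
    Returns a list of (bin_start, aggregated_value) tuples.
    """
    if not times:
        return []
    start = datetime.combine(times[0].date(), datetime.min.time())
    bins = {}
    for t, v in zip(times, values):
        index = (t - start).days // n_days  # which n-day window this hour falls into
        bins.setdefault(index, []).append(v)
    func = AGGREGATORS[how]
    return [
        (start + timedelta(days=i * n_days), func(vals))
        for i, vals in sorted(bins.items())
    ]


# Ten days of synthetic hourly data (values are made up for the example).
first = datetime(2021, 7, 1)
times = [first + timedelta(hours=h) for h in range(240)]
temperature = [float(h % 24) for h in range(240)]
radiation = [1.0] * 240

five_daily_max = aggregate_n_daily(times, temperature, n_days=5, how="max")
five_daily_sum = aggregate_n_daily(times, radiation, n_days=5, how="sum")
```

Each variable gets its own statistic here, which is the same idea as pairing each band with one entry of `agg_list`.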
- Loading the code as a package
- Download the repository
# Add the path to the repository
import sys
sys.path.insert(0, '/Absolute_path_to_repo/earthnet-minicuber/')
# Import the module
import earthnet_minicuber as emc  # the examples below call emc.load_minicube and emc.plot_rgb
- Creating a dictionary with specifications of the desired minicube
specs = {
"lon_lat": (43.598946, 3.087414), # center pixel
"xy_shape": (256, 256), # width, height of cutout around center pixel
"resolution": 10, # in meters.. will use this on a local UTM grid..
"time_interval": "2021-07-01/2021-07-31",
"providers": [
{
"name": "s2",
"kwargs": {"bands": ["B02", "B03", "B04", "B8A"], "best_orbit_filter": True, "five_daily_filter": False, "brdf_correction": True, "cloud_mask": True, "aws_bucket": "planetary_computer"}
},
{
"name": "era5",
"kwargs": {"bands": ['sr', 't', 'mint'], "aws_bucket": "planetary_computer", "n_daily_filter": None, "agg_list": ['min', 'max', 'sum'], "match_s2": True}
},
{
"name": "s1",
"kwargs": {"bands": ["vv", "vh"], "speckle_filter": True, "speckle_filter_kwargs": {"type": "lee", "size": 9}, "aws_bucket": "planetary_computer"}
},
{
"name": "ndviclim",
"kwargs": {"bands": ["mean", "std"]}
},
{
"name": "cop",
"kwargs": {}
},
{
"name": "esawc",
"kwargs": {"bands": ["lc"], "aws_bucket": "planetary_computer"}
}
]
}
- Downloading the minicube
mc = emc.load_minicube(specs, compute = True)
- Plotting cloud-masked Sentinel 2 RGB imagery
emc.plot_rgb(mc)
See notebooks/example.ipynb for a more detailed usage example.
The minicuber is centered around the concept of data providers, which wrap a data source and handle data loading of that source. The emc.Minicuber class then manages these data providers by telling them the spatio-temporal range for which data needs to be loaded and afterwards re-gridding all data to a common reference frame (UTM grid).
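The provider pattern described above can be illustrated with a toy sketch. All names here (`ToyProvider`, `ConstantProvider`, `ToyMinicuber`) and the bounding box values are hypothetical and do not reflect the actual classes or signatures in the package. The re-gridding step is omitted, and plain dictionaries stand in for the real gridded data.

```python
from abc import ABC, abstractmethod


class ToyProvider(ABC):
    """Hypothetical base class: wraps one data source."""

    def __init__(self, name):
        self.name = name

    @abstractmethod
    def load_data(self, bbox, time_interval):
        """Return a dict mapping variable names to data for the requested range."""


class ConstantProvider(ToyProvider):
    """Stand-in data source that returns one constant value."""

    def __init__(self, name, value):
        super().__init__(name)
        self.value = value

    def load_data(self, bbox, time_interval):
        return {f"{self.name}_value": self.value}


class ToyMinicuber:
    """Hypothetical manager: passes the same spatio-temporal range to every provider."""

    def __init__(self, bbox, time_interval, providers):
        self.bbox = bbox
        self.time_interval = time_interval
        self.providers = providers

    def load(self):
        cube = {"bbox": self.bbox, "time_interval": self.time_interval}
        for provider in self.providers:
            # Each provider loads its own source; results are merged into one cube.
            cube.update(provider.load_data(self.bbox, self.time_interval))
        return cube


cuber = ToyMinicuber(
    bbox=(3.08, 43.59, 3.10, 43.61),  # made-up extent for the example
    time_interval="2021-07-01/2021-07-31",
    providers=[ConstantProvider("s2", 0.1), ConstantProvider("era5", 290.0)],
)
cube = cuber.load()
```

The point of the design is that adding a new data source only requires a new provider; the manager does not need to change.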
The Sentinel 2 provider loads and processes Copernicus Sentinel 2 imagery.
Kwargs:
- bands: choose any subset from ["B01", "B02", "B03", "B04", "B05", "B06", "B07", "B08", "B8A", "B09", "B11", "B12", "WVP", "AOT", "SCL"].
- aws_bucket: We currently support data loading from three cloud buckets: Microsoft Planetary Computer ("planetary_computer"), the Element84 AWS bucket ("element84") and the DigitalEarthAfrica AWS bucket ("dea"). We recommend using the Microsoft Planetary Computer with the keyword argument aws_bucket = "planetary_computer".
- best_orbit_filter: Sentinel 2 has a regular revisit interval of 5 days. However, the interval between captures is sometimes shorter due to off-nadir captures, which change the viewing angle of the scene. If True, this filter finds the best orbit and then only returns imagery from a regular 5-daily cycle.
- five_daily_filter: If True, returns a regular 5-daily cycle starting with the first date in full_time_interval. It has no effect if best_orbit_filter is used.
- brdf_correction: If True, applies a BRDF correction based on the Sentinel 2 metadata (illumination angles).
- cloud_mask: If True, creates a cloud and cloud shadow mask based on deep learning. It automatically finds the best available cloud mask for the requested bands.
- cloud_mask_rescale_factor: If using the cloud mask at a resolution coarser than 10 m, set this rescaling factor to the multiple of 10 m that you are requesting. E.g. if resolution = 20, set cloud_mask_rescale_factor = 2.
- correct_processing_baseline: If True (default), corrects the shift of +1000 that exists in Sentinel 2 data with processing baseline >= 4.0.
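Two of the keyword arguments above involve simple arithmetic that a short sketch can make explicit: the cloud mask rescale factor is the requested resolution divided by 10 m, and a 5-daily cycle keeps only dates that fall a multiple of 5 days after a reference date. The helper names below are assumptions for illustration; the package's own filters may behave differently (for example in how they handle missing acquisitions).

```python
from datetime import date


def cloud_mask_rescale_factor(resolution_m):
    """Return the multiple of 10 m that a requested resolution corresponds to."""
    if resolution_m < 10 or resolution_m % 10 != 0:
        raise ValueError("resolution must be a positive multiple of 10 m")
    return resolution_m // 10


def five_daily_dates(dates, first_date):
    """Keep only dates on a 5-day cycle that starts at first_date."""
    kept = []
    for d in dates:
        offset = (d - first_date).days
        if offset >= 0 and offset % 5 == 0:
            kept.append(d)
    return kept


factor = cloud_mask_rescale_factor(20)

# Made-up acquisition dates, including two off-cycle captures.
acquisitions = [
    date(2021, 7, 1),
    date(2021, 7, 3),
    date(2021, 7, 6),
    date(2021, 7, 8),
    date(2021, 7, 11),
]
regular = five_daily_dates(acquisitions, first_date=date(2021, 7, 1))
```

With these inputs, `factor` is 2 and `regular` keeps the 1st, 6th and 11th of July.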
The ERA5 provider loads and processes hourly ECMWF climate reanalysis data.
Kwargs:
- bands: choose any subset from ["sp", "tp", "sr", "t", "maxt", "mint", "sea_t", "east_wind_10", "east_wind_100", "north_wind_10", "north_wind_100", "ap", "dp"].
  - sp = surface pressure
  - tp = total precipitation
  - sr = solar radiation
  - (min/max)t = (min/max) temperature
  - sea_t = sea surface temperature
  - east/north_wind_10/100 = eastward/northward wind at 10/100 metres
  - ap = air pressure at sea level
  - dp = dew point temperature
  More on the variables here: https://planetarycomputer.microsoft.com/dataset/era5-pds
- aws_bucket: We currently support data loading from two cloud buckets: Microsoft Planetary Computer ("planetary_computer") and an AWS bucket ("s3"). Because AWS allows downloading more recent dates, we advise using "s3".
- n_daily_filter: Integer. Aggregates the data (using the mean) to an n-daily resolution, starting from the first date available in the data.
- agg_list: List of aggregation functions, one per variable, chosen from ['min', 'max', 'mean', 'median', 'sum']. The list must be as long as the number of bands and in the same order as the bands. For example, when querying ['t', 'sp', 'sr'] with agg_list = ['min', 'sum', 'mean'], 't' is aggregated using 'min', 'sp' using 'sum' and 'sr' using 'mean'. If None and n_daily_filter is provided, all variables are aggregated with 'mean' by default.
- match_s2: If True, matches the timestamps to those of Sentinel-2 (5-daily), using the first occurrence of Sentinel-2 data as the first date. This overrides n_daily_filter. All variables are aggregated using 'mean' unless specified otherwise with agg_list. Attention: Sentinel-2 must be listed before ERA5 in specs in this case.
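The constraints on the ERA5 keyword arguments (agg_list as long as bands, only the listed statistics, Sentinel-2 listed first when match_s2 is used) are easy to get wrong in a specs dictionary. The sketch below checks them before any download starts. The function `check_era5_spec` is a hypothetical helper written for this example and is not part of the package.

```python
ALLOWED_AGGS = {"min", "max", "mean", "median", "sum"}


def check_era5_spec(specs):
    """Return a list of human-readable problems found in the ERA5 part of specs."""
    problems = []
    providers = specs.get("providers", [])
    names = [p["name"] for p in providers]
    for position, provider in enumerate(providers):
        if provider["name"] != "era5":
            continue
        kwargs = provider.get("kwargs", {})
        bands = kwargs.get("bands", [])
        agg_list = kwargs.get("agg_list")
        if agg_list is not None:
            if len(agg_list) != len(bands):
                problems.append("agg_list must have one entry per band")
            unknown = sorted(set(agg_list) - ALLOWED_AGGS)
            if unknown:
                problems.append(f"unknown aggregation(s): {unknown}")
        # match_s2 needs the Sentinel-2 provider earlier in the list.
        if kwargs.get("match_s2") and "s2" not in names[:position]:
            problems.append("match_s2 requires the s2 provider to be listed before era5")
    return problems


good_specs = {
    "providers": [
        {"name": "s2", "kwargs": {"bands": ["B02", "B03", "B04"]}},
        {"name": "era5", "kwargs": {"bands": ["sr", "t", "mint"],
                                    "agg_list": ["min", "max", "sum"],
                                    "match_s2": True}},
    ]
}
bad_specs = {
    "providers": [
        {"name": "era5", "kwargs": {"bands": ["sr", "t"],
                                    "agg_list": ["mean"],
                                    "match_s2": True}},
        {"name": "s2", "kwargs": {"bands": ["B02"]}},
    ]
}

good_problems = check_era5_spec(good_specs)
bad_problems = check_era5_spec(bad_specs)
```

Here `good_problems` is empty, while `bad_problems` reports the length mismatch and the provider ordering.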
This package is built on top of stackstac, which allows accessing data stored in cloud-optimized GeoTIFFs with xarray.
Similar to this package, cubo provides a high-level interface to stackstac.
This project is a continuation of the work developed by Requena-Mesa et al. https://github.com/earthnet2021/earthnet-minicuber