Cookbook
We can fetch the list of all datasets updated within the last 24 hours, including each dataset's metadata record and its GeoJSON track, using a few scripts located under https://github.com/kerfoot/gdutils/tree/master/scripts/dac
Let's fetch the dataset ids for all datasets that have updated within the last 24 hours (default):
$ dataset_ids=$(search_datasets.py)
$ echo $dataset_ids
dataset_id ce_386-20200917T1943 cp_379-20200819T1718 cp_583-20200819T1925 ng222-20200908T1821 ng314-20200806T2040 ru29-20200908T1623
ru33-20201014T1746 ru34-20201003T1821 sam-20201008T0000 SG601-20200906T1631 SG609-20200719T1158 SG610-20200714T1252
SG630-20200719T1206 SG635-20200719T1224 SG649-20200715T1639 SG663-20200715T1659 SG664-20200714T1238 SG665-20200718T1343
SG669-20200717T1211 sp011-20201008T2220 sp013-20200910T1701 sp035-20201008T2226 sp041-20200902T1709 sp047-20200826T1831
sp052-20200630T2154 sp062-20200824T1618 sp064-20200826T1639 sp066-20200824T1631 sylvia-20201015T1425 UW157-20200917T0000
Dataset ids are printed to stdout by default, with the header 'dataset_id' as the first record. The script can optionally output the results as JSON or CSV records. Type search_datasets.py -h for a full list of options, including using ERDDAP's advanced search to specify min/max time bounds and/or a geographic bounding box.
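Because the header 'dataset_id' is printed as the first record, it is worth stripping it before looping over the ids. A minimal sketch (the script's output is simulated here with printf, since search_datasets.py itself is not assumed to be on the PATH):

```shell
# search_datasets.py prints the header 'dataset_id' first; drop it with tail.
# Simulated output stands in for: dataset_ids=$(search_datasets.py | tail -n +2)
dataset_ids=$(printf 'dataset_id\nru29-20200908T1623\nsp011-20201008T2220\n' | tail -n +2)
echo $dataset_ids
```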
Once we have the dataset ids, we can loop through each dataset id and fetch the metadata records and the geoJSON track, writing each to a separate file:
$ for dataset_id in $dataset_ids
do
    # Skip the 'dataset_id' header record printed by search_datasets.py
    [ "$dataset_id" = "dataset_id" ] && continue
    get_dataset.py $dataset_id > "${dataset_id}_metadata.json";
    get_dataset_track.py $dataset_id > "${dataset_id}_track.json";
done
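After the loop finishes, a quick sanity check can confirm that each downloaded metadata file actually parses as JSON before further processing. This is a hypothetical sketch, not part of the gdutils scripts; it assumes python3 is on the PATH, and a stand-in file is created so the snippet is self-contained:

```shell
# Stand-in for a file written by the loop above.
echo '{"dataset_id": "ru29-20200908T1623"}' > ru29-20200908T1623_metadata.json

# Validate each *_metadata.json file; report files that fail to parse.
for f in *_metadata.json; do
    if python3 -m json.tool "$f" > /dev/null 2>&1; then
        echo "valid: $f"
    else
        echo "invalid: $f" >&2
    fi
done
```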