
Cookbook

John Kerfoot edited this page Oct 29, 2020 · 3 revisions



Fetch Active Datasets Metadata and Track

We can fetch the list of all datasets updated within the last 24 hours, including each dataset's metadata record and geoJSON track, using a few scripts located under https://github.com/kerfoot/gdutils/tree/master/scripts/dac.

Let's fetch the dataset ids for all datasets that have updated within the last 24 hours (default):

$ dataset_ids=$(search_datasets.py)
$ echo $dataset_ids
    dataset_id ce_386-20200917T1943 cp_379-20200819T1718 cp_583-20200819T1925 ng222-20200908T1821 ng314-20200806T2040 ru29-20200908T1623 
    ru33-20201014T1746 ru34-20201003T1821 sam-20201008T0000 SG601-20200906T1631 SG609-20200719T1158 SG610-20200714T1252              
    SG630-20200719T1206 SG635-20200719T1224 SG649-20200715T1639 SG663-20200715T1659 SG664-20200714T1238 SG665-20200718T1343          
    SG669-20200717T1211 sp011-20201008T2220 sp013-20200910T1701 sp035-20201008T2226 sp041-20200902T1709 sp047-20200826T1831          
    sp052-20200630T2154 sp062-20200824T1618 sp064-20200826T1639 sp066-20200824T1631 sylvia-20201015T1425 UW157-20200917T0000

Dataset ids are printed to stdout by default, with the header 'dataset_id' as the first record. The script can optionally output the results as JSON or CSV records. Type search_datasets.py -h for a full list of options, including using ERDDAP's advanced search to specify min/max time bounds and/or a geographic bounding box.
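Because the header 'dataset_id' is returned as the first record, it is worth stripping it before further processing. A minimal sketch (the id list here is a truncated, hard-coded sample standing in for live script output):

```shell
# Simulated output of search_datasets.py; in practice this comes from the script.
dataset_ids="dataset_id ce_386-20200917T1943 ru29-20200908T1623"

# Split on whitespace and drop the 'dataset_id' header record.
ids=$(echo "$dataset_ids" | tr ' ' '\n' | grep -v '^dataset_id$')
echo "$ids"
```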

Once we have the dataset ids, we can loop through each dataset id and fetch the metadata records and the geoJSON track, writing each to a separate file:

$ for dataset_id in $dataset_ids
  do
      # skip the 'dataset_id' header record
      [ "$dataset_id" = "dataset_id" ] && continue
      get_dataset.py $dataset_id > "${dataset_id}_metadata.json";
      get_dataset_track.py $dataset_id > "${dataset_id}_track.json";
  done
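After the loop completes, a quick sanity check that each file contains valid JSON can catch failed or truncated downloads. This sketch assumes python3 is on the PATH and that the files follow the naming pattern used above:

```shell
# Flag any metadata or track file that is not parseable JSON.
for f in *_metadata.json *_track.json; do
    python3 -m json.tool "$f" > /dev/null 2>&1 || echo "Invalid JSON: $f"
done
```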