[BUG] Missing metadata.yaml files for existing CMIP5 & CMIP6 catalogues #200

Closed
marc-white opened this issue Sep 25, 2024 · 17 comments · Fixed by #205

@marc-white
Collaborator

Describe the bug

While investigating #197, we found that the directory containing the metadata.yaml files for the existing CMIP5 and CMIP6 catalogues (cmip5_al33, cmip5_rr3, cmip6_fs38, cmip6_oi10) has gone missing. The directory referred to in the access_nri_intake_catalog config (/g/data/tm70/intake) no longer exists.

To Reproduce

See /g/data/tm70.

Additional context

The metadata can be recovered from the existing catalog via, e.g.,

import intake
cat = intake.cat.access_nri
# Metadata for one of the affected catalogues
cat["cmip6_fs38"].metadata
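
To pull the metadata for all four affected catalogues in one go, something like the following might work. This is only a minimal sketch: `collect_metadata` is a hypothetical helper (not part of intake or access-nri-intake-catalog), and it assumes nothing beyond each entry exposing a `.metadata` mapping, as in the snippet above.

```python
def collect_metadata(cat, names):
    """Return {name: dict(metadata)} for each named entry of a catalog-like mapping.

    Hypothetical helper: only assumes that cat[name].metadata is a mapping.
    """
    return {name: dict(cat[name].metadata) for name in names}


# The four catalogues named in this issue.
CMIP_NAMES = ["cmip5_al33", "cmip5_rr3", "cmip6_fs38", "cmip6_oi10"]

# Usage on gadi (not run here; needs the access_nri catalog to be available):
#   import intake
#   recovered = collect_metadata(intake.cat.access_nri, CMIP_NAMES)
```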
@marc-white marc-white added the bug Something isn't working label Sep 25, 2024
@marc-white
Collaborator Author

Storing existing metadata here to avoid future loss:

cmip5_al33

{'contact': 'NCI',
 'created': None,
 'description': 'Replicated CMIP5-era datasets catalogued by NCI',
 'email': 'help@nci.org.au',
 'experiment_uuid': '658c95cc-c299-450c-82a1-b2b2308f7c6e',
 'keywords': ['cmip'],
 'license': None,
 'long_description': 'All CMIP5-era replicated data contained under the project al33.  All file versions present are in the listing. Maintained By: NCI Contact: help@nci.org.au References: https://pcmdi.llnl.gov/mips/cmip5/',
 'model': ['CMIP5'],
 'name': 'cmip5_al33',
 'nominal_resolution': [None],
 'notes': 'null',
 'parent_experiment': None,
 'reference': None,
 'related_experiments': [None],
 'url': 'https://geonetwork.nci.org.au/geonetwork/srv/eng/catalog.search#/metadata/f9489_5106_5649_5038',
 'version': None,
 'catalog_dir': ''}

cmip5_rr3

{'contact': 'NCI',
 'created': None,
 'description': 'Australian CMIP5-era datasets catalogued by NCI',
 'email': 'help@nci.org.au',
 'experiment_uuid': '473d0c44-ab66-458c-b32e-1e1774175853',
 'keywords': ['cmip'],
 'license': None,
 'long_description': 'All CMIP5-era Australian published data contained under the project rr3.  All file versions present are in the listing. Maintained By: NCI Contact: help@nci.org.au References: https://pcmdi.llnl.gov/mips/cmip5/',
 'model': ['CMIP5'],
 'name': 'cmip5_rr3',
 'nominal_resolution': [None],
 'notes': 'null',
 'parent_experiment': None,
 'reference': None,
 'related_experiments': [None],
 'url': 'https://geonetwork.nci.org.au/geonetwork/srv/eng/catalog.search#/metadata/f7448_2157_9857_1076',
 'version': None,
 'catalog_dir': ''}

cmip6_fs38

{'contact': 'NCI',
 'created': None,
 'description': 'Australian CMIP6-era datasets catalogued by NCI',
 'email': 'help@nci.org.au',
 'experiment_uuid': 'dfdeb421-5c56-4d58-a0b2-04b717e5cff7',
 'keywords': ['cmip'],
 'license': None,
 'long_description': 'All CMIP6-era Australian published data contained under the project fs38.  All file versions present are in the listing. Maintained By: NCI Contact: help@nci.org.au References: https://pcmdi.llnl.gov/CMIP6/',
 'model': ['CMIP6'],
 'name': 'cmip6_fs38',
 'nominal_resolution': [None],
 'notes': 'null',
 'parent_experiment': None,
 'reference': None,
 'related_experiments': [None],
 'url': 'https://geonetwork.nci.org.au/geonetwork/srv/eng/catalog.search#/metadata/f3154_9976_7262_7595',
 'version': None,
 'catalog_dir': ''}

cmip6_oi10

{'contact': 'NCI',
 'created': None,
 'description': 'Replicated CMIP6-era datasets catalogued by NCI',
 'email': 'help@nci.org.au',
 'experiment_uuid': 'b05038ca-8c78-4ca6-a914-ae33dd9abffe',
 'keywords': ['cmip'],
 'license': None,
 'long_description': 'All CMIP6-era replicated data contained under the project oi10.  All file versions present are in the listing. Maintained By: NCI Contact: help@nci.org.au References: https://pcmdi.llnl.gov/CMIP6/',
 'model': ['CMIP6'],
 'name': 'cmip6_oi10',
 'nominal_resolution': [None],
 'notes': 'null',
 'parent_experiment': None,
 'reference': None,
 'related_experiments': [None],
 'url': 'https://geonetwork.nci.org.au/geonetwork/srv/eng/catalog.search#/metadata/f5194_5909_8003_9216',
 'version': None,
 'catalog_dir': ''}
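
Before turning these dumps into replacement files, a quick sanity check could catch copy-paste damage. The sketch below is an assumption-laden illustration, not the project's validation: `EXPECTED_KEYS` is simply the set of keys seen in the four dumps above (not the official metadata schema), and `check_metadata` is a hypothetical helper.

```python
import uuid

# Keys observed in every recovered dict above (not an official schema).
EXPECTED_KEYS = {
    "contact", "created", "description", "email", "experiment_uuid",
    "keywords", "license", "long_description", "model", "name",
    "nominal_resolution", "notes", "parent_experiment", "reference",
    "related_experiments", "url", "version", "catalog_dir",
}


def check_metadata(name, meta):
    """Return a list of problems found in one recovered metadata dict."""
    problems = []
    missing = EXPECTED_KEYS - meta.keys()
    if missing:
        problems.append(f"missing keys: {sorted(missing)}")
    if meta.get("name") != name:
        problems.append(f"name is {meta.get('name')!r}, expected {name!r}")
    try:
        # Raises ValueError if the string is not a well-formed UUID.
        uuid.UUID(str(meta.get("experiment_uuid")))
    except ValueError:
        problems.append("experiment_uuid is not a valid UUID")
    return problems
```

An empty list means the dict has the same keys as the dumps, a matching name, and a well-formed UUID; it says nothing about whether the values themselves are right.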

@marc-white marc-white self-assigned this Sep 25, 2024
@marc-white
Collaborator Author

I've created replacement YAML files for these four experiments. I've mostly just copied what I got from the .metadata call on the existing catalog, but I've taken the opportunity to add the available realms for each experiment. @rbeucher and/or @dougiesquire, please review: they're available on gadi under /scratch/tm70/mcw120.

Once we're happy with the YAML files, I'll get them properly placed and do a PR to update the references within the code.

@rbeucher
Member

Hi @marc-white

Looks good to me

@marc-white
Collaborator Author

I don't have write permission to /g/data/dk92/catalog/v2/esm/ where all of these experiments are kept. Should we store the metadata on /g/data/tm70 again, or (preferred solution) do we know someone who has write access for dk92?

@rbeucher
Member

rbeucher commented Sep 30, 2024

We can't store on dk92. However, as the catalog is in xp65, shouldn't we store those YAML files there? tm70 is internal to ACCESS-NRI.

@marc-white
Collaborator Author

I presumed we wanted to store the metadata.yaml close to the data that it describes, e.g., see the metadata.yaml locations for ACCESS-CM2 in access-nri-intake-catalog/config/access-cm2.yaml.

@rbeucher
Member

Yes but for NCI data collections we don't have write access and cannot add anything there.

@marc-white
Collaborator Author

Create a new directory under xp65 for them then? E.g., /g/data/xp65/admin/metadata/<experiment>/metadata.yaml?

@rbeucher
Member

I see you have created /g/data/xp65/admin/access-nri-intake-catalog
Let's use that

@marc-white
Collaborator Author

@rbeucher I think that's you who created that directory, and it seems to be a copy of the access-nri-intake-catalog source code...

@rbeucher
Member

:-) I don't remember doing that... :-/
I think the access-nri-intake-catalog/config is a good place.

@marc-white
Collaborator Author

Do you want me to blow away the contents of that access-nri-intake-catalog directory on xp65 and go from there?

@rbeucher
Member

The build_all.sh script in access-nri-intake-catalog/bin used to get the config files from /g/data/tm70/ds0092/projects/access-nri-intake-catalog/config. We can change it to /g/data/xp65/admin/access-nri-intake-catalog/config

@rbeucher
Member

Oh, I think I am mistaken... Not sure what those were: CONFIGS=( cmip5.yaml cmip6.yaml access-om2.yaml access-cm2.yaml access-esm1-5.yaml )

@rbeucher
Member

OK... I get it now. Yes I think your suggestion is good. /g/data/xp65/admin/intake/metadata should work
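
For illustration, the resulting file locations could be sketched as below. The base directory is the one named just above; the per-experiment subdirectory (`<experiment>/metadata.yaml`) is an assumption carried over from the earlier suggestion in this thread, and `metadata_path` is a hypothetical helper.

```python
from pathlib import PurePosixPath

# Base directory agreed on in this thread.
METADATA_BASE = PurePosixPath("/g/data/xp65/admin/intake/metadata")

# The four catalogues named in this issue.
EXPERIMENTS = ["cmip5_al33", "cmip5_rr3", "cmip6_fs38", "cmip6_oi10"]


def metadata_path(experiment, base=METADATA_BASE):
    """Assumed layout: <base>/<experiment>/metadata.yaml."""
    return base / experiment / "metadata.yaml"


for exp in EXPERIMENTS:
    print(metadata_path(exp))
```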

@marc-white
Collaborator Author

@rbeucher those are the YAML files that define where all the data & associated metadata for the contents of the access-nri-intake-catalog live, and they're kept in the access-nri-intake-catalog/config directory of the repository. However, they're not included as package data, so it looks like @dougiesquire had a checkout of the repo at that tm70 path so the build_all.sh script could access them.

They're distinct from the per-experiment metadata.yaml files that describe what is contained within each experiment.
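
Because the config files are not shipped as package data, a build depends on a checkout being present at whatever directory the script points to. A rough sketch of checking that up front is below; the file names are the CONFIGS list quoted earlier in this thread, while `resolve_configs` is a hypothetical helper and not something build_all.sh is known to do.

```python
from pathlib import Path

# Config file names quoted from build_all.sh earlier in this thread.
CONFIGS = ["cmip5.yaml", "cmip6.yaml", "access-om2.yaml",
           "access-cm2.yaml", "access-esm1-5.yaml"]


def resolve_configs(config_dir, names=CONFIGS):
    """Split the expected config files into (found, missing) lists of paths."""
    config_dir = Path(config_dir)
    found, missing = [], []
    for name in names:
        path = config_dir / name
        (found if path.is_file() else missing).append(path)
    return found, missing


# Usage (not run here), e.g. against the checkout location proposed above:
#   found, missing = resolve_configs(
#       "/g/data/xp65/admin/access-nri-intake-catalog/config")
```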

@rbeucher
Member

Yes sorry, I got confused.
