Simplify DRS and rootpath in user configuration by using defaults or lists #1165

Open
bsolino opened this issue Jun 9, 2021 · 0 comments
Labels
enhancement New feature or request

Is your feature request related to a problem? Please describe.
This is not a problem yet, but we anticipate that it may become one. We are currently working on reading datasets natively, and the original idea was to group them under a single project, native6. In practice, however, many of these datasets have their own folder structure or naming convention, so it seems to make more sense to implement them as separate projects (see #494 for discussion on this topic). We anticipate that this policy could increase the number of datasets enough to clutter the configuration files. Furthermore, it requires users to keep track of the datasets and update their configuration file accordingly.

However, in practice many users share a configuration, as they work on a small number of HPC systems. We already anticipate this and handle it by defining a different DRS for each of these machines in the config-developer.yml file, and by providing default configurations that users can uncomment in config-user.yml.

This process could be simplified if users only had to define once which machine they are working on, so that the corresponding DRS would be used by default. Any exceptions a user needs could be specified separately.

E.g., from:

# Site-specific entries: Jasmin
# Uncomment the lines below to locate data on JASMIN
drs:
  CMIP6: BADC
  CMIP5: BADC
  CMIP3: BADC
  CORDEX: BADC
  OBS: BADC
  OBS6: BADC
  obs4mips: BADC
  ana4mips: BADC

to

drs:
# Site-specific entries:
# Uncomment the lines below to locate data on JASMIN
  default: BADC

Exceptions could be handled using the current format, which would also make this backwards compatible:

drs:
  default: DKRZ
  CORDEX: BADC
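
A minimal sketch of how the lookup could work, assuming the `drs` section has already been loaded into a plain dictionary. The function name `resolve_drs` is hypothetical and not part of the current code base; it only illustrates the proposed fallback order (explicit project entry first, then `default`).

```python
def resolve_drs(drs_config, project):
    """Return the DRS for `project`, falling back to the `default` entry.

    `drs_config` is assumed to be the `drs` mapping from config-user.yml.
    """
    # An explicit per-project entry always wins (current behaviour).
    if project in drs_config:
        return drs_config[project]
    # Otherwise use the proposed site-wide default, if one is given.
    if "default" in drs_config:
        return drs_config["default"]
    raise KeyError(f"No DRS configured for project {project!r} and no default set")


drs = {"default": "DKRZ", "CORDEX": "BADC"}
print(resolve_drs(drs, "CMIP6"))   # falls back to the default entry
print(resolve_drs(drs, "CORDEX"))  # uses the explicit exception
```

A configuration without a `default` key behaves as it does today, which is what would keep the change backwards compatible.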

A similar approach could be used with rootpaths, although it would require defining the paths somewhere else.
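
As a rough illustration of what "somewhere else" could look like, the sketch below keeps per-site rootpaths in a lookup table shipped with the tool and merges user overrides on top. The names `SITE_ROOTPATHS` and `rootpaths_for` are hypothetical, the site key is an assumption, and only two of the DKRZ paths from this issue are included for brevity.

```python
# Hypothetical table of site defaults; in practice this could live in the
# repository configuration rather than in each user's config-user.yml.
SITE_ROOTPATHS = {
    "DKRZ": {
        "CMIP6": "/mnt/lustre02/work/ik1017/CMIP6/data/CMIP6",
        "OBS": "/mnt/lustre02/work/bd0854/DATA/ESMValTool2/OBS",
    },
}


def rootpaths_for(site, user_rootpath=None):
    """Return the site's default rootpaths with user entries taking precedence."""
    merged = dict(SITE_ROOTPATHS.get(site, {}))
    merged.update(user_rootpath or {})
    return merged


# A user on DKRZ who keeps a private copy of the OBS data:
paths = rootpaths_for("DKRZ", {"OBS": "/home/user/my_obs"})
print(paths["CMIP6"])
print(paths["OBS"])
```

With this layout, an update to the site table would reach every user who selected that site, while explicit user entries would still override it.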

An alternative to the "default" field would be to use lists. For example, DKRZ currently has one main rootpath, two main DRS and a few exceptions:

rootpath:
  CMIP6: /mnt/lustre02/work/ik1017/CMIP6/data/CMIP6
  CMIP5: /mnt/lustre02/work/bd0854/DATA/ESMValTool2/CMIP5_DKRZ
  CMIP3: /mnt/lustre02/work/bd0854/DATA/ESMValTool2/CMIP3
  CORDEX: /mnt/lustre02/work/ik1017/C3SCORDEX/data/c3s-cordex/output
  OBS: /mnt/lustre02/work/bd0854/DATA/ESMValTool2/OBS
  OBS6: /mnt/lustre02/work/bd0854/DATA/ESMValTool2/OBS
  obs4mips: /mnt/lustre02/work/bd0854/DATA/ESMValTool2/OBS
  ana4mips: /mnt/lustre02/work/bd0854/DATA/ESMValTool2/OBS
  native6: /mnt/lustre02/work/bd0854/DATA/ESMValTool2/RAWOBS
  RAWOBS: /mnt/lustre02/work/bd0854/DATA/ESMValTool2/RAWOBS
drs:
  CMIP6: DKRZ
  CMIP5: DKRZ
  CMIP3: DKRZ
  CORDEX: BADC
  obs4mips: default
  ana4mips: default
  OBS: default
  OBS6: default
  native6: default

This could be expressed as:

rootpath:
  CMIP6: /mnt/lustre02/work/ik1017/CMIP6/data/CMIP6
  CMIP5: /mnt/lustre02/work/bd0854/DATA/ESMValTool2/CMIP5_DKRZ
  CMIP3: /mnt/lustre02/work/bd0854/DATA/ESMValTool2/CMIP3
  CORDEX: /mnt/lustre02/work/ik1017/C3SCORDEX/data/c3s-cordex/output
  [OBS, OBS6, obs4mips, ana4mips]:
    /mnt/lustre02/work/bd0854/DATA/ESMValTool2/OBS
  [native6, RAWOBS]:
    /mnt/lustre02/work/bd0854/DATA/ESMValTool2/RAWOBS
drs:
  [CMIP6, CMIP5, CMIP3]: DKRZ
  CORDEX: BADC
  [obs4mips, ana4mips, OBS, OBS6, native6]: default
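
For the list variant, the grouped keys would need to be expanded back into one entry per project before the rest of the code uses them. The sketch below assumes the grouped keys have already been read into Python tuples; as far as I can tell, PyYAML's standard loaders reject sequence keys as unhashable, so the loading step itself would need extra work. `expand_grouped_keys` is a hypothetical helper name.

```python
def expand_grouped_keys(config):
    """Flatten {("A", "B"): value} style entries into one entry per project."""
    flat = {}
    for key, value in config.items():
        # A plain string key is treated as a group of one project.
        projects = key if isinstance(key, tuple) else (key,)
        for project in projects:
            if project in flat:
                raise ValueError(f"Project {project!r} is configured more than once")
            flat[project] = value
    return flat


drs = {
    ("CMIP6", "CMIP5", "CMIP3"): "DKRZ",
    "CORDEX": "BADC",
    ("obs4mips", "ana4mips", "OBS", "OBS6", "native6"): "default",
}
flat_drs = expand_grouped_keys(drs)
print(flat_drs["CMIP5"])
print(flat_drs["native6"])
```

The duplicate check is one way to catch a project that appears in two groups, which is an easy mistake to make with this format.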

Personally I prefer having a default, so updates in the repository configuration files are immediately and seamlessly available to the user. I also find using lists a bit clunky and difficult to read.

Another solution can be found in #795, which also contains a collection of related issues and suggestions for a redesign of the configuration files.

Would you be able to help out?
I currently feel this is very low priority. As I said before, this is not an issue yet; I'm only opening it for discussion and record-keeping, as we anticipate it may become one.
