Subset catalog by tag #11

Open
jhamman opened this issue Jan 22, 2020 · 1 comment
jhamman commented Jan 22, 2020

Currently, most of the data in Pangeo's catalog is in the same cloud region. I expect that, in the future, we'll move to a more distributed storage system where we may want to point to data on multiple clouds/regions. For this reason, it will be convenient to subset data by cloud vendor and region. Perhaps something like this would work:

  cgiar_pet:
    ...
    metadata:
      ...
      tags:
        - ...
        - location: gcp-us-central1
    driver: zarr
    args:
      ...

Another option would be to rework the catalog nesting to represent this level of per-cloud structure.

On the Flask app side, I think it would be useful to understand what subsetting is possible and how we can leverage intake's search capabilities to support it.
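
To make the tag idea concrete, here is a minimal sketch of subsetting by the proposed `location` tag. It works on plain dicts, as if the catalog YAML had already been parsed, and does not depend on what intake's search ends up supporting. Assumptions not taken from the issue: the top-level `sources` key, the `atmosphere` tag, the `example_aws` and `untagged` entries, and the helper names `entry_location` and `subset_by_location` are all hypothetical.

```python
from typing import Any, Dict, Optional

# Hypothetical catalog, already parsed from YAML into plain dicts.
# Only cgiar_pet and its location value come from the proposal above;
# everything else is invented for illustration.
CATALOG: Dict[str, Any] = {
    "sources": {
        "cgiar_pet": {
            "driver": "zarr",
            "metadata": {"tags": ["atmosphere", {"location": "gcp-us-central1"}]},
        },
        "example_aws": {
            "driver": "zarr",
            "metadata": {"tags": [{"location": "aws-us-west-2"}]},
        },
        "untagged": {"driver": "zarr", "metadata": {}},
    }
}


def entry_location(entry: Dict[str, Any]) -> Optional[str]:
    """Return the value of the first `location` tag, or None if absent."""
    for tag in entry.get("metadata", {}).get("tags", []):
        # Plain string tags are skipped; only mapping-style tags carry a location.
        if isinstance(tag, dict) and "location" in tag:
            return tag["location"]
    return None


def subset_by_location(catalog: Dict[str, Any], location: str) -> Dict[str, Any]:
    """Keep only the sources whose location tag matches exactly."""
    return {
        name: entry
        for name, entry in catalog.get("sources", {}).items()
        if entry_location(entry) == location
    }


print(sorted(subset_by_location(CATALOG, "gcp-us-central1")))  # ['cgiar_pet']
```

Entries without a `location` tag are dropped from every subset here. Whether untagged data should instead be visible everywhere is a design choice the issue leaves open.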

@charlesbluca

I'll check whether the app (from gcloud's side) can discern which region a user is accessing it from. If the intention is to serve users data from the appropriate region, then reworking the catalogs to use region as a top-level filter would let us provide regional cloud catalogs.
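
One way to picture region as a top-level filter is to regroup a flat set of tagged entries into one sub-catalog per region. This is only a sketch under assumptions: entries are plain dicts, the location is stored as a `{location: ...}` item in `metadata.tags` as proposed above, and the `group_by_region` helper, the `example_aws` entry and the `"unknown"` bucket are hypothetical.

```python
from collections import defaultdict
from typing import Any, Dict


def group_by_region(
    sources: Dict[str, Any], default: str = "unknown"
) -> Dict[str, Dict[str, Any]]:
    """Regroup flat catalog entries into {region: {name: entry}}."""
    grouped: Dict[str, Dict[str, Any]] = defaultdict(dict)
    for name, entry in sources.items():
        region = default  # entries with no location tag land in the default bucket
        for tag in entry.get("metadata", {}).get("tags", []):
            if isinstance(tag, dict) and "location" in tag:
                region = tag["location"]
                break
        grouped[region][name] = entry
    return dict(grouped)


# Hypothetical flat sources; only cgiar_pet and its location come from the issue.
flat_sources = {
    "cgiar_pet": {"driver": "zarr", "metadata": {"tags": [{"location": "gcp-us-central1"}]}},
    "example_aws": {"driver": "zarr", "metadata": {"tags": [{"location": "aws-us-west-2"}]}},
    "untagged": {"driver": "zarr"},
}

regional = group_by_region(flat_sources)
print(sorted(regional))  # ['aws-us-west-2', 'gcp-us-central1', 'unknown']
```

If the app can detect a user's region, it could look up the matching key in a structure like `regional` and serve just that sub-catalog.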
