Could the IPT take DwCA that are updated online? #2608

ManonGros · 2024-12-03T09:48:19Z

The Specify platforms can generate Darwin Core Archives and make them available online. These archives can be updated automatically.
Usually, we ask publishers to register the Archive endpoints directly (or helpdesk does it for them).

However, the metadata associated with these datasets isn't always what the publishers would like to share on GBIF. @spalp asked whether it would be possible to have the Darwin Core Archive content in the IPT but take the EML from the IPT.

mike-podolskiy90 · 2024-12-03T09:57:29Z

Thanks Marie
cc @spalp

spalp · 2024-12-03T10:53:37Z

Thanks @marie. My question was whether some of the metadata fields could be taken as they are from the archive, while others are automatically updated or added when the archive is ingested. The fields that are almost certain to change between archive versions are:

Coverage: temporalCoverage, geographicCoverage and taxonomicCoverage
Additional metadata: dateStamp, citation

I just talked with @mike-podolskiy90 and he assured me that even if we fetch a complete DwC-A from a URL, together with its EML, the option to automatically infer geographic, temporal and taxonomic scope, if previously selected, should not be affected.

So, the only new feature I would like is the ability to provide a URL to a DwC-A to be regularly monitored and published via the IPT.
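
To make "regularly monitored" concrete, here is a minimal sketch of one way a change check could work: download the archive and compare a digest of its bytes with the digest recorded at the last publication. This is only an illustration, not how the IPT works today; the function names and the idea of storing a digest per resource are assumptions.

```python
import hashlib
import urllib.request
from typing import Optional


def archive_digest(data: bytes) -> str:
    """Return a SHA-256 hex digest of the raw archive bytes."""
    return hashlib.sha256(data).hexdigest()


def needs_republish(data: bytes, last_digest: Optional[str]) -> bool:
    """True when the archive differs from the last published version.

    last_digest is None when nothing has been published yet
    (hypothetical bookkeeping, not an existing IPT field).
    """
    return archive_digest(data) != last_digest


def fetch_archive(url: str, timeout: float = 60.0) -> bytes:
    """Download the archive from the URL supplied by the publisher."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read()
```

A scheduled job could call `fetch_archive` and trigger a new publication only when `needs_republish` returns true, so unchanged archives do not produce new versions.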

mike-podolskiy90 · 2024-12-03T11:44:37Z

Thanks Salza
This sounds like a useful feature. I would like to gather more opinions on how this should be implemented.

mike-podolskiy90 · 2024-12-04T09:52:25Z

My idea here is to create a new source type: DwCA (or URL/DwCA, we can discuss). The IPT takes the archive from the provided URL and publishes it. I am not sure whether the IPT should unpack the archive and reassemble it, with validation and all. I suppose it should, to ensure the quality of the DwCA.
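
A rough sketch of what a minimal structural check on a fetched archive could look like, before any reassembly. It assumes the archive carries a `meta.xml` descriptor at its root and only verifies that the files the descriptor names are present; the function name and the messages are made up for illustration, and real validation would go much further.

```python
import io
import zipfile
import xml.etree.ElementTree as ET
from typing import List

# Namespace of the Darwin Core text descriptor (meta.xml).
DWC_TEXT_NS = "{http://rs.tdwg.org/dwc/text/}"


def check_archive(data: bytes) -> List[str]:
    """Return a list of structural problems; an empty list means none found."""
    problems: List[str] = []
    try:
        archive = zipfile.ZipFile(io.BytesIO(data))
    except zipfile.BadZipFile:
        return ["not a valid zip file"]
    with archive:
        names = set(archive.namelist())
        if "meta.xml" not in names:
            return ["meta.xml is missing"]
        try:
            root = ET.fromstring(archive.read("meta.xml"))
        except ET.ParseError:
            return ["meta.xml is not well-formed XML"]
        if root.find(f"{DWC_TEXT_NS}core") is None:
            problems.append("meta.xml has no <core> element")
        # Every data file named by the core or an extension must exist.
        for location in root.iter(f"{DWC_TEXT_NS}location"):
            name = (location.text or "").strip()
            if name not in names:
                problems.append(f"data file not found: {name}")
        # The metadata attribute points at the EML document, if any.
        metadata = root.get("metadata")
        if metadata and metadata not in names:
            problems.append(f"metadata file not found: {metadata}")
    return problems
```

An archive that fails such a check could be rejected before it replaces the currently published version.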

EML will be taken from the archive, but with the ability to automatically infer coverage metadata.
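
For illustration only, inferring geographic and temporal coverage from the core data file could look roughly like the sketch below. It assumes a tab-delimited file with a header row of Darwin Core term names and `eventDate` values that start with an ISO `YYYY-MM-DD` date; other values are skipped. The function name and return shape are invented here and do not reflect the IPT's actual inference code.

```python
import csv
import io
from typing import Dict, List, Optional, Tuple


def infer_coverage(core_text: str) -> Dict[str, Optional[Tuple]]:
    """Derive a bounding box and a date range from occurrence rows."""
    reader = csv.DictReader(io.StringIO(core_text), delimiter="\t")
    lats: List[float] = []
    lons: List[float] = []
    dates: List[str] = []
    for row in reader:
        try:
            lat = float(row.get("decimalLatitude") or "")
            lon = float(row.get("decimalLongitude") or "")
        except ValueError:
            pass  # row has no usable coordinates
        else:
            lats.append(lat)
            lons.append(lon)
        # Keep only full dates; ISO dates sort correctly as strings.
        date = (row.get("eventDate") or "").strip()[:10]
        if len(date) == 10:
            dates.append(date)
    # Bounding box order: west, south, east, north.
    bbox = (min(lons), min(lats), max(lons), max(lats)) if lats else None
    date_range = (min(dates), max(dates)) if dates else None
    return {"bounding_box": bbox, "date_range": date_range}
```

The result could then overwrite the geographicCoverage and temporalCoverage taken from the archive's EML when the publisher has opted into automatic inference.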

@gbif/dataproducts Andrea and Cecilie, maybe you have something to add here please

mike-podolskiy90 self-assigned this Dec 3, 2024

mike-podolskiy90 added this to the 3.2 milestone Dec 3, 2024

mike-podolskiy90 added the Type-NewFeature label Dec 3, 2024
