Clarify scope of DCAT - Datasets, Distributions, Services? #172

Closed
dr-shorthair opened this issue Mar 22, 2018 · 15 comments

Comments

@dr-shorthair
Contributor

dr-shorthair commented Mar 22, 2018

DCAT version 1 supported cataloging of Datasets. Two of the Use Cases describe the need to also support cataloguing of Services:

There is evidence of use of complicated, indirect approaches to resolve this in existing deployments - see #116 (comment) #166 (comment)

The following issues have discussed aspects of the problem and links between Datasets, Distributions and Distribution services in the context of a DCAT Catalog: #56, #116, #124, #145, #166
Some of these discuss potential solutions before the requirements have been clearly articulated and agreed.

This issue is to clarify the scope of DCAT. When this has been clarified we can then determine the best way to meet the requirements, through creative use or adaptation of the existing vocabularies, or by adding some classes and properties to DCAT, or some other method.

@dr-shorthair
Contributor Author

dr-shorthair commented Mar 26, 2018

Possible motions for this week's DCAT meeting:

  1. Proposed: DCAT Catalogs may include data services
    and if this is agreed
  2. Proposed: A class for data services should be added to DCAT backbone, alongside dcat:Dataset and dcat:Catalog

@andrea-perego
Contributor

Thanks for drafting this proposal, @dr-shorthair .

I have a couple of comments:

  1. I'm not sure dcat:Catalog should be made a subclass of dctype:Service. Its definition in DCAT 1.0 says "A data catalog is a curated collection of metadata about datasets." So, although it may correspond to a catalogue service (as done in GeoDCAT-AP), it can simply be a "static" collection, a mechanism for grouping datasets based on some criteria. I'm aware of examples of this use - e.g., the list of datasets produced by a project/activity.
  2. The proposal seems to imply that a dcat:Catalog can include records of "data services" only. I agree this is the case for ISO 19115, but I think we should relax such range constraint to allow any type of services to support other use cases. E.g., in DataCite, a service is not necessarily a data service. Moreover, we have examples in Europe of catalogues of online/offline public services (whose metadata follow the Core Public Service vocabulary).
  3. It is not clear to me if the proposal supports the possibility for a dcat:Catalog to include metadata records of other dcat:Catalog's - I mean, not as sub-catalogues. This is supported in ISO 19115.

@dr-shorthair
Contributor Author

  1. OK - remove sub-class axiom from dcat:Catalog

  2. That limitation was not the intention. I was just taking an incremental approach: right now we know about Data Services, so we add them; later we may know about other things, so we can add them then.

A challenge in modeling is whether to be parsimonious - only model the things we know about now - or to attempt to provide a generic home for things we haven't yet encountered, but have a hunch about. These days I tend to err towards the former: take good care of what we do know about, and leave the unknowns for the future. I lean towards having well-named predicates, so I went for dcat-s:dataService to link to a dcat-s:DataService, by analogy with dcat:dataset that links to dcat:Dataset. It might be generalized a little, but then we end up at dcat:WebService ... which is already deprecated! (Can you un-deprecate something??)

  3. OK - but again, I think I would add a specific (recursive) property for that. Specialized properties make for more compact queries and paths.
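
To illustrate the point about specialized properties, here is a minimal sketch in plain Python, using tuples instead of an RDF library. The `ex:` identifiers are hypothetical, and the use of `dct:hasPart` as the generic catch-all link is an assumption made only for comparison; `dcat-s:dataService` and `dcat-s:DataService` are the names proposed in this comment.

```python
# Toy triple store: plain (subject, predicate, object) tuples, no RDF library.
RDF_TYPE = "rdf:type"

triples = {
    ("ex:catalog", RDF_TYPE, "dcat:Catalog"),
    ("ex:rainfall", RDF_TYPE, "dcat:Dataset"),
    ("ex:rainfall-api", RDF_TYPE, "dcat-s:DataService"),
    # Specialized links: the predicate name says what kind of member it is.
    ("ex:catalog", "dcat:dataset", "ex:rainfall"),
    ("ex:catalog", "dcat-s:dataService", "ex:rainfall-api"),
    # Generic alternative (assumption): one catch-all property for all members.
    ("ex:catalog", "dct:hasPart", "ex:rainfall"),
    ("ex:catalog", "dct:hasPart", "ex:rainfall-api"),
}


def objects(subject, predicate):
    """Return all objects of (subject, predicate, ?o), sorted for stable output."""
    return sorted(o for s, p, o in triples if s == subject and p == predicate)


def services_via_specialized(catalog):
    # One step: the predicate alone identifies the services.
    return objects(catalog, "dcat-s:dataService")


def services_via_generic(catalog):
    # Two steps: follow the generic link, then filter by declared type.
    return [
        member
        for member in objects(catalog, "dct:hasPart")
        if (member, RDF_TYPE, "dcat-s:DataService") in triples
    ]


print(services_via_specialized("ex:catalog"))
print(services_via_generic("ex:catalog"))
```

Both functions return the same members; the specialized property just gets there in one pattern, whereas the generic one needs an extra type check on every member.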

@dr-shorthair
Contributor Author

@agbeltran
Member

Given @dr-shorthair's representation of the chosen solution (as discussed in last week's call) and available at https://github.com/w3c/dxwg/wiki/Cataloguing-data-services#chosen-solution, I suggest we rename 'Service' to 'DataService' to make it more specific.

@agbeltran
Member

Following the discussion on the call today, I will revisit the reasons behind calling it 'Service' rather than 'DataService'. I also raised that it might be too complex to have a typology of services, as it might not be complete. Rather, we could use a generic service class and characterise it with specific attributes.

@dr-shorthair
Contributor Author

  1. I proposed just 'Service' to allow for cataloging of other kinds of services (authentication?, entertainment?). We also might have just used dctype:Service but I feel we should have DCAT classes in our backbone;

  2. I also saw the possibility of having no specializations, but DataDistributionService seems like a central concern, and since we also want some links to Datasets and Distributions, it is much easier to axiomatize these with a named class.

@kcoyle
Contributor

kcoyle commented May 23, 2018

What happens if one has one or more datasets and related services, but does not define a DCAT catalog as such? As I recall, one can use DCAT to define datasets on their own, not within catalogs. Also, when I look at sites with services (e.g. https://www.ny.gov/services/health) most of the services are not related to data sets. Do you expect that these services could be included in DCAT catalogs?

@makxdekkers
Contributor

@kcoyle The relationship between Catalogs and Datasets is already covered in issue #62. As far as I can see, there is nothing in the specification of DCAT that would stop someone from creating and describing a Dataset without having a Catalog.
I see no problems allowing a Catalog to include Services that are not linked to Datasets.

@kcoyle
Contributor

kcoyle commented May 23, 2018

@makxdekkers In that case, DCAT expands to service sites that are unrelated to datasets - which may indeed be fine, but could be confusing because of the term "Data" in the name. However, a definition at the beginning of the document could expand the use of "Data" to include information services, and we could emphasize that aspect in other documentation, such as primers, etc.

@dr-shorthair
Contributor Author

As with any other RDF vocabulary it is not in our power to control how people use it.

DCAT is designed to be primarily a model for catalogs of datasets, and now also data services, and that is what our documentation will describe. But individual classes and predicates in the DCAT namespace might find good use in other applications, and I don't see how this could be wrong if it suits a purpose.

@davebrowning
Contributor

Since we touched on this point (re: quite what the scope of the catalogue is) in our discussions on the DCAT call, I wanted to suggest that we need to be clear about what we are recommending when we publish the new version of the DCAT vocab, while at the same time recognising that there are other situations where our approach to catalogues might be influential.

We don't want (or have the time/effort) to get tangled up in domains which have introduced catalogues for their own purposes even if we suspect that there is likely a common pattern...

@andrea-perego
Contributor

I support the idea of not limiting dcat:Catalog's to data-related resources. On the other hand, I recognise the risk of leaving the door too wide open. As @dr-shorthair said, we cannot control how people will use a vocabulary, but at least we can provide guidance on how it should be used.

Thinking about what such guidance could be, one option is to refer to existing catalogue standards / communities (CSW, OAI-PMH, DataCite), and the different types of resources they support. All these communities are potential users of DCAT (actually, at least in the geo domain, they are already DCAT users), which is one of the motivations for expanding the scope of DCAT. So, we may say something like: if your resources are of one of the types used in such communities, they are in the scope of DCAT. Otherwise, that may not be the case - and here we can give examples of resources that shouldn't be part of a dcat:Catalog (if any).

@dr-shorthair
Contributor Author

> at least we can provide guidance on how it should be used.

perhaps "at least we can provide guidance on how it can be used for what it was designed for."

@dr-shorthair
Contributor Author

I think this can be marked Resolved thanks to #241

7 participants