Create default metadata profile #896

Closed
dannylamb opened this issue Aug 20, 2018 · 16 comments

@dannylamb
Contributor

See https://docs.google.com/spreadsheets/d/18u2qFJ014IIxlVpM3JXfDEFccwBZcoFsjbBGpvL0jJI/edit#gid=0

The 'Repository Item' content type is pretty bare bones. Let's add to it using the spreadsheet above as guidance. This means adding fields for what's listed (either as strings, entities, or taxonomy term references), and map to RDF as best you can (I'm sure there'll be discussions, we can talk it out using this issue).

After configuring the metadata profile for the 'Repository Item' content type, export its field and field storage definitions into the islandora_demo_feature feature. You'll also want to re-export the RDF mapping with any changes you've made.

@rosiel
Member

rosiel commented Aug 27, 2018

We're using the sandbox described in https://groups.google.com/forum/#!topic/islandora/Me8J0aXhjjw to build something up. I had created a bunch of vocabularies, then exported the config, before a vagrant destroy.

For three vocabularies, I imported the following files:

core.entity_form_display.taxonomy_term.dcmi_type_vocabulary.default.yml
core.entity_form_display.taxonomy_term.iana_mime_types.default.yml
core.entity_form_display.taxonomy_term.lcsh_library_of_congress_subject.default.yml
core.entity_form_display.taxonomy_term.marc_resource_types_scheme.default.yml
core.entity_view_display.taxonomy_term.dcmi_type_vocabulary.default.yml
core.entity_view_display.taxonomy_term.iana_mime_types.default.yml
core.entity_view_display.taxonomy_term.lcsh_library_of_congress_subject.default.yml
core.entity_view_display.taxonomy_term.marc_resource_types_scheme.default.yml
field.field.taxonomy_term.dcmi_type_vocabulary.field_comment.yml
field.field.taxonomy_term.dcmi_type_vocabulary.field_externally_managed_uri.yml
field.field.taxonomy_term.dcmi_type_vocabulary.field_vocabulary_encoding_scheme.yml
field.field.taxonomy_term.iana_mime_types.field_vocabulary_encoding_scheme.yml
field.field.taxonomy_term.lcsh_library_of_congress_subject.field_externally_managed_uri.yml
field.field.taxonomy_term.lcsh_library_of_congress_subject.field_vocabulary_encoding_scheme.yml
field.field.taxonomy_term.marc_resource_types_scheme.field_externally_managed_uri.yml
field.storage.taxonomy_term.field_comment.yml
field.storage.taxonomy_term.field_external_uri.yml
field.storage.taxonomy_term.field_externally_managed_uri.yml
field.storage.taxonomy_term.field_vocabulary_encoding_scheme.yml
taxonomy.vocabulary.dcmi_type_vocabulary.yml
taxonomy.vocabulary.iana_mime_types.yml
taxonomy.vocabulary.lcsh_library_of_congress_subject.yml
taxonomy.vocabulary.marc_resource_types_scheme.yml
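
To see at a glance which vocabulary each exported file belongs to, the file names above can be grouped by their dotted naming pattern. This is only a rough sketch: the helper names `vocabulary_of` and `group_by_vocabulary` are made up for illustration, and the patterns are inferred from the file names listed here rather than from any Drupal API.

```python
from collections import defaultdict


def vocabulary_of(config_file):
    """Return the vocabulary machine name a config file refers to.

    Returns None for files that are not tied to one vocabulary,
    such as field storage definitions shared across vocabularies.
    """
    name = config_file[:-4] if config_file.endswith(".yml") else config_file
    parts = name.split(".")
    # taxonomy.vocabulary.<vocabulary>
    if len(parts) == 3 and parts[:2] == ["taxonomy", "vocabulary"]:
        return parts[2]
    # field.field.taxonomy_term.<vocabulary>.<field_name>
    if len(parts) == 5 and parts[:3] == ["field", "field", "taxonomy_term"]:
        return parts[3]
    # core.entity_{form,view}_display.taxonomy_term.<vocabulary>.<mode>
    if (
        len(parts) == 5
        and parts[0] == "core"
        and parts[1] in ("entity_form_display", "entity_view_display")
        and parts[2] == "taxonomy_term"
    ):
        return parts[3]
    return None


def group_by_vocabulary(config_files):
    """Group config file names by vocabulary; shared files go under None."""
    groups = defaultdict(list)
    for config_file in config_files:
        groups[vocabulary_of(config_file)].append(config_file)
    return dict(groups)
```

Running `group_by_vocabulary` over the full list would put the `field.storage.*` files under `None`, since a field storage is shared by every vocabulary that uses the field.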

And that doesn't even attach them to the repository object... or fill the vocabs with values.

I have some csv's of values to go in to these, scraping or cherry-picking from existing vocabularies. Not sure what to do with them...

@seth-shaw-unlv
Contributor

@rosiel

  1. do you have a copy of those config files you can share? Or is there a PR forthcoming?

  2. It looks like you have subjects and types all implemented as taxonomy terms. The existing controlled_access_terms implements subjects as normal content types. Should we update it to use the fields you describe here?

@rtilla1

rtilla1 commented Aug 27, 2018

I believe I have added the small number of field names in Drupal and mapped them to the correct Type in that sandbox. @rosiel had already added the few that are controlled/reconciled, and I didn't touch those. If we are able to give you a list of LCSH (with or without their URIs), or some other list that would need to be included in one of the taxonomies above, can someone spend some time tomorrow morning showing us how to import a CSV into those? Also, how do we import a CSV of metadata objects into CLAW, and what format should the CSV be in?

@seth-shaw-unlv
Contributor

Oh! I didn't see the sandbox there. Okay.

So, to import a CSV of taxonomy terms we need to define a migration config. We can talk about it during tomorrow's "XML2CSV to Islandora CLAW" call. If you want to send me a CSV I can prep for it.

@rosiel
Member

rosiel commented Aug 28, 2018

Thanks @rtilla1! And @seth-shaw-unlv, I didn't mean for this to override your modelling; I just wanted to explore the "Vocabulary Encoding Schemes" in the DCMI document [1]. It seems to be how we're supposed to model mimetypes: as classes, not strings. I based it off of the third image in this document [2], which has a custom-minted "value URI" for a string "Biology"@en which comes from a (custom) vocabulary. In our case, the dc:format's value URI will be the custom-minted entity-thing representing a mimetype (e.g. application/pdf), but that entity-thing will have an rdf:value "application/pdf" and a dcam:memberOf http://purl.org/dc/terms/IMT.
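
To make that shape concrete, here is a minimal sketch that builds the three statements and prints them as N-Triples. The function names and the `example.org` URIs are hypothetical stand-ins for whatever the site would actually mint, and reading dc:format as `http://purl.org/dc/terms/format` is an assumption; only the `rdf:value` and `dcam:memberOf` targets come from the description above.

```python
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DCTERMS = "http://purl.org/dc/terms/"
DCAM = "http://purl.org/dc/dcam/"


def mime_type_triples(item_uri, term_uri, mime_type):
    """Model a mimetype as a minted entity rather than a plain string."""
    return [
        # The repository item points at the minted mimetype entity
        # (assumption: dc:format is read as dcterms:format).
        (item_uri, DCTERMS + "format", ("uri", term_uri)),
        # The entity carries the literal mimetype string.
        (term_uri, RDF + "value", ("literal", mime_type)),
        # The entity is a member of the IMT vocabulary encoding scheme.
        (term_uri, DCAM + "memberOf", ("uri", DCTERMS + "IMT")),
    ]


def to_ntriples(triples):
    """Serialize (subject, predicate, (kind, value)) tuples as N-Triples."""
    lines = []
    for subject, predicate, (kind, value) in triples:
        if kind == "uri":
            obj = f"<{value}>"
        else:
            escaped = value.replace("\\", "\\\\").replace('"', '\\"')
            obj = f'"{escaped}"'
        lines.append(f"<{subject}> <{predicate}> {obj} .")
    return "\n".join(lines)


if __name__ == "__main__":
    # Hypothetical URIs for a repository item and a minted mimetype term.
    print(to_ntriples(mime_type_triples(
        "http://example.org/node/1",
        "http://example.org/taxonomy/term/7",
        "application/pdf",
    )))
```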

Anyway, I saw that the other values of "Vocabulary Encoding Schemes" included DCMI Type, so I thought I'd map that out to see if we like it... The "MARC Resource Types Scheme" vocabulary seems like a better fit with less data loss, so we might use it instead. But unlike mimetypes, the MARC Resource Types Scheme has URIs for its values, so we might be able to model those more straightforwardly.

The thing with mimetypes is that there are wayyyyyy too many for a dropdown list, so I wrote in the description of the vocab "This is a subset add if needed", which parallels what we're doing with LCSH/subjects. Hence I made a trial vocab for LCSH subjects. Do we want to model parent/child relations of subject terms? It's built into the taxonomy hierarchy in Drupal. The problem is: LCSH terms often have multiple parents, so the analogy breaks down.
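
The multiple-parents concern can be illustrated with a small sketch that stores broader-term links as a mapping from each term to a list of parents. The term labels below are placeholders, not real LCSH headings, and the function names are invented for this example.

```python
def terms_with_multiple_parents(broader):
    """Return the terms that list more than one distinct broader term."""
    return sorted(
        term for term, parents in broader.items() if len(set(parents)) > 1
    )


def ancestors(term, broader):
    """Collect every broader term reachable from `term`, following all parents."""
    seen = set()
    stack = list(broader.get(term, []))
    while stack:
        parent = stack.pop()
        if parent not in seen:
            seen.add(parent)
            stack.extend(broader.get(parent, []))
    return seen


if __name__ == "__main__":
    # Placeholder terms: "Term C" sits under both "Term A" and "Term B".
    broader = {
        "Term A": [],
        "Term B": ["Term A"],
        "Term C": ["Term A", "Term B"],
    }
    print(terms_with_multiple_parents(broader))
    print(sorted(ancestors("Term C", broader)))
```

Any term returned by `terms_with_multiple_parents` is one that a strict single-parent tree could not represent without dropping a relationship.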

[1] http://dublincore.org/documents/dcmi-terms/#section-4
[2] http://dublincore.org/documents/dc-rdf/

@rtilla1

rtilla1 commented Aug 28, 2018

I believe I have added the small number of field names in Drupal and mapped them to the correct Type in that sandbox. @rosiel had already added the few that are controlled/reconciled, and I didn't touch those. If we are able to give you a list of LCSH (with or without their URIs), or some other list that would need to be included in one of the taxonomies above, can someone spend some time tomorrow morning showing us how to import a CSV into those? Also, how do we import a CSV of metadata objects into CLAW, and what format should the CSV be in?

@seth-shaw-unlv
Contributor

seth-shaw-unlv commented Aug 28, 2018

@rtilla1 It doesn't really matter what structure the CSV is, since you can update the migration to match (also, we haven't defined one for the new demo content model, so you tell us! 😁 ). For terms, all you really need is the LCSH term. You don't even really need a CSV.

For example, I have a proof-of-concept module that defines a new content model and configures a migration based on some MADS RDFXML records and a CSV of object records. One of those migrations looks for all the Topic records and creates "subject" nodes. The image metadata objects are migrated from a CSV. The columns of the CSV are defined as part of the source section and their mappings as part of the process section. This example uses the term value to look up the subject node, but it could be changed to use the URI fairly easily if that is what you have.
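
The source/process split described above can be sketched conceptually in a few lines. This is not the Drupal Migrate API or the proof-of-concept module's code; `migrate_rows`, the column names, and the field names are all hypothetical, and the lookup table stands in for the subject nodes created by an earlier migration.

```python
import csv
import io


def migrate_rows(csv_text, process, lookups):
    """Map CSV columns to destination fields, resolving lookups by value.

    process: destination field -> source column name (the "process" mapping).
    lookups: destination field -> {source value: existing entity ID}.
    Values without a match in a lookup table become None.
    """
    results = []
    # The CSV header row plays the role of the "source" column definitions.
    for row in csv.DictReader(io.StringIO(csv_text)):
        item = {}
        for destination, source_column in process.items():
            value = row[source_column]
            if destination in lookups:
                # Swap the term value for the ID of the matching subject.
                value = lookups[destination].get(value)
            item[destination] = value
        results.append(item)
    return results


if __name__ == "__main__":
    # Hypothetical CSV of image metadata objects.
    csv_text = "Title,Subject\nPhoto 1,Topic A\nPhoto 2,Topic Z\n"
    process = {"title": "Title", "field_subject": "Subject"}
    # Hypothetical subject nodes already created, keyed by term value.
    subject_ids = {"Topic A": 101}
    for item in migrate_rows(csv_text, process, {"field_subject": subject_ids}):
        print(item)
```

Keying `subject_ids` by URI instead of by term value would be the equivalent of switching the lookup to URIs.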

I may come back and plop some of the code bits directly into this comment later.

@seth-shaw-unlv
Contributor

@rosiel and @rtilla1, on the Sandbox we have two taxonomies for types (MARC types and DCMI types) with separate repository object fields for each. How would you feel about the following changes?

  1. Create a single type field that allows values from either/both taxonomies?
  2. Allow multiple types to be selected?
  3. Create a single taxonomy containing both types?

@seth-shaw-unlv
Contributor

Based on the migration sprint wrap-up call, it looks like Yes to the first two and "maybe" for the last one. We will have to try it out. Also, @whikloj suggested using a view for the auto-complete to support disambiguation. We'll see how it goes.

@seth-shaw-unlv
Contributor

In case anyone is interested, the current WIP branch I'm using is @ https://github.com/seth-shaw-unlv/islandora_demo/tree/issue-896.

I think it is almost ready for a PR, although it requires PR Islandora/controlled_access_terms#8.

@Natkeeran
Contributor

@seth-shaw-unlv Can you please push this as a PR? I can test it.

@seth-shaw-unlv
Contributor

@Natkeeran see https://github.com/Islandora-CLAW/islandora_demo/pull/2

@Natkeeran
Contributor

@dannylamb @seth-shaw-unlv

I think attaching fields directly to the Repository Item object as a default would work. The PR is close.

We do have to consider the larger Application Profile architecture at some point. Not sure if this is the ticket to discuss that.

In 7.x, we had customizable forms per content model. A user could select from multiple profiles as well.

In 8.x, how can we handle that? Would it be cloning a content type (Say Repository Item - MODS, Repository Item - DC)?

Or developing the metadata profile into its own bundle or content type, then using an inline entity form to insert it. Here, RDF mapping can get tricky.

@seth-shaw-unlv
Contributor

@Natkeeran "cloning a content type (Say Repository Item - MODS, Repository Item - DC)" is what we were planning to do here (although probably dropping the "Repository Item - " prefix).

@seth-shaw-unlv
Contributor

Any objections to closing this now that https://github.com/Islandora-CLAW/islandora_demo/pull/2 is merged?

@dannylamb
Contributor Author

@seth-shaw-unlv feel free to close this. If we have improvements to the default metadata profile to make, we'll do separate issues.
