In a physical AIP, were representations ever considered as being the root? #77

Closed
kieranjol opened this issue Jan 11, 2023 · 10 comments

@kieranjol

Hi,

I think the E-ARK specs are really wonderful. One thing I'm trying to get my head around is that the root of the E-ARK AIP is the Intellectual Entity. I see why this makes sense from a logical standpoint, and even from a physical one. However, I think there is a potentially valid reason for having a representation as the root.
For example:

  • An archive receives a DCP of a film work. This is representation 1. A year later, they receive another representation in the form of a DCDM. This would require a new version of the AIP to accommodate both representations, and the updating of metadata for the package as a whole.
  • If the representations could be stored separately and be their own root, contain their own descriptive metadata, they could exist independently of each other and there would be less intervention required to add new representations as there would be less metadata to update.
  • The cons here would be a duplication of metadata if descriptive metadata is stored in each representation, and if descriptive metadata is updated upstream, it would then have to be updated in two representation packages. Also, the first representation would still probably need to be updated to reflect that a sibling representation exists.

So I suppose my question is: was it ever considered that a representation could be a root, and were there reasons other than the ones I've listed that might have accounted for the physical AIP always having an IE at its root? The more I think about it, having the physical AIP reflect the logical AIP makes a lot of sense, and it maps to the PREMIS object types quite well. My main concern is that some archives may rely on basic tape storage, which doesn't allow packages to be updated easily, so E-ARK might be out of reach as a result.

Best,

Kieran O'Leary
National Library of Ireland

@kieranjol kieranjol changed the title In a physical AIP, can representations exist independently of each other? In a physical AIP, were representations ever considered as being the root? Jan 11, 2023
@jmaferreira
Contributor

This is somewhat related to the SIP/AIP UPDATE procedure described at #76

@kieranjol
Author

I agree! Thanks for linking that, I hadn't seen it.
The more I looked at fig 4 here: https://earkaip.dilcis.eu/#fig4
[Image: fig_4_mets_root]

It sort of reflects my use case, but I see here that the descriptive metadata is most likely duplicated across both representations, and would need to be mutually updated if required.

My question still holds as to whether representations were ever seen as a potential root for an E-ARK AIP, as I think it could lower the barrier to entry for adopting the spec. I will say that I increasingly think that having the IE as the root still makes the most sense, assuming one has the infrastructure to support this.

@jmaferreira
Contributor

Hi,

Your statement is only partially true. The descriptive metadata is expected to be placed at the root level (under the metadata/descriptive folder). This way, it doesn't matter whether you have 1 or 100 representations: their intellectual description will always be the same.

The use of descriptive metadata at the representation level is supposed to be exceptional. It is there to be able to handle exceptions such as a particular representation including some additional descriptive elements that may aid in the finding of that object.
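
As an illustrative sketch only, the layout described in this comment could be modelled in Python as below. The `metadata/descriptive` folder comes from the comment itself; the function names `build_aip_skeleton` and `descriptive_folders`, the representation identifiers `rep1` and `rep2`, and the per-representation `data` folders are assumptions made for the example, not text from the specification.

```python
import tempfile
from pathlib import Path


def build_aip_skeleton(root: Path, representation_ids: list[str]) -> Path:
    """Create an empty folder skeleton (hypothetical helper, not part of any E-ARK tool)."""
    # Descriptive metadata for the intellectual entity lives once, at the root.
    (root / "metadata" / "descriptive").mkdir(parents=True, exist_ok=True)
    for rep_id in representation_ids:
        # In this sketch each representation only gets its own data folder.
        (root / "representations" / rep_id / "data").mkdir(parents=True, exist_ok=True)
    return root


def descriptive_folders(root: Path) -> list[str]:
    """Return every folder named 'descriptive' below root, as relative POSIX paths."""
    return sorted(
        p.relative_to(root).as_posix()
        for p in root.rglob("descriptive")
        if p.is_dir()
    )


with tempfile.TemporaryDirectory() as tmp:
    aip = build_aip_skeleton(Path(tmp) / "aip", ["rep1", "rep2"])
    found = descriptive_folders(aip)
    print(found)
```

However many representation identifiers are passed in, the skeleton contains a single descriptive metadata folder, which is the point being made here.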

@kieranjol
Author

This makes much more sense now, thank you! I'm happy to have this closed anyhow.

@kieranjol
Author

kieranjol commented Jan 11, 2023

Actually, is your clarifying statement explicitly in the spec, or is it implicit?
I can't find it here anyhow but I may be looking in the wrong place: https://earkaip.dilcis.eu/#compoundvs.dividedpackagestructure

Is it worth adding a clarifying statement to the spec?
Edit: I couldn't find a reference in the Common Spec either.

@jmaferreira
Copy link
Contributor

@shsdev @karinbredenberg Could you please check if the spec makes it clear to everyone how descriptive metadata at the representation level is expected to be used?

@shsdev
Collaborator

shsdev commented Jan 13, 2023

Dear Kieran,

Indeed, it is not specified what type of metadata you put at the root or representation level. The possibility is there to allow storing metadata specific to the representation. We haven't specified what that means either.

However, generally, the descriptive metadata in the root relates to the intellectual entity. I think it is wise to keep this metadata always up to date independent of other descriptive metadata at the representation level. The reason is that the descriptive metadata in the root is always the first thing to look into.

How metadata is maintained in the root depends on how the AIP is managed as a package, i.e., whether it is possible to update the root metadata. One option is to back up the descriptive metadata each time it is edited, or to use some other kind of versioning. Anyhow, as said, it should always be kept up to date.

The structure of the AIP is meant to receive updates in the form of new representations. However, the root METS has a pointer to each representation METS, so the root METS has to be updated anyhow. Further, the PREMIS events should document why the new representation was created (you said that you may receive a new representation, but this could also be a decision for preservation purposes). The PREMIS event metadata should also document which representation was used to derive the new representation. So there are a few pieces of structural and preservation metadata which need to be changed anyhow if a new representation is added.
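
To make this bookkeeping concrete, here is a rough sketch that uses plain Python dictionaries instead of real METS or PREMIS XML. Everything in it is an assumption for illustration: the `add_representation` function, the dictionary keys, the pointer path pattern, and the event fields are hypothetical and are not taken from the specification. The DCP and DCDM identifiers come from the example earlier in the thread.

```python
from copy import deepcopy
from typing import Optional


def add_representation(aip: dict, rep_id: str, reason: str,
                       derived_from: Optional[str] = None) -> dict:
    """Return an updated copy of a toy AIP record with one more representation."""
    if rep_id in aip["representations"]:
        raise ValueError(f"representation {rep_id!r} already exists")
    if derived_from is not None and derived_from not in aip["representations"]:
        raise ValueError(f"unknown source representation {derived_from!r}")

    updated = deepcopy(aip)
    # 1. The new representation itself.
    updated["representations"].append(rep_id)
    # 2. Structural metadata: the root METS gains a pointer to the new representation METS.
    updated["root_mets_pointers"].append(f"representations/{rep_id}/METS.xml")
    # 3. Preservation metadata: an event records why the representation exists
    #    and, if applicable, which representation it was derived from.
    updated["events"].append({
        "type": "representation added",  # hypothetical label, not a PREMIS vocabulary term
        "target": rep_id,
        "reason": reason,
        "source": derived_from,
    })
    # Root descriptive metadata is deliberately left untouched.
    return updated


aip_v1 = {
    "descriptive_metadata": {"title": "Example film work"},
    "representations": ["dcp"],
    "root_mets_pointers": ["representations/dcp/METS.xml"],
    "events": [],
}
aip_v2 = add_representation(aip_v1, "dcdm", reason="second representation received")
print(aip_v2["root_mets_pointers"])
```

In this toy model, adding the DCDM touches the pointer list and the event list but leaves the root descriptive metadata as it was. For a representation derived for preservation purposes, `derived_from` would name the source representation.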

Coming back to the question of what type of metadata should be stored at the representation level: I think it should not be descriptive metadata which relates to the intellectual entity as a whole, because this metadata would intuitively be expected at the root level. The folder is rather for technical and preservation metadata which relates to the representation.

Best wishes,

Sven

@kieranjol
Author

Apologies for the delay, this is super clear.
I still think that more clarification is required in the specification regarding figure 4. Perhaps removing 'descriptive' from the metadata hierarchy in representations could be warranted, or a clarifying remark about how the key descriptive metadata needs to be at the root IE level, and any additional descriptive metadata for the representation should only go in the representation descriptive metadata folder.
I'm trying to think of examples where extra descriptive metadata would be required at the representation level, as you've covered how so much contextual metadata around the creation/acquisition of the new representation could be covered by technical metadata or PREMIS events.
If a representation requires extra descriptive metadata, is it even the same work at that point, and to use PREMIS terms, would it be deviating from the 1:1 principle where this representation is actually part of a different Intellectual Entity?

@shsdev
Collaborator

shsdev commented Jan 20, 2023

You are right, I think it would be good to clarify this in the specification and also maybe remove the descriptive metadata at least from the example figure. I am not sure descriptive metadata would be needed at the representation level at all. And yes, I think it is important to be careful about changes which actually lead to a different intellectual entity. Representations are only for changes regarding the form of how the data is persisted and presented, such as storing a file in a different format. Changing metadata does not change the content files, so it would not be an issue in that regard. However, if the metadata is used for interpretation in a scientific context, this might lead to confusion and misunderstandings.

shsdev added a commit to shsdev/E-ARK-AIP that referenced this issue Jan 20, 2023
Updated figure (Removed descriptive metadata folder from representation level), see DILCISBoard#77
shsdev added a commit to shsdev/E-ARK-AIP that referenced this issue Jan 20, 2023
Remove metadata folder from representation level, see DILCISBoard#77
shsdev added a commit to shsdev/E-ARK-AIP that referenced this issue Jan 20, 2023
There should be no descriptive metadata folder on the representation level, see DILCISBoard#77
@shsdev
Collaborator

shsdev commented Jan 31, 2023

Issue is addressed in pull request https://github.com/DILCISBoard/E-ARK-AIP/pulls, changes will go into AIP specification release v2.1.1.

@shsdev shsdev closed this as completed Jan 31, 2023