Validate IA identifier on Edition edit #109

Closed
george08 opened this issue Dec 20, 2011 · 11 comments
Labels
Good First Issue Easy issue. Good for newcomers. [managed] metadata Theme: Identifiers Issues related to ISBN's or other identifiers in metadata. [managed]

Comments

@george08

It's important that only real IA identifiers are entered on the Edit Edition form.

  • Need to add AJAX validation on that field - make sure it resolves to an IA item
  • Need to handle the case where the IA ID isn't valid
  • Parse and trim identifiers if people enter full archive.org URLs

We should also accept full URLs as identifiers in that form, I think. For other types of IDs.
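For illustration, the "parse and trim" request above could be sketched roughly as follows. This is only a sketch under stated assumptions: the function name `normalize_ocaid`, the accepted path prefixes (`details`, `stream`, `download`), and the identifier character set are guesses, not taken from the Open Library code base. It does not check that the item actually exists on archive.org; that would be a separate lookup.

```python
import re
from urllib.parse import urlparse

# Assumption: IA identifiers consist of letters, digits, '.', '_' and '-'.
_IA_ID_RE = re.compile(r"[A-Za-z0-9][A-Za-z0-9._-]*")
# Assumption: the identifier is the path segment right after one of these prefixes.
_IA_PATH_PREFIXES = ("details", "stream", "download")
_IA_HOSTS = ("archive.org", "www.archive.org")


def normalize_ocaid(raw):
    """Return a bare identifier from user input, or None if it does not look like one.

    Accepts either a bare identifier or a full archive.org URL (hypothetical helper).
    """
    value = raw.strip()
    looks_like_url = "://" in value or value.lower().startswith(
        tuple(host + "/" for host in _IA_HOSTS)
    )
    if looks_like_url:
        # Add a scheme when missing so urlparse fills in the hostname.
        parsed = urlparse(value if "://" in value else "https://" + value)
        if (parsed.hostname or "").lower() not in _IA_HOSTS:
            return None
        parts = [p for p in parsed.path.split("/") if p]
        if len(parts) < 2 or parts[0] not in _IA_PATH_PREFIXES:
            return None
        value = parts[1]
    return value if _IA_ID_RE.fullmatch(value) else None
```

Returning `None` for anything unrecognized gives the form a single place to show an error instead of failing silently.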

@rajbot
Contributor

rajbot commented Dec 20, 2011

We need to go farther than just validating IDs. Changing an archive.org id, even to another valid one, can break lending and other things.

We need to make archive.org IDs editable only by admins. Also, the Lending Library, In Library, and Protected Daisy subjects need to be protected in the same way.

@tfmorris
Copy link
Contributor

This is still broken. Here's a recent example where someone added a full URL instead of just the ID: http://openlibrary.org/books/OL25424185M/The_Lunar_Basis_of_Myth_and_Symbol/edit

Apparently the IDs can be added (without validation), but not deleted or edited. Attempting to add a second IA ID fails silently with no error message.

@anandology
Collaborator

We've disabled editing of the Internet Archive ID to avoid its ill effects. I'll work on validating it.

@LeadSongDog

This problem still persists after five years. The IA identifier is not validated before an edition record is first created, nor is it checked for uniqueness or for having an extant IA target.

Example:
searching for an IAid may find it in two places, whereas it should be unique:
https://openlibrary.org/search?q=corollasanctiead00hervuoft finds it at both: https://openlibrary.org/works/OL11080612W/Corolla_Sancti_Eadmundi and at https://openlibrary.org/works/OL16742827W/Corolla_Sancti_Eadmundi by dint of edition records https://openlibrary.org/books/OL26221263M/Corolla_Sancti_Eadmundi and https://openlibrary.org/books/OL7041407M/Corolla_Sancti_Eadmundi respectively.

@mekarpeles
Member

mekarpeles commented Feb 7, 2017

thank you for bumping this (cc: @bfalling).

@LeadSongDog, we're going to start doing community calls bi-weekly (we're still converging on a date/time -- those interested can vote here: http://doodle.com/poll/hrqphgaw2k9zhfbsyeuis638/admin#table).

We'd love you to join us if this aligns with your style, and alternatively, if you prefer, we invite you to submit an update prior to our meeting so we can represent the issues and features which are most important to you.

As per this issue, I'm hearing what you're saying -- I think it's worth discussing strategies for addressing this, whether it is through something like an edition merge (which is an available feature for admins) and whether we write some script to identify duplicates like the one you uncovered, or whether we build a check into the EDIT feature for an edition (i.e. we check against the API prior to approving the archive id).

I am going to defer to @bfalling for the prioritization. Right now we're working to fix our bookserver/opds service (as it stopped working when we decommissioned SOLR in favor of ES).

@LeadSongDog, thank you for your continued passion and contributions!

@mekarpeles mekarpeles added easy Priority: 1 Do this week, receiving emails, time sensitive, . [managed] labels Mar 23, 2017
@sbshah97
Contributor

Is this Issue done @mekarpeles ?

@LeadSongDog

LeadSongDog commented Feb 20, 2018

Still no one assigned, and still broken.
Some measures of success:

  1. Each and every OCAID seen in OL should be assigned to exactly one edition OLID.
  2. Each and every OCAID seen in OL should have a valid target available in IA.
  3. Each and every IA catalog record at these OCAIDs should have a back link to that edition OLID.
  4. The above can be demonstrated by an automated query.
  5. Items needing manual intervention can be automatically listed by 4 above.
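As a rough illustration of points 1, 4, and 5, an automated uniqueness check could look something like the sketch below. It assumes the edition records are available as `(edition OLID, OCAID)` pairs, for example from a data dump; the function name `find_duplicate_ocaids` and the third sample row are hypothetical. The first two rows are the duplicate pair reported earlier in this thread.

```python
from collections import defaultdict


def find_duplicate_ocaids(editions):
    """Map each OCAID used by more than one edition to its sorted edition OLIDs."""
    olids_by_ocaid = defaultdict(set)
    for olid, ocaid in editions:
        if ocaid:  # skip editions without an OCAID
            olids_by_ocaid[ocaid].add(olid)
    return {
        ocaid: sorted(olids)
        for ocaid, olids in olids_by_ocaid.items()
        if len(olids) > 1
    }


editions = [
    ("OL26221263M", "corollasanctiead00hervuoft"),
    ("OL7041407M", "corollasanctiead00hervuoft"),
    ("OL0000001M", "hypothetical_item_id"),  # made-up row for contrast
]

for ocaid, olids in find_duplicate_ocaids(editions).items():
    print(ocaid, "is claimed by", ", ".join(olids))
```

The returned mapping doubles as the list of items needing manual intervention. Points 2 and 3 (valid IA target, back link from IA) would need lookups against archive.org and are not covered here.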

@tfmorris
Contributor

@bfalling was tagged for prioritization a year and a bit ago. Through what channel does he communicate IA's priority (at least I assume IA is what he's in charge of communicating priority for)? Will he do that through the Tuesday development team meetings or some other channel?

@mekarpeles mekarpeles added metadata sync-ia-ol Theme: Identifiers Issues related to ISBN's or other identifiers in metadata. [managed] and removed Priority: 1 Do this week, receiving emails, time sensitive, . [managed] labels Mar 12, 2018
@mekarpeles
Member

mekarpeles commented Mar 12, 2018

TL;DR -- issue still open.

@tfmorris, brenton's invited to join us on Tuesday so we can all make cases for the issues/features we feel are important. He and I will likely have a followup mtg later this week (Friday) w/ internal stakeholders to see if there are separate needs which require addressing. Next Tuesday (March 20) I'll hopefully have a Q2 github project board organized and we'll entertain a final round of community feedback so we can make sure everyone's feedback has been incorporated.

As per this specific issue, we do still have ocaids which get out of sync or are invalid, and writing a checker for them wouldn't be too hard. The question is at what points the check should be called. Edit? Creation? We can also (at create time) add a check to make sure the ocaid is not already accounted for within OL.

The case of ocaids becoming stale (at least for now) I think should be addressed by editing the entry. We could hypothetically add this to our solr updater process or have some cron which checks periodically, but for the meantime, I think preventing duplicate add books is the more critical and the simpler of the two cases.
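To make the create/edit question concrete, here is one possible shape for such a check, written so the same function can be called from either place. It is a sketch, not existing Open Library code: `check_ocaid`, `item_exists`, and `find_editions` are hypothetical names, and the two lookups are passed in as callables so the sketch stays independent of how archive.org or the OL index is actually queried.

```python
def check_ocaid(ocaid, item_exists, find_editions, current_olid=None):
    """Return a list of problems with this ocaid; an empty list means it is acceptable.

    item_exists(ocaid) -> bool          (hypothetical lookup against archive.org)
    find_editions(ocaid) -> iterable    (hypothetical lookup of edition OLIDs in OL)
    current_olid: the edition being edited, or None when creating a new one.
    """
    problems = []
    if not item_exists(ocaid):
        problems.append(f"'{ocaid}' does not resolve to an IA item")
    # Ignore the edition being edited so re-saving it is not flagged as a duplicate.
    others = sorted(olid for olid in find_editions(ocaid) if olid != current_olid)
    if others:
        problems.append(f"'{ocaid}' is already used by {', '.join(others)}")
    return problems


# In-memory stand-ins for the two lookups, for demonstration only.
known_items = {"corollasanctiead00hervuoft"}
index = {"corollasanctiead00hervuoft": ["OL7041407M"]}

print(check_ocaid(
    "corollasanctiead00hervuoft",
    item_exists=lambda i: i in known_items,
    find_editions=lambda i: index.get(i, []),
))
```

At create time `current_olid` is left as `None`, so any existing edition with the same ocaid is reported; at edit time passing the edition's own OLID avoids a false duplicate.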

@mekarpeles mekarpeles self-assigned this Mar 12, 2018
@mekarpeles mekarpeles added the Good First Issue Easy issue. Good for newcomers. [managed] label Mar 12, 2018
@mekarpeles
Member

It's unclear what resolution means in this context. Let's please create another issue to address specific cases, e.g. validate ocaid on edit or create.

@tfmorris
Contributor

George had three very specific requests for the Edit Edition form. I don't know what she thinks, but I'd consider this resolved when those were done. They all sound like good ideas to me. If people want to 1+ things, sure, they should create new issues.

The more general issue of "Accept a full URL anyplace an identifier can be entered" deserves a separate issue for the cases other than OCAIDs on the Edit Edition page. I've created #866 for this.


7 participants