-
Notifications
You must be signed in to change notification settings - Fork 124
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Allow CharacterizeJob to receive either an AF object or a Valkyrie resource as the FileSet #4100
Labels
Comments
This was referenced Oct 21, 2019
Closed
Closed
Allow RevokeEditToMembersJob to receive either an AF object or a Valkyrie resource as the Work
#4103
Closed
mjgiarlo
added a commit
that referenced
this issue
Jan 23, 2020
While looking into #4100, we noticed that `CharacterizeJob` not only saves a file set's `original_file`, but also reindexes the file set, *and* reindexes every collection to which the file set's work belongs. It appears that this feature was added way back in the Sufia days in order to support the use case of displaying on the collection show page the total size of all files uploaded to a collection's works. Why is this problematic? At large scale, this means touching a potentially huge portion of the repository—iterating over all files in all works in a collection—in order to display an arguably useful string to a subset of repository users who care about such information. Doing this in `CharacterizeJob` introduces significant potential delay in the ingest pipeline, increasing how long it takes for a deposit to finish and render in the Hyrax UI (e.g., by delaying the creation of derivatives). We propose removing this feature for now and encourage folks who need this feature to contribute it back with a better-performing, more streamlined design.
mjgiarlo
added a commit
that referenced
this issue
Jan 23, 2020
While looking into #4100, we noticed that `CharacterizeJob` not only saves a file set's `original_file`, but also reindexes the file set, *and* reindexes every collection to which the file set's work belongs. It appears that this feature was added way back in the Sufia days in order to support the use case of displaying on the collection show page the total size of all files uploaded to a collection's works. Why is this problematic? At large scale, this means touching a potentially huge portion of the repository—iterating over all files in all works in a collection—in order to display an arguably useful string to a subset of repository users who care about such information. Doing this in `CharacterizeJob` introduces significant potential delay in the ingest pipeline, increasing how long it takes for a deposit to finish and render in the Hyrax UI (e.g., by delaying the creation of derivatives). We propose removing this feature for now and encourage folks who need this feature to contribute it back with a better-performing, more streamlined design.
mjgiarlo
added a commit
that referenced
this issue
Jan 23, 2020
While looking into #4100, we noticed that `CharacterizeJob` not only saves a file set's `original_file`, but also reindexes the file set, *and* reindexes every collection to which the file set's work belongs. It appears that this feature was added way back in the Sufia days in order to support the use case of displaying on the collection show page the total size of all files uploaded to a collection's works. Why is this problematic? At large scale, this means touching a potentially huge portion of the repository—iterating over all files in all works in a collection—in order to display an arguably useful string to a subset of repository users who care about such information. Doing this in `CharacterizeJob` introduces significant potential delay in the ingest pipeline, increasing how long it takes for a deposit to finish and render in the Hyrax UI (e.g., by delaying the creation of derivatives). We propose removing this feature for now and encourage folks who need this feature to contribute it back with a better-performing, more streamlined design.
no-reply
pushed a commit
that referenced
this issue
Jan 24, 2020
While looking into #4100, we noticed that `CharacterizeJob` not only saves a file set's `original_file`, but also reindexes the file set, *and* reindexes every collection to which the file set's work belongs. It appears that this feature was added way back in the Sufia days in order to support the use case of displaying on the collection show page the total size of all files uploaded to a collection's works. Why is this problematic? At large scale, this means touching a potentially huge portion of the repository—iterating over all files in all works in a collection—in order to display an arguably useful string to a subset of repository users who care about such information. Doing this in `CharacterizeJob` introduces significant potential delay in the ingest pipeline, increasing how long it takes for a deposit to finish and render in the Hyrax UI (e.g., by delaying the creation of derivatives). We propose removing this feature for now and encourage folks who need this feature to contribute it back with a better-performing, more streamlined design.
fritzfreiheit
pushed a commit
that referenced
this issue
Jan 30, 2020
While looking into #4100, we noticed that `CharacterizeJob` not only saves a file set's `original_file`, but also reindexes the file set, *and* reindexes every collection to which the file set's work belongs. It appears that this feature was added way back in the Sufia days in order to support the use case of displaying on the collection show page the total size of all files uploaded to a collection's works. Why is this problematic? At large scale, this means touching a potentially huge portion of the repository—iterating over all files in all works in a collection—in order to display an arguably useful string to a subset of repository users who care about such information. Doing this in `CharacterizeJob` introduces significant potential delay in the ingest pipeline, increasing how long it takes for a deposit to finish and render in the Hyrax UI (e.g., by delaying the creation of derivatives). We propose removing this feature for now and encourage folks who need this feature to contribute it back with a better-performing, more streamlined design.
fritzfreiheit
pushed a commit
that referenced
this issue
Jan 31, 2020
While looking into #4100, we noticed that `CharacterizeJob` not only saves a file set's `original_file`, but also reindexes the file set, *and* reindexes every collection to which the file set's work belongs. It appears that this feature was added way back in the Sufia days in order to support the use case of displaying on the collection show page the total size of all files uploaded to a collection's works. Why is this problematic? At large scale, this means touching a potentially huge portion of the repository—iterating over all files in all works in a collection—in order to display an arguably useful string to a subset of repository users who care about such information. Doing this in `CharacterizeJob` introduces significant potential delay in the ingest pipeline, increasing how long it takes for a deposit to finish and render in the Hyrax UI (e.g., by delaying the creation of derivatives). We propose removing this feature for now and encourage folks who need this feature to contribute it back with a better-performing, more streamlined design.
I think this might be a wontfix, since there's now a characterization infrastructure that works with Valkyrie and is used for Valkyrie driven objects by default |
This work was done in #5321 |
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Descriptive summary
Convert CharacterizeJob to allow it to receive either an ActiveFedora based FileSet or a Valkyrie::Resource equivalent FileSet.
Rationale
Allow jobs to execute correctly with existing AF implementation or the wings Valkyrie implementation.
Expected behavior
CharacterizeJob receives a Valkyrie::Resource and performs the characterization.
Actual behavior
CharacterizeJob raises an exception when passed a Valkyrie::Resource.
Approach for Conversion
See Pattern for Jobs Receiving Parameter that may be an AF Object or a Valkyrie Resource
Steps to reproduce the behavior
See example of processing both objects in... file_set_actor_spec.rb
For testing this job, you can use something like...
OR you can write tests that are specific to each object type, similar to the tests in spec/jobs/hyrax/revoke_edit_job_spec.rb. This approach doubles the number of tests and doesn't provide certainty that the new code works exactly the same as the old code if the two versions of the tests diverge. The approach to take is at the discretion and judgement of the developer.
Related work
These have all be modified to support AF object or Valkyrie resource using the [established pattern]((https://docs.google.com/document/d/1f9yDjtkXapFf1DYLyPhprnB4CPGaZw9p_k8T_tcor-U/edit#heading=h.qt9rxydfd727).
The text was updated successfully, but these errors were encountered: