I161 import collection resources #933
A user should be able to import a collection resource. In this commit, we are able to successfully import and create collection resources. From the console we can see that the collection formed relationships with works, but the frontend's count and display show 0 relationships. Additionally, we are unable to rerun the importer without receiving errors on the collection entry. TODO: specs, refactor, Issue: - scientist-softserv/hykuup_knapsack#161
…esources * hyrax-4-valkyrie-support: Adding index to schema ♻️ Reworking structure ♻️ Remove constant
…lkrax into hyrax-4-valkyrie-support
This refactor was necessary because even though klass == ImageResource, which inherits from Valkyrie::Resource through its chain, klass === Valkyrie::Resource was returning false.
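The result described in this commit message is consistent with how Ruby defines `Module#===`: it tests whether the right-hand operand is an *instance* of the receiver, not whether one class inherits from another. A minimal sketch, using hypothetical stand-in classes (`BaseResource` in place of `Valkyrie::Resource`) so it runs outside a Hyrax application:

```ruby
# Hypothetical stand-ins; BaseResource plays the role of Valkyrie::Resource.
class BaseResource; end
class ImageResource < BaseResource; end

klass = ImageResource

# `===` on a class asks "is the argument an instance of me?".
# A class object is an instance of Class, so both of these are false.
reversed_check = (klass === BaseResource)
instance_check = (BaseResource === klass)

# `<=` asks "is klass the same class as, or a descendant of, BaseResource?".
inheritance_check = (klass <= BaseResource)

# `===` does succeed once an actual instance is involved.
instance_ok = (BaseResource === klass.new)

puts [reversed_check, instance_check, inheritance_check, instance_ok].inspect
```

Whether the refactor in this PR uses `<=`, `ancestors.include?`, or another check is not shown here; the sketch only illustrates why the `===` comparison could not succeed.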
This reverts commit 4ab31b6.
@@ -104,7 +104,7 @@ def run!

 def update
   raise "Object doesn't exist" unless object
-  destroy_existing_files if @replace_files && ![Collection, FileSet].include?(klass)
+  destroy_existing_files if @replace_files && ![Collection, CollectionResource, FileSet].include?(klass)
What happens when CollectionResource is not defined? We need to favor a class method to answer the following: `Bulkrax.config.collection_classes` and `Bulkrax.config.file_set_classes`, as we can't hard-code constants that may not exist in other non-Hyku 6 implementations.
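One way to read this suggestion is a configuration object that stores class *names* and resolves only those that are actually loaded. The sketch below is an assumption-laden illustration, not Bulkrax's implementation: `SketchConfig` and its `*_class_names` accessors are hypothetical, and only the reader method names `collection_classes` and `file_set_classes` come from the comment above.

```ruby
# Hypothetical configuration object; not Bulkrax's actual API.
class SketchConfig
  attr_writer :collection_class_names, :file_set_class_names

  # Names are stored as strings so that referencing them never raises NameError.
  def collection_class_names
    @collection_class_names ||= %w[Collection CollectionResource]
  end

  def file_set_class_names
    @file_set_class_names ||= %w[FileSet]
  end

  def collection_classes
    resolve(collection_class_names)
  end

  def file_set_classes
    resolve(file_set_class_names)
  end

  private

  # Keep only the names that resolve to a constant in this application.
  def resolve(names)
    names.select { |name| Object.const_defined?(name) }
         .map { |name| Object.const_get(name) }
  end
end

# Demo classes so the sketch runs on its own; CollectionResource is
# deliberately left undefined to mimic an application without it.
class Collection; end
class FileSet; end

config = SketchConfig.new
skipped_classes = config.collection_classes + config.file_set_classes
puts skipped_classes.inspect
```

With a lookup like this, the guard in `update` could test `skipped_classes.include?(klass)` without naming `CollectionResource` directly, so applications that never define it are unaffected.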
ohhh yea, great point.
this should be resolved with the changes you've made in your PR
@@ -102,7 +100,7 @@ def create_work(object:, attrs:)
   perform_transaction_for(object: object, attrs: attrs) do
     transactions["work_resource.create_with_bulk_behavior"]
       .with_step_args(
-        "work_resource.add_to_parent" => { parent_id: attrs[:parent_id], user: @user },
+        "work_resource.add_to_parent" => { parent_id: attrs[related_parents_parsed_mapping], user: @user },
We need to test importing children in addition to parents.
confirmed that this works.
collection-works copy.csv
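The diff in this thread swaps a hard-coded `:parent_id` key for the value of `related_parents_parsed_mapping`. A minimal sketch of why that matters, assuming (hypothetically) that the parsed attributes are keyed by a configurable mapping name such as `"parents"`; the attribute hash and the mapping value below are illustrative, not taken from the PR:

```ruby
# Hypothetical parsed attributes for one CSV row; the "parents" key is an assumption.
attrs = { "title" => ["Child work"], "parents" => ["collection-1"] }

# Stand-in for the parser's configured mapping name.
related_parents_parsed_mapping = "parents"

# The hard-coded key finds nothing when the data uses the configured name.
hard_coded = attrs[:parent_id]

# Looking up by the configured name returns the parent identifiers.
configured = attrs[related_parents_parsed_mapping]

puts hard_coded.inspect
puts configured.inspect
```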
…esources * hyrax-4-valkyrie-support: Addressing TODO and minor refactoring ♻️ Favor Bulkrax.object_factory and add fault tolerance
* Revert "WIP - try to import filesets with valkyrie resources" This reverts commit 4ab31b6. * WIP * WIP - try to import filesets with valkyrie resources * 🚧 WIP: get filesets to import via bulkrax x valkyrie * 🎉 WIP: filesets to imports via bulkrax x valkyrie There's still a lot to clean up here, but the import is successful in this commit. * 💄 rubocop fixes * uncomment #get_s3_files call and add collections to configuration * Update object_factory.rb * ♻️ Move method and remove single instance definition I'm unclear why we were defining methods on the conf instance; especially given that these exist on the configuration. With this refactor, we're favoring using the Configuration object as the container. * Revert changes due to refactor coming in from main * address errors post big refactor * Refactoring for consistent method signatures Also avoiding setting an unused instance variable * 🐛 remove passing user to work_resource add_file_sets and save merge to strategies Importing a CSV of valkyrie works, collections, files and relationships is working at this point 🎉 * 🎁 Adding a new transaction step to handle different association * ♻️ Extract update_index method to object factory * ♻️ Extract object factory method * ♻️ Extract add_resource_to_collection method * ♻️ XIT out the mockery and stubbery of a spec * ♻️ Extract method publish and add_child_to_parent_work * ♻️ Rename method as it's not conditional Yes, it is conditional but it operates on arrays that could be empty. * Remove add to collection step * 🐛 Fix publish parameter mismatch * Removing custom transaction container. We weren't using it * Favor keyword args instead of hashes * 💄fixing typo * 🎁 Add update_collection to valkyrie object factory * 💄 endless and ever appeasing of the coppers --------- Co-authored-by: Jeremy Friesen <jeremy.n.friesen@gmail.com>
* create an object factory that supports Valkyrie All code in this commit has been adapted from Surfliner: https://github.com/surfliner/surfliner-mirror * temp gem conflict workaround * ⚙️ upgrade dry-monads dependency to ~> 1.5.0 Hyrax 4.0.0 requires a dependency upgrade for dry-monads. I could not upgrade GBH's bulkrax without doing this change. - Issue: - scientist-softserv/ams#77 - Ref: - https://github.com/samvera/hyrax/blob/cbe9278b919485f90a37630d3f3157ecef59cd7c/hyrax.gemspec#L48 * 🧹 Add extra parameter for fill_in_blank_source_identifiers gbh got an error that we were passing too many arguments when setting the source_identifier in the bulkrax config. ref: - https://github.com/samvera-labs/bulkrax/wiki/Configuring-Bulkrax#source-identifier * Revert ":broom: Add extra parameter for fill_in_blank_source_identifiers" This reverts commit df96de6. * 🧹 delegate create_parent_child_relationships from importer to parser * allow ruby 3 syntax in migrations * 🧹 change exists? to exist? to support Ruby 3.2 * 🚧 add support for Hyrax 5, valkyrie and ruby 3.2 * add temp workaround for blank title and creator * ⚙️ Switch find methods with custom queries for Valkyrie * hyrax 4 permission service does both valk and non-valk * new bagit * handle validation failure * better failure detection for vaklyrie object * fix validation message * importer failure helpers * improve multiple detection in matchers * fix matcher on missing field * rob cant remember that its include? * Appeasing rubocop * ♻️ Handle exist? and/or exists? 
for finding objects See inline comments * Add dry/monads require for specs * I897 Bulkrax readiness for Hyku 6 and Hyrax 4 & 5 (#898) * 🧹 relocates transactions from inititalizer file Issue: - #897 Co-Authored-By: LaRita Robinson <laritakr@users.noreply.github.com> * 🧹 Add specs for container.rb, relocate files Co-Authored-By: LaRita Robinson <laritakr@users.noreply.github.com> * 🧹 normalize magic strings into constants for referencing later Convert the create_with_bulk_behavior and update_with_bulk_behavior to a constant; that way we can reference it in IiifPrint and document the “magic” string. Co-Authored-By: LaRita Robinson <laritakr@users.noreply.github.com> * 🧹 correct camel case to constant notation for easier referencing Co-Authored-By: LaRita Robinson <laritakr@users.noreply.github.com> * 💄 rubocop fixes Co-Authored-By: LaRita Robinson <laritakr@users.noreply.github.com> * Update app/factories/bulkrax/valkyrie_object_factory.rb * Update spec/bulkrax/transactions/container_spec.rb * 🧹 Move container & steps Match Hyrax convention by using bulkrax/transactions. * restructure org to run specs locally receiving error when trying to run the entire spec suite due to restructuring files but not moving the spec file. * 🚧 WIP: Consolidate HasMatchers with HasMappingExt Remove HasMappingExt and consolidate logic within HasMatchers. HasMatchers should handle both cases, when objects are ActiveFedora vs Valkyrie. * 🧹 Fix Specs & add Valkyrie Specs * 🧹 Fix Rubocop complaint * 🧹 Address Valkyrie's determination of multiple? * 🧹 Address permitted attributes In Valkyrie, we use the schema to identify the permitted attributes. All allowed attributes should be on the schema, so no additional attributes should be required. Also add a fallback for permitted attributes in case an ActiveFedora model class goes through the ValkyrieObjectFactory. This supports the case where we want to always force a Valkyrie resource to be created, regardless of the model name given. 
* 🧹 Update TODO comment Adjust TODO message because referring to a handler that doesn't exist anywhere is confusing. We may need to register steps for file sets once the behavior is implemented. --------- Co-authored-by: LaRita Robinson <laritakr@users.noreply.github.com> Co-authored-by: Jeremy Friesen <jeremy.n.friesen@gmail.com> Co-authored-by: LaRita Robinson <larita@scientist.com> * 📚 Adding documentation for configuration (#896) This builds on a [question asked in Slack][1] [1]: https://samvera.slack.com/archives/C03S9FS60KW/p1705681632335919 * ♻️ Extract Bulkrax::FactoryClassFinder (#900) This refactor introduces consolidating logic for determining an entry's factory_class. The goal is to begin to allow for us to have a CSV record that says "model = Work" and to use a "WorkResource". Note, there are downstream implementations that overwrite `factory_class` and we'll need to consider how we approach that. * 🐛 [i134] - Fix missing translations Missing translations were evaluating to false. Issue: - scientist-softserv/hykuup_knapsack#134 * Renaming method for parity * ♻️ Favor Bulkrax's persistence layer Instead of direct calls to a deprecated service favor a persistence layer call; one that defines an interface. Note this means we need to implement the methods in the Valkyrie adapter; but those should be trivial. * ♻️ Favor Bulkrax.persistence_adapter over ActiveFedora::Base * Moving methods to adapter pattern * use find_by_source_identifier instead of find_by_bulkrax_identifier (#907) * i903 - move bulkrax identifier custom queries into bulkrax move bulkrax identifier custom queries into bulkrax Issue: - scientist-softserv/hykuup_knapsack#136 * make find_by_source_identifier dynamic Import a csv with child works. The forming of relationships is not working. Part of the problem is the find_by_bulkrax_identifier call. From GBH, this used to be find_by_bulkrax_identifier which not all clients will configure as their source identifier. 
Instead we need to ask for the source identifier and use that for the sql query. This commit goes along with a PR from Hyku which currently has the find_by_source_identifier.rb files defined. Issue: - scientist-softserv/hykuup_knapsack#128 Co-Authored-By: Kirk Wang <k3wang@gmail.com> * remove files: they live in Hyku for now Co-Authored-By: Kirk Wang <k3wang@gmail.com> * 🧹 Place custom queries back in Bulkrax * 🧹 remove misleading comment Co-Authored-By: Kirk Wang <k3wang@gmail.com> * 🧹 Entry is a required argument when initializing ObjectFactory Fix for broken specs Co-Authored-By: Kirk Wang <k3wang@gmail.com> * revert changes to pass Entry arg The object factory already has work_identifier: parser.work_identifier. we don't need the entry argument after all. ref: - https://github.com/samvera/bulkrax/blob/main/app/models/concerns/bulkrax/import_behavior.rb#L181 Co-Authored-By: Kirk Wang <k3wang@gmail.com> --------- Co-authored-by: Kirk Wang <k3wang@gmail.com> Co-authored-by: Kirk Wang <kirk.wang@scientist.com> * 🧹 Make CreateRelationshipJob work for Valkyrie (#908) * 🧹 Make the relationships job work for Valkyrie This will add a relationships path for Valkyrie objects. It also will add a transactions call so set child flag will fire off in IIIF Print. ref: - scientist-softserv/hykuup_knapsack#141 * 💄 rubocop fix Co-Authored-By: Kirk Wang <k3wang@gmail.com> * ♻️ Adjust rescue logic to move closer to error This also adds some consideration for refactoring the queries to instead use the persistence layer. * Adding notes about transactions --------- Co-authored-by: Shana Moore <shana@scientist.com> Co-authored-by: Jeremy Friesen <jeremy.n.friesen@gmail.com> * Add todo comment Co-Authored-By: Kirk Wang <k3wang@gmail.com> * 🎁 Switch transaction to listener This commit will switch the membership transaction to a listener. 
* ♻️ Migrate persistence layer methods to object factory (#911) * ♻️ Migrate persistence layer methods to object factory In review of the code and in brief discussion with @orangewolf, the methods of the persistence layer could be added to the object factory. We already were configuring the corresponding object factory for each implementation of Bulkrax; so leveraging that configuration made tremendous sense. The methods on the persistence layer remain helpful (perhaps necessary) for documented reasons in the `Bulkrax::ObjectFactoryInterface` module. See: - #895 and its discussion * 🎁 Add Valkyrie object factory interface methods * 🧹 Favor interface based exception Given that we are not directly exposing ActiveFedora nor Hyrax nor Valkyrie objects, we want to translate/transform exceptions into a common exception based on an interface. That way downstream implementers can catch the Bulkrax specific error and not need to do things such as `if defined?(ActiveFedora::RecordInvalid) rescue ActiveFedora::RecordInvalid` It's just funny looking. * 🧹 Get exporters to work This commit contains various changes to get the exporters to work correctly. * make updates work * 🧹 Make DeleteJob work wth new class method .find (#912) * 🧹 Make DeleteJob work wth new class method .find The DeleteJob previously was not working with the old factory#find method because when it is doing a delete action, the parsed_metadata does not get generated like during a regular import. Because of this, the #search_by_identifier method fails to find anything because we don't have a `work_identifier` field which would have came from the parsed_metadata. So instead, we are using the new class method .find which will take an id (which we find on the raw_metadata) to find the object. We make sure to reindex and publish the action to any relevant listeners. 
* 🎁 Implement a #delete method for the ObjectFactory This commit will add a delete method to the ObjectFactory and the ValkyrieObjectFactory so we can avoid unnecessary conditionals. * 🧹 Rework factories to implement delete method This cuts down on the method chaining. * ♻️ Remove constant This creates hard to parse chatter, and is not needed as we were relying on it for IIIF Print to be able to reference. * ♻️ Reworking structure The Hyrax transactions create a lot of pre-amble and post-amble for performing the save. This commit attempts to consolidate logic to reduce redundancy of that boilerplate. Further, it adds handling for creating collections. We still need to handle form validation. * Adding index to schema * ♻️ Favor asking about model_name over class (#934) Given our effort at lazy migration in Bulkrax we want to do a bit more sniffing regarding the objects. This is not quite adequate for the general case of Collections but it is an improvement. Ideally we should be interrogating the class and asking `klass.collection?` but there are some confounding edge cases around routing that we are in this pickle. 
```ruby irb(main):002:0> CollectionResource.model_name => @collection="collections", @element="collection", @Human="Collection", @i18n_key=:collection, @klass=CollectionResource, @name="CollectionResource", @param_key="collection", @plural="collections", @route_key="collections", @Singular="collection", @singular_route_key="collection"> irb(main):003:0> Collection.model_name => @collection="collections", @element="collection", @Human="Collection", @i18n_key=:collection, @klass=Collection, @name="Collection", @param_key="collection", @plural="collections", @route_key="collections", @Singular="collection", @singular_route_key="collection"> irb(main):004:0> ``` * Favor object factory for find * ♻️ Fix return value of transaction create * Correct Hyrax.object_factory -> Bulkrax.object_factory * Download cloud files later (#930) * 🎁 Reschedule ImporterJob if downloads aren't done This commit will add a check in the `ImporterJob` to see if the cloud files finished downloading. If they haven't, the job will be rescheduled until they are. * 🎁 Download Cloud Files later This commit will bring in changes from `5.3.1-british_library` to move the download of cloud files to a background job. --------- Co-authored-by: Jeremy Friesen <jeremy.n.friesen@gmail.com> * ♻️ Favor configuration over hard-coding and reaching assumptions The main "flip" of logic is that we can remove the `curation_concern?` method because we can instead ask "if Collection || FileSet" and infer when that is false that we have a work. This means removing the very reaching assumption of Hyku and it's implementation foibles for work types. * ♻️ Extract Bulkrax.collection_class_method Instead of relying on the hard-coding, allow for configuration. Co-authored-by: Shana Moore <shana.lavina.moore@gmail.com> * ♻️ Favor Bulkrax.collection_model_class Co-authored-by: Shana Moore <shana@scientist.com> * ♻️ Favor Bulkrax.object_factory.find Instead of relying on the direct call to a constant. 
Co-authored-by: Shana Moore <shana@scientist.com> * ♻️ Extract Bulkrax.object_factory.save! method for We have a place where we try to call save! directly. We do need to pass a user for save event; hence the added method. * ♻️ Favor using object_factory for save! Co-authored-by: Shana Moore <shana@scientist.com> * ♻️ Extract Hyrax.object_factory.search_by_property There is a duplication of this logic elsewhere, but I first wanted to extract common logic then begin extracting full replacement and conforming object interface for Valkyrie. * ♻️ Extract method for Valkyrization We cannot directly query the class. But must instead favor the object_factory. * 🎁 Adding query for find_by_model_and_property_value * ♻️ Remove custom Valkyrie search_by_identifer The super method was refined to use the class object factory; making it redundant and flexible in the same manner as `Bulkrax::ObjectFactory#search_by_identifer`. * ♻️ Favor internal_resource definitions (when available) * ♻️ Extract internal_resources method for curation concerns * ♻️ Favor Bulkrax.object_factory and add fault tolerance * Addressing TODO and minor refactoring * I161 import collection resources (#933) * 🚧 WIP: Import Collection Resource A user should be able to import a collection resource. In this commit, we are able to successfully import and create collection resources. From the console we can see the collection formed relatioships with works, but the frontend's count and display shows 0 relationships. Additionally, we are unable to re run the importer without receiving errors on the collection entry. TODO: specs, refactor, Issue: - scientist-softserv/hykuup_knapsack#161 * remove unused code * refactor #conditionally_destroy_existing_files This refactor was necessary because even though klass == ImageResource, which inherits from Valkyrie::Resouce through it's chain, klass === Valkyrie::Resource was returning false. 
* exclude CollectionResource class from #destroy_existing_files * WIP - try to import filesets with valkyrie resources * Revert "WIP - try to import filesets with valkyrie resources" This reverts commit 4ab31b6. * 💄 rubocop fix * i162 - import valkyrie works with filesets (#936) * Revert "WIP - try to import filesets with valkyrie resources" This reverts commit 4ab31b6. * WIP * WIP - try to import filesets with valkyrie resources * 🚧 WIP: get filesets to import via bulkrax x valkyrie * 🎉 WIP: filesets to imports via bulkrax x valkyrie There's still a lot to clean up here, but the import is successful in this commit. * 💄 rubocop fixes * uncomment #get_s3_files call and add collections to configuration * Update object_factory.rb * ♻️ Move method and remove single instance definition I'm unclear why we were defining methods on the conf instance; especially given that these exist on the configuration. With this refactor, we're favoring using the Configuration object as the container. * Revert changes due to refactor coming in from main * address errors post big refactor * Refactoring for consistent method signatures Also avoiding setting an unused instance variable * 🐛 remove passing user to work_resource add_file_sets and save merge to strategies Importing a CSV of valkyrie works, collections, files and relationships is working at this point 🎉 * 🎁 Adding a new transaction step to handle different association * ♻️ Extract update_index method to object factory * ♻️ Extract object factory method * ♻️ Extract add_resource_to_collection method * ♻️ XIT out the mockery and stubbery of a spec * ♻️ Extract method publish and add_child_to_parent_work * ♻️ Rename method as it's not conditional Yes, it is conditional but it operates on arrays that could be empty. * Remove add to collection step * 🐛 Fix publish parameter mismatch * Removing custom transaction container. 
We weren't using it * Favor keyword args instead of hashes * 💄fixing typo * 🎁 Add update_collection to valkyrie object factory * 💄 endless and ever appeasing of the coppers --------- Co-authored-by: Jeremy Friesen <jeremy.n.friesen@gmail.com> --------- Co-authored-by: Jeremy Friesen <jeremy.n.friesen@gmail.com> * ♻️ Extract logic for add_user_to_collection_permissions * 📚 Tidying documentation * ♻️ Refactor Object Factories to leverage more inheritance * ♻️ Extract abstract class for ObjectFactory In constructing object inheritance, a more robust strategy is to create an abstract class and then have classes directly extend that abstract class. This helps define and narrow an interface. * ♻️ Move method to interface This is used in both ObjectFactory and ValkyrieObjectFactory * ♻️ Organizing code for Valkyrie Object Factory * Refactoring method names for sorting order * ♻️ Handle Valkyrie::Resource situation * ♻️ Puzzling through implementation details * ♻️ Extract method to enable removal of conditionals * ♻️ Extract FileFactory::InnerWorkings The goal of this extraction is to minimize the exposed interface of what is quite complicated and state dependent logic. * ♻️ Refactor to extract local variable * Adding class attribute for Bulkrax::FileFactory * ♻️ Adding inner methods for file factory interaction * 🐛🏳️ post Big refactor fixes Refactoring caused some bugs. At this point we are able to successfully import CSV works again. * fix typo * 🧹 Add case for `'collectionresource'` In Valkyrie Hyku we're using CollectionResource and this was not being recognized by the CSV parser. * reload the object before calling persisted? on it resolves failure saying that errors is undefined. object.persisted? returned false even though we could see that they got created in the UI. * 💄 rubocop fix * 🐛 Add return in ObjectFactory if valkyrie Adding this early return here so we don't go down to the the #where and trigger a NoMethodError. 
What it seems like it's doing is checking Postgres for the object but if it doesn't find it then tries in Fedora, however, Valkyrie object don't respond to #where so it throws an error. * save parent object to establish relationships This fixes the reason why works weren't forming relationships with other works * Add FileSet branch to coercer conditional This is in prep to handle Hyrax::FileSets being imports as rows. * Add commit to clarify casecmp in CsvParser * 🎁 Add ability to use tar.gz files This commit will allow users to use tar.gz files as well as zip files for importing. * 🐛 Changing guard to #respond_to?(:where) A spec was failing with the previous way we were checking. * 🎁 Change glyphicon to font awesome Hyrax 4+ applications use font awesome and not glyphicon. This commit will convert all glyphicon to font awesome. * Add require ruby-progressbar (#942) Update bulkrax_tasks.rake Fixes #941 * 🐛 Ensure we include visibility and other keywords for collection Related to: - scientist-softserv/hykuup_knapsack#182 Co-authored-by: LaRita Robinson <laritakr@users.noreply.github.com> * 🐛 Fix visibility check on the object This commit will add a guard for visibility because it is not on a valkyrie resource. * 🐛 Save provided visibility from CSV CSV provided visibility was being clobbered in the ImportCollectionJob. Refs scientist-softserv/hykuup_knapsack#182 * ♻️ Extract methods for better composition * ♻️ Extracting object factory methods I want to avoid having conditionals regarding object factories. This violates the polymorphism and means that other implementors that choose a different `Bulkrax.object_factory` will have unintended consequences. 
* 💄 endless and ever appeasing of the coppers * ♻️ Favor object factory over hard-coded * Amend the see/refer documentation for parser * 💄 endless and ever appeasing of the coppers * Updating test schema * Remove transactions from initialization * ♻️ Remove explicit calls to AdminSet * 📚 Adding TODO items --------- Co-authored-by: Benjamin Kiah Stroud <32469930+bkiahstroud@users.noreply.github.com> Co-authored-by: Rob Kaufman <rob@notch8.com> Co-authored-by: Kirk Wang <kirk.wang@scientist.com> Co-authored-by: Jeremy Friesen <jeremy.n.friesen@gmail.com> Co-authored-by: LaRita Robinson <laritakr@users.noreply.github.com> Co-authored-by: LaRita Robinson <larita@scientist.com> Co-authored-by: Kirk Wang <k3wang@gmail.com> Co-authored-by: Dan Kerchner <kerchner@users.noreply.github.com>
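One note in the commit log above describes translating backend-specific exceptions into a common, interface-level exception so that downstream code does not need `if defined?(...)` checks before rescuing. A minimal sketch of that pattern follows; every name in it (`SketchImporter`, `RecordInvalid`, `BackendError`, `BackendFactory`) is hypothetical and chosen only for illustration:

```ruby
module SketchImporter
  # Common error that callers rescue, regardless of the persistence backend.
  class RecordInvalid < StandardError; end

  # Hypothetical backend-specific error, standing in for an adapter's own exception.
  class BackendError < StandardError; end

  class BackendFactory
    # Saves a record (a plain hash in this sketch) and re-raises any
    # backend-specific failure as the common RecordInvalid error.
    def self.save!(record)
      raise BackendError, "title is blank" if record[:title].to_s.empty?

      record
    rescue BackendError => e
      raise RecordInvalid, e.message
    end
  end
end

# Callers only need to know about the common error class.
begin
  SketchImporter::BackendFactory.save!({ title: "" })
rescue SketchImporter::RecordInvalid => e
  rescued_message = e.message
end

puts rescued_message
```

The design choice is that each factory owns the translation, so adding another backend means adding another `rescue` inside that backend's factory rather than touching callers.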
This work allows Valkyrie collections to be imported via Bulkrax along with works. File sets will be handled in a different PR.
Issue: