Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add indexing for Hyrax::FileSet through valkyrie indexing #4780

Merged
merged 2 commits into from
Mar 16, 2021

Conversation

elrayle
Copy link
Contributor

@elrayle elrayle commented Mar 9, 2021

Fixes #4766

Description

This creates indexing for Hyrax::FileSet, which is a Valkyrie::Resource, using the Valkyrie indexing process called from the MetadataIndexListener.

Questions

There are two questions posed in the comments in the file changes.

  • Is the refactor of Hyrax::FileSetIndexer to Hyrax::ActiveFedoraFileSetIndexer acceptable? It is necessary for Valkyrie indexing naming conventions, but does represent a public API change.
  • Do we want to index all file metadata?

Known remaining issues

Jobs that need to add support for Valkyrie

Valkyrie works do not include FileSets as members in solr docs.

@samvera/hyrax-code-reviewers

@elrayle elrayle force-pushed the fix/4766_fileset_valk-solr branch 7 times, most recently from 1f649a6 to e2bb038 Compare March 12, 2021 17:06
##
# Indexes Hyrax::FileSet objects
class FileSetIndexer < Hyrax::ValkyrieIndexer
include Hyrax::ResourceIndexer
Copy link
Contributor Author

@elrayle elrayle Mar 12, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the refactor of Hyrax::FileSetIndexer to be the indexer for Hyrax::FileSet (valkyrie resource) instead of ::FileSet (AF object). This is needed because Hyrax::ValkyrieIndexer #for method uses naming convention to determine the indexer to use for a given Valkyrie resource.

solr_doc['type_tesim'] = file_metadata.type if file_metadata.type.present?

# attributes set by fits
solr_doc['format_label_tesim'] = file_metadata.format_label if file_metadata.format_label.present?
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we want to include all file metadata in the solr_doc? Right now, that is what this does.


# attributes set by fits
solr_doc['format_label_tesim'] = file_metadata.format_label if file_metadata.format_label.present?
solr_doc['size_tesim'] = file_metadata.size if file_metadata.size.present?
Copy link
Contributor Author

@elrayle elrayle Mar 12, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Size is recorded in the solr doc twice.

  • At line 37, it is recorded as a long. (Consistent with how it is done in the AF generated solr doc.)
  • Here is is recorded as a string. (Consistent with how all the other FileMetadata fields are indexed.)

@@ -6,7 +6,7 @@ module Indexing

included do
# the default indexing service
self.indexer = Hyrax::FileSetIndexer
self.indexer = Hyrax::ActiveFedoraFileSetIndexer
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Completes the refactor moving the AF indexing to Hyrax::ActiveFedoraFileSetIndexer (for AF objects) out of Hyrax::FileSetIndexer (now being used for Valkyrie resource Hyrax::FileSet).

@@ -134,4 +134,114 @@
expect(actor.send(:validate_remote_url, URI('https://example.com/test.txt'))).to be true
end
end

context 'when work is a valkyrie resource' do
let(:work) { valkyrie_create(:monograph) }
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These are the same tests as above, but run with a Valkyrie resource instead of AF Object.

RSpec.describe Hyrax::ActiveFedoraFileSetIndexer do
include Hyrax::FactoryHelpers

let(:file_set) do
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a straight up move of the code that was in file_set_indexer_spec.rb (which used to test the indexer for AF ::FileSet, but now tests indexer for Valkyrie Hyrax::FileSet).

@@ -2,148 +2,249 @@
RSpec.describe Hyrax::FileSetIndexer do
include Hyrax::FactoryHelpers

let(:fileset_id) { 'fs1' }
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This spec used to test the indexer for the AF ::FileSet, but now tests the indexer for the Valkyrie Hyrax::FileSet.

@elrayle elrayle changed the title WIP: index Hyrax::FileSet in the valkyrie index Add indexing for Hyrax::FileSet through valkyrie indexing Mar 12, 2021
module Hyrax
class FileSetIndexer < ActiveFedora::IndexingService
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i'm worried about this name swap. this would be a high-impact breaking change, right?

@@ -87,7 +87,7 @@ def attach_to_work(work, file_set_params = {})
def attach_to_valkyrie_work(work, file_set_params)
work = Hyrax.query_service.find_by(id: work.id) unless work.new_record
file_set.visibility = work.visibility unless assign_visibility?(file_set_params)
Hyrax.persister.save(resource: file_set)
@file_set = Hyrax.persister.save(resource: file_set)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you say more about this instance variable?

it will be cached with this actor in the ActionDispatch::MiddlewareStack pool, which will hold it in memory until the next time its set. this can lead to unexpected behavior. see #3722

@@ -61,9 +61,17 @@ def attach_files(env, remote_files)
true
end

def create_file_from_url(env, uri, file_name, auth_header)
if env.curation_concern.is_a? Valkyrie::Resource
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is small and optional, but i'm preferring switch statements for behavior that's switching on type:

case env.curation_concern
when Valkyrie::Resource
   create_file_from_url_through_valkyrie(env, uri, file_name, auth_header)
else
   create_file_from_url_through_active_fedora(env, uri, file_name, auth_header)
end

@elrayle elrayle force-pushed the fix/4766_fileset_valk-solr branch 2 times, most recently from a124f4b to ab248f5 Compare March 12, 2021 20:45
app/actors/hyrax/actors/file_actor.rb Show resolved Hide resolved
solr_doc['type_tesim'] = file_metadata.type if file_metadata.type.present?

# attributes set by fits
solr_doc['format_label_tesim'] = file_metadata.format_label if file_metadata.format_label.present?
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we want to include all file metadata in the solr_doc? Right now, that is what this does.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍🏻 seems okay to add a lot now and try to tidy up another time.

app/indexers/hyrax/valkyrie_indexer.rb Outdated Show resolved Hide resolved
@elrayle elrayle force-pushed the fix/4766_fileset_valk-solr branch 2 times, most recently from 8cd3494 to 104caec Compare March 12, 2021 21:57
This adds a basic solr doc for Hyrax::FileSet which is a Valkryie::Resource.
solr_doc['type_tesim'] = file_metadata.type if file_metadata.type.present?

# attributes set by fits
solr_doc['format_label_tesim'] = file_metadata.format_label if file_metadata.format_label.present?
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍🏻 seems okay to add a lot now and try to tidy up another time.

app/models/hyrax/work.rb Show resolved Hide resolved
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Create a FileSet solr doc in Valkyrie solr when files are added to a Work
2 participants