Metrics tab in portals fail for large collections #1346

Closed
laurenwalker opened this issue Apr 20, 2020 · 2 comments

@laurenwalker

Example: https://knb.ecoinformatics.org/portals/sasap

To compute the various metrics (e.g. total volume of data), MetacatUI constructs a Solr query that contains one OR boolean clause for each data pid in the portal, which means thousands of clauses for a large collection. Solr rejects the query with a 500 server error because there are too many clauses. We need to find a way to reconstruct this query so it doesn't have to search by pid. With Metacat 2.13.0 upgrading to Solr 7, we may be able to take advantage of new query features (e.g. joins).
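To make the failure mode concrete, here is a minimal sketch of how a per-pid query grows. It is not MetacatUI's actual code: the `build_pid_query` helper, the `id` field name, and the example pids are hypothetical, and it assumes the limit being hit is Solr's `maxBooleanClauses` setting, which defaults to 1024.

```python
# Hypothetical illustration: the helper name, field name, and pids are made up.
MAX_BOOLEAN_CLAUSES = 1024  # Solr's default maxBooleanClauses (assumed to be the limit hit)


def build_pid_query(pids):
    """Build an id:(... OR ...) query with one boolean clause per pid."""
    quoted = ['"{}"'.format(p.replace('"', '\\"')) for p in pids]
    return "id:(" + " OR ".join(quoted) + ")"


# A large portal can easily contain thousands of data objects.
pids = ["urn:uuid:example-{}".format(i) for i in range(5000)]
query = build_pid_query(pids)

# One clause per pid, so the clause count grows linearly with the collection.
clause_count = query.count(" OR ") + 1
exceeds_limit = clause_count > MAX_BOOLEAN_CLAUSES
```

With 5000 pids the query carries 5000 clauses, well past the assumed default, which is why the fix has to avoid enumerating pids at all.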

@laurenwalker laurenwalker added bug portals Anything related to portals labels Apr 20, 2020
@laurenwalker laurenwalker self-assigned this Apr 20, 2020
@laurenwalker laurenwalker added this to the 2.11.0 milestone Apr 23, 2020
laurenwalker added a commit that referenced this issue May 5, 2020
This got a start on #1356 as well.
This reduces the number of Solr queries sent by the StatsModel/View from
ten to two, and should drastically reduce the time it takes to
render the StatsView.
@laurenwalker

This bug fix is taking longer than expected because it required a refactor of the Stats Model. Solr joins are the solution to this issue, and using them requires a rewrite of all the queries and the way they are sent. And because Solr joins are not supported in Solr versions before 4.4 (I believe - the old Solr release notes aren't exactly clear), I am being careful to keep the join queries as a configurable option for MetacatUI users who may need to update their Solr version. (More about this in the release notes later...)

The good news is that this refactor let me get started on #1356. Previously, the Stats model was sending ten separate requests to Solr to get all the stats for the view. I've reduced this to two. It's considerably faster.

I don't want this to hold up the release of 2.11.0, so I put this in a feature branch for now and will try to get it into 2.11.1.

@laurenwalker laurenwalker modified the milestones: 2.11.0, 2.11.1, 2.11.3 May 5, 2020
@laurenwalker laurenwalker modified the milestones: 2.11.3, 2.12.0 May 14, 2020
laurenwalker added a commit that referenced this issue Jun 1, 2020
… it is used by the Year filter in DataCatalogView

Improved some areas of the StatsView when there is no metadata or data.
Added 'isSystemMetadataQuery' attribute to the Stats model so it can 
skip the resourceMap join when it's not necessary.
Added filter query to getMetadataStats() for the formatType and 
obsoletedBy fields.
Ref #1346
@laurenwalker

laurenwalker commented Jun 1, 2020

This work is now complete. The StatsView now just sends two queries to Solr - one for the metadata stats and one for the data stats. When the StatsView is used for queries based on science metadata filters, it uses a Solr join on the resourceMap field. This fixes the broken Metrics tabs for portals.
Joins can be disabled in the AppModel, but this should only be done for deployments that are using old versions of Solr (I believe joins were added in Solr 4.0.0-ALPHA). Note that deployments that turn off the Solr join feature may not show correct stats for portals whose query is based on science metadata fields.
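As a rough sketch of the join approach described above, the data stats request can wrap the portal's metadata query in Solr's `{!join from=... to=...}` query parser instead of listing pids. This is not the StatsModel implementation: the `build_data_stats_params` helper, the `use_joins` flag, the example metadata query, and the exact filter query values are assumptions for illustration only.

```python
from urllib.parse import urlencode


def build_data_stats_params(metadata_query, use_joins=True):
    """Build Solr request parameters for the data stats query (hypothetical helper)."""
    if use_joins:
        # Match data objects that share a resourceMap with metadata matching the query.
        q = "{!join from=resourceMap to=resourceMap}" + metadata_query
    else:
        # Fallback for old Solr versions: without the join, metadata-only
        # fields will not match data objects, so stats may be incomplete.
        q = metadata_query
    return {
        "q": q,
        # Assumed filter: current (not obsoleted) data objects only.
        "fq": "formatType:DATA AND -obsoletedBy:*",
        "rows": 0,
        "wt": "json",
    }


params = build_data_stats_params("keywords:salmon")
query_string = urlencode(params)
```

The query stays the same size no matter how many data objects the portal contains, which is the property the pid-based query lacked.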
