Metrics tab in portals fail for large collections #1346
Comments
This got a start on #1356 as well. This reduces the number of Solr queries sent by the StatsModel/View from ten to two, and should drastically reduce the time it takes to render the StatsView.
This bug fix is taking longer than expected because it requires a refactor of the Stats model: Solr joins are the solution to this issue, and that means rewriting all the queries and the way they are sent. And because Solr joins are not supported before Solr 4.4 (I believe; the old Solr release notes aren't exactly clear), I am being careful about keeping the … The good news is that this refactor let me get started on #1356. Previously, the Stats model was sending ten separate requests to Solr to get all the stats for the view. I've reduced this to two, and it's considerably faster. I don't want this to hold up the release of 2.11.0, so I put this in a feature branch for now and will try to get it into 2.11.1.
… it is used by the Year filter in DataCatalogView. Improved some areas of the StatsView when there is no metadata or data. Added 'isSystemMetadataQuery' attribute to the Stats model so it can skip the resourceMap join when it's not necessary. Added filter query to getMetadataStats() for the formatType and obsoletedBy fields. Ref #1346
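To make the commit note above concrete, here is a minimal sketch of how a stats query might skip or apply the resourceMap join depending on an `isSystemMetadataQuery` flag. This is not the actual Stats model code: the helper name `buildStatsParams`, the exact filter-query values (`formatType:...`, `-obsoletedBy:*`), and the choice to join `resourceMap` to `resourceMap` are assumptions for illustration. Only the `{!join from=... to=...}` local-params syntax is standard Solr.

```javascript
// Hypothetical helper, NOT MetacatUI's real StatsModel implementation.
// Builds the request parameters for one stats query (metadata or data).
function buildStatsParams({ baseQuery, formatType, isSystemMetadataQuery }) {
  // If the base query only uses system metadata fields, it can be applied to
  // the objects directly. Otherwise (assumption) the matching documents are
  // joined to the other members of their resource maps.
  const q = isSystemMetadataQuery
    ? baseQuery
    : `{!join from=resourceMap to=resourceMap}${baseQuery}`;

  const params = new URLSearchParams();
  params.set("q", q);
  // Filter queries on the two fields named in the commit message.
  // The exact values used here are assumed.
  params.append("fq", `formatType:${formatType}`);
  params.append("fq", "-obsoletedBy:*");
  params.set("rows", "0"); // only aggregate results are needed, not documents
  params.set("wt", "json");
  return params;
}
```

With a helper like this, the view would need one call with `formatType: "METADATA"` and one with `formatType: "DATA"`, which matches the two-query design described in this thread.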
This work is now complete. The StatsView now sends just two queries to Solr: one for the metadata stats and one for the data stats. When the StatsView is used for queries based on science metadata filters, it uses a Solr …
Example: https://knb.ecoinformatics.org/portals/sasap
MetacatUI constructs a query to send to Solr for the various metrics (e.g. total volume of data), which contains thousands of `OR` boolean clauses, one for each data pid in the portal. This causes a 500 server error because there are too many clauses. We need to find a way to reconstruct this query so it doesn't have to search by pid. With Metacat 2.13.0 upgrading to Solr 7, we may be able to take advantage of new query features (e.g. joins).
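As a rough illustration of the failure mode described above, the sketch below builds a per-pid query and checks the number of clauses against a limit. It assumes the pid is stored in a field named `id` and that the server uses Solr's stock `maxBooleanClauses` value of 1024; the actual setting on the affected deployment is not stated in this issue. The function names are hypothetical.

```javascript
// Stock value of maxBooleanClauses in Solr's default solrconfig.xml.
// The real deployment's setting may differ (assumption).
const DEFAULT_MAX_BOOLEAN_CLAUSES = 1024;

// Build the per-pid form of the query: id:"pid1" OR id:"pid2" OR ...
// Quotes and backslashes inside a pid are escaped so each phrase stays intact.
function buildPidQuery(pids) {
  return pids
    .map((pid) => `id:"${pid.replace(/(["\\])/g, "\\$1")}"`)
    .join(" OR ");
}

// One boolean clause is generated per pid, so the clause count is the pid count.
function exceedsClauseLimit(pids, limit = DEFAULT_MAX_BOOLEAN_CLAUSES) {
  return pids.length > limit;
}

// Example: a portal with several thousand data objects (made-up pids).
const examplePids = Array.from({ length: 3000 }, (_, i) => `urn:uuid:example-${i}`);
if (exceedsClauseLimit(examplePids)) {
  console.log(`Query would contain ${examplePids.length} OR clauses; Solr may reject it.`);
}
```

The query grows linearly with the size of the collection, which is why any per-pid approach eventually fails for large portals regardless of the exact limit.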