Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

PaperTrail adapter: poor version load performance with lots of versions #2808

Open
vergenzt opened this issue Jan 10, 2017 · 4 comments
Open

Comments

@vergenzt
Copy link

vergenzt commented Jan 10, 2017

TL;DR: Loading a version with the PaperTrail adapter incurs a COUNT(*) FROM versions WHERE item_type = 'Foo', which is slow as hell (> 30s) when there are lots of versions. Can we rewrite to avoid this?

I'd love to help fix, but could use some pointers. Any ideas? Are we allowed to just not pass Kaminari's paginated array a total_count? Can we pass it lazily somehow, so it only gets computed if someone calls total_count on the proxy array?


As our application's number of versions has grown (we're at ~3 million versions now), performance for the history tab has become abysmal, and is at the point that a single request can often take multiple minutes--which browsers usually time out, meaning we're simply unable to load that page.

To investigate, I looked at what queries the server makes when I click "View changes" for a specific version in the history tab. Here's an example:

SELECT COUNT(*) FROM "versions" WHERE "versions"."item_type" = 'Foo';
-- takes 26.6s

SELECT  "versions".* FROM "versions" WHERE "versions"."item_type" = 'Foo' ORDER BY created_at DESC LIMIT 20 OFFSET 0;
-- takes 24.3s with default settings
-- (was able to reduce to 0.3ms by indexing created_at)

SELECT  "versions".* FROM "versions" WHERE "versions"."id" = '2629542' LIMIT 1;
-- takes 71ms

SELECT "versions"."id" FROM "versions" WHERE "versions"."item_type" = 'Foo' AND "versions"."item_id" = '299882'  ORDER BY "versions"."id" ASC;
-- takes 98ms

The main unimproveable culprit is the SELECT COUNT(*) FROM "versions" ... query. Even with an index on item_type this query still takes a very long time since counting in Postgres always requires a full scan of the rows to be counted.

Tracing back to the source, this query appears to be caused by lib/rails_admin/extensions/paper_trail/auditing_adapter.rb, line 109:

def listing_for_model_or_object(model, object, ...)
  # ...

  versions = ... # an ActiveRecord::Relation
  paginated_proxies = Kaminari.paginate_array([], total_count: versions.try(:total_count) || versions.count)

  # ...
end

... which was introduced by #2054.

@vergenzt
Copy link
Author

@vlobanov any ideas as the author of the original PR (#2054)? I'm not sure I understand the context of why the versions array needed to be paginated.

@vlobanov
Copy link

It's been quite a while since the PR, but as far as I remember thing is that original implementation was using plain array:

versions.collect { |version| VersionProxy.new(version, @user_class) }

so pagination links were not working properly and you could not paginate versions at all, thus usage of Kaminari.paginate_array, and it requires count.
You could maybe monkey-patch the code to use arbitrary big number instead of count, sorry can't think of anything better

@vergenzt
Copy link
Author

vergenzt commented Feb 9, 2017

For what it's worth, I've introduced a tiny hacky workaround in my codebase to improve performance. Since Kaminari is only used by Rails Admin in my codebase, I've monkey patched the total_count method to truncate the count to at most 5000:

module Kaminari
  module ActiveRecordRelationMethods
    def total_count(column_name = :all, _options = nil)
      limit(5000).count(column_name)
    end
  end
end

This makes the result exact if the true count is <= 5000, but returns 5000 if the true count is > 5000. This takes way less time than counting all 3 million records, but still lets people scroll through the first handful of pages.

@vergenzt
Copy link
Author

FYI I added some related performance tips to the wiki: https://github.com/sferik/rails_admin/wiki/Performance

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants