Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Perf: refactor DistributionsByCountry#report using raw SQL query #4841

Open
wants to merge 5 commits into
base: main
Choose a base branch
from

Conversation

coalest
Copy link
Collaborator

@coalest coalest commented Dec 10, 2024

Doesn't resolve any issue.

Description

One of the pages that was reported to be slow in production is the distributions_by_country#report page.

I tried out two approaches:

  1. Fixing the N+1's found in the controller action.
  2. Rewriting the service logic in raw SQL.

Approach number 2 seems faster, but potentially less maintainable. Approach number 1 still helps dramatically though.

Request times mentioned below are just for comparison purposes (I still just have the seed data locally).

Type of change

  • Internal refactor (behavior should be identical for users)

How Has This Been Tested?

  • Transferred the old service specs to be query specs.
  • Tested locally

Screenshots

Before any changes (~280 ms):

Screenshot from 2024-12-10 17-24-20

After fixing the N+1's using by adding associations to includes (~120 ms):

Screenshot from 2024-12-10 17-24-09

View diff of what fixing N+1's look like
diff --git a/app/controllers/distributions_by_county_controller.rb b/app/controllers/distributions_by_county_controller.rb
index 715dc725..28ab4fe5 100644
--- a/app/controllers/distributions_by_county_controller.rb
+++ b/app/controllers/distributions_by_county_controller.rb
@@ -4,7 +4,11 @@ class DistributionsByCountyController < ApplicationController

 def report
   setup_date_range_picker
-    distributions = current_organization.distributions.includes(:partner).during(helpers.selected_range)
+    distributions = current_organization
+      .distributions
+      .includes(partner: { profile: {served_areas: [:county] }})
+      .includes(line_items: [:item])
+      .during(helpers.selected_range)
   @breakdown = DistributionByCountyReportService.new.get_breakdown(distributions)
 end
end

After rewriting with raw SQL (~60 ms):

Screenshot from 2024-12-10 17-24-34

@coalest coalest requested a review from dorner December 10, 2024 16:45
@cielf
Copy link
Collaborator

cielf commented Dec 10, 2024

i should review the functionality once any changes re approach are made. (No point until then)

@coalest coalest force-pushed the improve-distributions-by-country-report-performance branch from 226d9c9 to 245a805 Compare December 10, 2024 19:56
Copy link
Collaborator

@dorner dorner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Love the approach overall!

@@ -0,0 +1,78 @@
class DistributionSummaryByCountyQuery
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a reason this needs to be an instance instead of just having a class function? It's easier to test, stub and work with.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also can we return a data class instead of a hash?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a reason this needs to be an instance instead of just having a class function? It's easier to test, stub and work with.

No real reason, current queries in this folder vary (sometimes call is an instance method sometimes a class method).

Changed it to be a class method.

Copy link
Collaborator Author

@coalest coalest Dec 11, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also can we return a data class instead of a hash?

Fixed.

@coalest coalest requested a review from dorner December 11, 2024 12:20
Copy link
Collaborator

@dorner dorner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall looks good to me, although I'm a bit concerned at the test changes. Performance-only fixes shouldn't need to change tests - by definition the way you know it didn't break anything is when the exact same tests pass as before. Can you explain why they were changed?

@coalest
Copy link
Collaborator Author

coalest commented Dec 13, 2024

I didn't touch the related system or request specs for this report/page. But since I moved the core logic from a service to a query object, so I figured some refactoring of the test was ok. I left the actual assertions the same, but I refactored the setup.

The old setup was including setup from elsewhere (include_examples "distribution_by_county"), and it creates some records which weren't needed (user/org_admins). I prefer writing specs where you can see all of the setup in the file rather than DRYing it up which tends to complicate things.

@dorner
Copy link
Collaborator

dorner commented Dec 14, 2024

@coalest I'd really appreciate it if for this PR you could leave the tests as is so we can validate the success of the goal of this PR, which is improving performance without changing anything else.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants