Admin report for domains which should have data source restriction FF enabled #34813

Charl1996 · 2024-06-25T08:02:53Z

Product Description

A new admin report is added to allow us to identify domains which have large amounts of data (forms or cases) which might cause data source rebuilds to be very slow. We use a heuristic (number of cases/form > 1,000,000) to determine whether or not those domains should have the RESTRICT_DATA_SOURCE_REBUILD feature flag enabled.

The report looks like this:

The column, UCR rebuild restriction status (feel free to suggest other names), shows the user whether the domain

is currently being restricted from rebuilding data sources (i.e. have the FF enabled)
is currently not being restricted from rebuilding UCRs (i.e. does not the FF enabled)
should be restricted from rebuilding data sources (i.e. should have the FF enabled)
should not be restricted (i.e. should have the FF disabled; although I don't think well necessarily have a case like this)

When the FF should be enabled or disabled, the column will have a link (see image above) to the FF page.

In addition, a weekly email is sent to a particular admin email (soltech AEs) using this report's data to identify domains which we should look at. The email links to this report.

Technical Summary

Ticket

I decided to add a separate class, UCRRebuildRestrictionTable, handling the table feeding the UCRDataLoadReport report. This was decided so that we can easily have access to the report data when sending the email (e.g. specifying in the email the number of identified domains based on a certain filter) without having to create a mock request object as is needed when initializing a report class.

Feature Flag

No particular feature flag

Safety Assurance

Safety story

Tested locally. Added some tests

Automated test coverage

Added automated tests

QA Plan

No QA from QA team, but will have AE's test it.

Rollback instructions

This PR can be reverted after deploy with no further considerations

Labels & Review

Risk label is set correctly
The set of people pinged as reviewers is appropriate for the level of risk of the change

Charl1996 · 2024-06-25T08:04:21Z

corehq/apps/hqwebapp/tasks.py

+        restriction_ff_status=UCRRestrictionFFStatus.ShouldEnable.name,
+    )
+    subject = "Weekly report: projects for ucr restriction FF"
+    recipient = "solutions-tech-app-engineers@dimagi.com"


Now that I review this, this should probably be an environment variable.

Do we already have this email? if not, might as well use the soltech email we used to report issues to soltech only.

mkangia

Nice PR @Charl1996 , well drafted.

Just wanted to check about pagination on this report. I see we have added 4 options for filter status that we are using to choose to show domains. Is that to work with the pagination or the pagination is not supported on this report?

If you are going to use pagination, Just noting that any filter on the report should be applied when fetching the records to show, then paginated and then they should not be reduced.

A simple code improvement could be do order methods in order of use, the Newspaper metaphor of clean code.

mkangia · 2024-06-25T09:05:27Z

corehq/apps/hqadmin/reports.py

+    restriction_ff_status: str
+
+    def __init__(self, *args, **kwargs):
+        self.selected_domain = kwargs.get('selected_domain')


you can select only one domain?

As of now, yes.

I had a chat with AE and the domain filter is redundant.

@Charl1996 Could you please update the screenshot in the PR description to show the updated report filters since the domain filter has now been removed?

Yes! Let me do that now

mkangia · 2024-06-25T09:08:32Z

corehq/apps/hqadmin/reports.py

+
+        for domain in self.ucr_domains():
+            case_count = CaseES().domain(domain).count()
+            form_count = FormES().domain(domain).count()


I am worried about the number of queries this would fire up. Lets do this in one go?
Should be straight forward to add a new filter that does that like we have for case_ids filters.term('_id', case_ids).

Cool, will look into this a bit.

mkangia · 2024-06-25T09:10:26Z

corehq/apps/hqadmin/reports.py

+
+    @staticmethod
+    def _rebuild_restricted_ff_enabled(domain):
+        return RESTRICT_DATA_SOURCE_REBUILD.enabled(domain)


Can we memoize RESTRICT_DATA_SOURCE_REBUILD.get_enabled_domains() so we do this only once?

mkangia · 2024-06-25T09:17:23Z

corehq/apps/hqwebapp/tasks.py

+        restriction_ff_status=UCRRestrictionFFStatus.ShouldEnable.name,
+    )
+    subject = "Weekly report: projects for ucr restriction FF"
+    recipient = "solutions-tech-app-engineers@dimagi.com"


Do we already have this email? if not, might as well use the soltech email we used to report issues to soltech only.

mkangia · 2024-06-25T09:19:45Z

corehq/apps/hqwebapp/tasks.py

+
+    filter_name = UCRRebuildStatusFilter.slug
+    filter_value = UCRRestrictionFFStatus.ShouldEnable.name
+    report_url = f"{protocol}://{host}{endpoint}?{filter_name}={filter_value}"


does not reverse already handle this for you? Is the function get_url_base doing what you need?

get_url_base does exactly what I need! Thanks!

Charl1996 · 2024-06-27T11:07:23Z

@mkangia

Do we already have this email? if not, might as well use the soltech email we used to report issues to soltech only.

I see that goes to solutions-feedback@dimagi.com

commcare-hq/corehq/apps/hqwebapp/views.py

Line 890 in f838c3e

return 'solutions-feedback@dimagi.com'

As I don't think it would make sense to send this report to the feedback email I think it's probably fine as is?

Charl1996 · 2024-06-27T13:32:35Z

Please note that I force pushed due to lots of small updates baked into existing commits.

mkangia · 2024-06-27T16:13:50Z

As I don't think it would make sense to send this report to the feedback email I think it's probably fine as is?

Oh ya, feedback email should not be used.
Though, instead of solutions-tech-app-engineers@dimagi.com we could use soltech-app-engineers@dimagi.com.

mkangia · 2024-06-27T16:19:27Z

@Charl1996 please let me know when its ready for review again. I believe you would either respond to the pagination query I had or update code accordingly.

Charl1996 · 2024-06-27T17:24:00Z

@mkangia

@Charl1996 please let me know when its ready for review again. I believe you would either respond to the pagination query I had or update code accordingly.

Ah, I forgot to push the "re-request review" button. It's ready for review. Thanks for checking in.

ajeety4 · 2024-06-28T07:40:09Z

corehq/apps/hqadmin/reports.py

+            self.restrict_rebuild_column_data(domain, case_count, form_count),
+        ]
+
+    def restrict_rebuild_column_data(self, domain, case_count, form_count):


nit: rename to should_restrict_rebuild_column_data

Actually, now that I think about it, how does _ucr_rebuild_restriction_status_column_data sound? It's more in line with the column name.

Sounds much better 👍

mkangia · 2024-06-28T15:40:29Z

corehq/apps/hqadmin/reports.py

+    def rows(self):
+        start = self.pagination.start
+        end = self.pagination.start + self.pagination.count
+        return self.table_data.rows[start:end]


does this mean that we are creating all rows on each page and then picking the ones for the page?
Don't think we should be doing that.

Honestly, I don't think we need pagination on this report since we are only fetching domains with UCR feature flag enabled which should not be many.

@mkangia

does this mean that we are creating all rows on each page and then picking the ones for the page?

Correct. I do feel slightly uncomfortable about that, but I'm not sure how else I could achieve pagination, considering that I need to pull the domains first and then check if the domain applies based on the filter value. In this sense I need to construct all the applicable rows, but then paginate for the UI.

Note: I did check with an AE on whether the filters are necessary in the first place (since they seem kinda nice-to-have) but it was confirmed that they would be very helpful.

I don't think we need pagination on this report since we are only fetching domains with UCR feature flag enabled which should not be many.

The domains on prod having this FF is already quite long.

The domains on prod having this FF is already quite long.

Good call on checking that, it is a big list. And that is the reason the current approach won't work, because it will fetch data for all of them and then do the pagination. So, on production, this would have performance issues.

I think the choice here is to

get rid of filters for status (you could add them as a column instead so that AE still knows it but can't filter by it)

get rid of pagination

one small correction here / correct me if I am wrong:

The domains are iterated and a subset of domains are selected based on the filters (this is something that needs to happen on the application level as there's various things to check)

The domains are iterated and data (case and form count) is fetched for all of them one at a time.
For each domain, It is after the data fetch, we decide if we want to list the domain or not based on filters.
And then once we have the final list of valid domains based on filters, we then paginate.

I agree that performance implication isn't noticeable on staging and then possibly would be same on production.
But I would not consider that as a validation to go with this approach.

If you look at the code, it is like, if these were SQL records, you are fetching all records from DB and then paginating using a list, instead of asking SQL to give you a set of records. And its reasonable because the table is small?
So, I believe, irrespective of the performance implication, you wont go with the above approach to paginate the list instead of via SQL.

The program is fetching and processing all domains for each page anyway, so if we just skip the pagination, the page itself won't become any fast for any single page, but the user won't do multiple requests (different pages) anymore.
So, if you wish to keep the filters, we should skip pagination. Initially, we noticed that the number of UCRs is high, but then the program is going through all of them anyway even now.

The domains are iterated and data (case and form count) is fetched for all of them one at a time.

This is not true anymore. It was initially the way I did it by changed it here. Now all the domains' case data is fetched in one query to ES and the forms data is fetched in another query to ES.

The program is fetching and processing all domains for each page anyway, so if we just skip the pagination, the page itself won't become any fast for any single page, but the user won't do multiple requests (different pages) anymore.

This is true. I see your point. Yeah, so if we skip pagination and keep the filters I can now see how that would be an improvement to the current approach. I'll update

Reverted pagination here.

I noticed that the page still looks paginated (maybe this is the standard look) but all records are loaded in the table, since there's no data loading going on when I switch between pages.

Do you need to set disable_pagination=True on the class?

This reverts commit 72e3af4.

corehq/apps/hqadmin/reports.py

zandre-eng · 2024-07-02T11:27:04Z

@Charl1996 Thinking through this report, I'm wondering if there would be any value in having an action column as well? This could be even only be enabled for relevant rows that have a "Should Disable" or "Should Enable". Clicking on the button in the new column would then make an Ajax call to the back-end and add/remove the domain from the feature flag.

This feels more like a nice-to-have, but it could make things a bit easier by not having to go and manually add/remove domains under the feature flags page.

Charl1996 · 2024-07-02T13:30:28Z

@zandre-eng Currently I do something like that, but not as explicitly as an action column. What currently happens is that when the status is "should disable" or "should enable" a link to the FF edit page is included for quick access. See here.

zandre-eng · 2024-07-02T14:02:57Z

Currently I do something like that, but not as explicitly as an action column. What currently happens is that when the status is "should disable" or "should enable" a link to the FF edit page is included for quick access. See here.

@Charl1996 Ah, that's perfect. Thanks for the explanation.

mkangia · 2024-07-02T16:24:32Z

a link to the FF edit page is included for quick access

I like that approach.

mkangia

Good to go once we are able to remove the pagination entirely.

Charl1996 · 2024-07-03T10:30:38Z

@mkangia

Good to go once we are able to remove the pagination entirely.

You mean from the UI?

Update:
I've been trying to do this by setting ajax_pagination and disable_pagination settings on the report, which hides the pagination spec here, however, it doesn't seem to work as the pagination still shows up.

mkangia · 2024-07-03T13:46:08Z

You mean from the UI?

@Charl1996 yes. If there is no pagination there should be no pagination on UI.

I've been trying to do this by setting ajax_pagination and disable_pagination settings on the report, which hides the pagination spec here, however, it doesn't seem to work as the pagination still shows up.

Spent a couple of mins and I could not get this to work either. let me know if you get blocked on this, I will give this a better try.

Charl1996 · 2024-07-16T10:10:29Z

@mkangia I'm having no success so far with removing the pagination, but I'm wondering if we should not complete that part in another PR so as to unblock this one. Thoughts?

mkangia · 2024-07-16T11:02:24Z

Hey @Charl1996

Got behind this this time, and looks like what we need to set is use_datatables = False to get rid of pagination.

I tried show_all_rows, disable_pagination and ajax_pagination first but pagination still showed up. Even though, new request was not going if i switched to page 2. Not sure why that was? I had 16 records and page size was 10.
I guess it still loads all data and pagination is not via ajax. Not sure where that decision happens.

Charl1996 · 2024-07-16T11:05:57Z

@mkangia

looks like what we need to set is use_datatables = False to get rid of pagination

Nice! Thanks!

mkangia · 2024-07-16T16:11:45Z

corehq/apps/hqwebapp/tasks.py

@@ -259,7 +259,7 @@ def clear_expired_oauth_tokens():
    call_command('cleartokens')


-@periodic_task(run_every=crontab(day_of_week=1))
+@periodic_task(run_every=crontab(minute=0, hour=1, day_of_week='mon'))


did the day_of_week=1 sent the email at a wrong time?

It seems like it. Apparently there was emails sent every hour or so, but this new configuration seems to have fixed it. It's weird.

mkangia · 2024-07-16T16:12:34Z

corehq/apps/hqadmin/reports.py

-    default_rows = 10
+    disable_pagination = True
+    ajax_pagination = False
+    use_datatables = False


do we need all these? I thought use_datatables was all that was needed?

@mkangia

I don't think they're technically necessary, but I thought of keeping it there to make it obvious to the developer that disabling pagination is the goal, since user_datatables doesn't really give me an indication that I'm trying to get rid of pagination. I also thought it's maybe a good idea to be explicit by setting the values to their current state.

Let me know if you think I should remove it or if it's fine to keep it.

Okay 👍

I am fine with leaving it how you have done it.

I believe the other options make pagination static, i.e all records are loaded on the UI but there is still pagination shown so there is no ajax request to fetch more records but the user still sees pages. I could see that behavior locally.

I thought of keeping it there to make it obvious to the developer that disabling pagination is the goal

I like that. 👍

Charl1996 added the product/admin Change affects admin pages only visible to super users / staff label Jun 25, 2024

Charl1996 requested review from kaapstorm, mkangia, zandre-eng and ajeety4 June 25, 2024 08:02

Charl1996 commented Jun 25, 2024

View reviewed changes

Charl1996 marked this pull request as ready for review June 25, 2024 08:08

Charl1996 requested review from orangejenny and biyeun as code owners June 25, 2024 08:08

mkangia reviewed Jun 25, 2024

View reviewed changes

Add urc data report

41c2f1a

Charl1996 added 7 commits June 27, 2024 15:19

Add ability to filter report by FF status

ada1b36

Add table data source outside of report class

190931a

Add tests for UCRRebuildRestrictionTable

f5828b2

Remove domain filter

4a7658c

Send admin email every Monday

bcd11c9

Add pagination

72e3af4

Pull all domains' forms and cases count from ES in one go

76f9c74

Charl1996 force-pushed the cs/SC-3662-ucr-load-report branch from 3869669 to 76f9c74 Compare June 27, 2024 13:31

Charl1996 requested a review from mkangia June 27, 2024 17:24

ajeety4 reviewed Jun 28, 2024

View reviewed changes

mkangia reviewed Jun 28, 2024

View reviewed changes

Charl1996 added 2 commits July 1, 2024 10:37

Rename method

0c75e90

Revert "Add pagination"

7860d98

This reverts commit 72e3af4.

zandre-eng reviewed Jul 2, 2024

View reviewed changes

corehq/apps/hqadmin/reports.py Show resolved Hide resolved

Memoize ucr_domains

bc15ccf

zandre-eng approved these changes Jul 2, 2024

View reviewed changes

mkangia approved these changes Jul 2, 2024

View reviewed changes

Update ucr email frequency

22a9297

Disable use of datatable

df2b7eb

Charl1996 requested a review from mkangia July 16, 2024 13:13

mkangia reviewed Jul 16, 2024

View reviewed changes

mkangia approved these changes Jul 17, 2024

View reviewed changes

Charl1996 merged commit e4c68a2 into master Jul 17, 2024
13 checks passed

kaapstorm deleted the cs/SC-3662-ucr-load-report branch September 13, 2024 20:50

Admin report for domains which should have data source restriction FF enabled #34813

Admin report for domains which should have data source restriction FF enabled #34813

Conversation

Charl1996 commented Jun 25, 2024 • edited Loading

Product Description

Technical Summary

Feature Flag

Safety Assurance

Safety story

Automated test coverage

QA Plan

Rollback instructions

Labels & Review

Choose a reason for hiding this comment

Choose a reason for hiding this comment

mkangia left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Charl1996 commented Jun 27, 2024

Charl1996 commented Jun 27, 2024

mkangia commented Jun 27, 2024

mkangia commented Jun 27, 2024

Charl1996 commented Jun 27, 2024

ajeety4 Jun 28, 2024 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

zandre-eng commented Jul 2, 2024

Charl1996 commented Jul 2, 2024

zandre-eng commented Jul 2, 2024

mkangia commented Jul 2, 2024

mkangia left a comment

Choose a reason for hiding this comment

Charl1996 commented Jul 3, 2024 • edited Loading

mkangia commented Jul 3, 2024

Charl1996 commented Jul 16, 2024

mkangia commented Jul 16, 2024

Charl1996 commented Jul 16, 2024

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Charl1996 commented Jun 25, 2024 •

edited

Loading

ajeety4 Jun 28, 2024 •

edited

Loading

Charl1996 commented Jul 3, 2024 •

edited

Loading