-
-
Notifications
You must be signed in to change notification settings - Fork 218
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add timing metrics for case seach #34373
Conversation
cf0358b
to
fcfec6d
Compare
fcfec6d
to
7879d57
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I focused more on the non-admin page related changes. But this is super cool! Just one question
return results | ||
|
||
|
||
@time_function() | ||
def get_related_cases_result(helper, app, case_types, source_cases, include_all_related_cases): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This could either result in _get_all_related_cases
OR _get_search_detail_path_defined_cases
and _get_child_cases_referenced_in_app
. Would it also be helpful to get the label the timing context for those parts?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point - those can all be given the decorator, so I might as well, huh? Just pushed a commit adding a few more.
I haven't looked at the code but I definitely support improving our visibility into core platform components like case search. I can imagine this being generally useful in future, even if we aren't specifically focusing on case search performance, though it would be more useful if app builders could somehow get access to this info (like having a debug mode in web apps or something) - but that's getting a bit off topic. One thing I notice from the screenshot is that there's no indication of how many times a function was called. For queries that use subcase / ancestor functions it would be useful to have that split out of get some insight into it. I also wonder if being able to see the actual ES queries that were executed would ever be useful. Just some thought.... |
self.searchButtonIcon = ko.observable("fa fa-search"); | ||
self.exampleRequestDict = JSON.stringify({ | ||
"request_dict": { | ||
"_xpath_query": ["name: 'Ethan'", "subcase-exists(parent, name='Matilda')"], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
❤️
</div> | ||
</div> | ||
{% if query_additions %} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just noting that references to this were removed here: #28278
</div> | ||
<button class="btn btn-default" id="add-xpath" data-bind="click: addXPath"> | ||
<i class="fa fa-plus"></i> {% trans "Add XPath" %} | ||
</button> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These changes seem unrelated to the profiling, are they just a bonus?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah - I was initially hoping to reuse the existing case search tool (which does profile ES queries), so I added this to be able to specify the types of queries we commonly see (with more than one CSQL expression). Ended up going in a different direction with the new page though once I realized the changes I needed were too fundamental to work here.
return { | ||
key: (value if isinstance(value, list) else [value]) | ||
for key, value in params.items() if value is not None | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do none of the uses rely on any of the methods of MultiValueDict
e.g. getlist
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nope, it's only used in one place, which converts it to a normal dict. I moved that coercion up the stack to make it easier to initialize in other contexts.
commcare-hq/corehq/apps/case_search/models.py
Lines 199 to 200 in ef722b0
def extract_search_request_config(request_dict): | |
params = dict(request_dict.lists()) |
That would be really cool. We could also store the timing information of real requests over some threshold and view it like a report later - that'd be super helpful.
I agree with both of these - I'd like to explore that next, though I think they'll be a little harder, as you need access to the |
Suspect IssuesThis pull request was deployed and Sentry observed the following issues:
Did you find this useful? React with a 👍 or 👎 |
Product Description
Admin-only view to see case search performance breakdown. Works best with the
LongCaseSearchRequest
log in sentryTechnical Summary
This initializes a
TimingContext
object when executing a case search request and keeps track of various sections of the process. Turns out at least in some instances it's not just the ES query that's slow, but also building the query (and executing secondary queries), wrapping cases, serializing cases, fetching an app (seems like that shouldn't be happening here), and enriching the data with related cases. I want to see what this looks like on production, since the circumstances there are pretty different than staging.The timing code is not necessary in normal case search queries, only when initiating through this admin view, but I don't know of a better way to do this, and I can't imagine this having an appreciable performance impact. It does add a bit of boilerplate to the code, which I don't like, and I'd support removing this when we are no longer as focused on case search performance.
Feature Flag
Safety Assurance
Safety story
The parts of this that involve production code is relatively small, and that's the critical part. The bulk of the changeset is admin-only, and I'm not concerned about errors there. This has been on staging for about a week and I've tested it there without issue.
Automated test coverage
QA Plan
Rollback instructions
Labels & Review