[Issue #1365] Create script to setup the current opportunities table #1577
Conversation
Two high-level questions:
- What was the motivation behind creating a task for this? It seems like you should be able to calculate these values just-in-time instead of deriving and saving them ahead of time.
- Is the idea that these would go into a step function (or similar) cron job?
"opportunity_id": opportunity.opportunity_id, | ||
"existing_opportunity_status": opportunity.opportunity_status, | ||
} | ||
log_extra |= get_log_extra_for_summary( |
Yeah, the shorthand for Python dictionary operations is pretty useful and fairly recent.
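For anyone unfamiliar, here's a minimal illustration of the dict merge-assign operator used in the diff above (available since Python 3.9); the keys and values are placeholders:

```python
# Python 3.9+ lets you merge one dict into another in place with |=,
# which reads like assignment instead of a .update() call.
log_extra = {
    "opportunity_id": 123,  # placeholder values, purely for illustration
    "existing_opportunity_status": None,
}
log_extra |= {"new_opportunity_status": "posted"}

print(log_extra)
# {'opportunity_id': 123, 'existing_opportunity_status': None, 'new_opportunity_status': 'posted'}
```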
@@ -0,0 +1,72 @@
import abc |
(non-blocking comment) This is great! I feel like it would be nice-to-have for the copy oracle data task to use this.
Yep, this is a port of an approach I used on some past backend scripts. If whatever replaces the copy-oracle commands is still run via Python scripts, then we can make sure to reuse this for that purpose.
@@ -0,0 +1,520 @@
from dataclasses import dataclass |
⭐⭐ (optional, but seriously consider) I hate to see this given how many tests you wrote, but... it looks like the code coverage for set_current_opportunities_task.py is 97%. Can you check that it's not missing any important logic branches?
Looks like 3 lines:
- The "main" function for the script. I can technically get that to be called.
- A line that technically can't be hit in an if/else, but that makes the logic clearer / makes MyPy happy (a null check).
- A line that I can actually add something for.
I'll fix the first and third and get it to 99%.
Yes, we'll want to set up a cron job for this as well (thinking every 5-10 minutes?). But there's no urgency, as this won't do anything until the transformation process is in place. As for why we do this as a script rather than live, there are a few reasons:
Summary
Fixes #1365
Time to review: 10 mins
Changes proposed
- Created a script that fetches all opportunities, determines what the current opportunity summary + opportunity status should be, and sets them
- Added some miscellaneous backend task utilities for logging / metrics
Context for reviewers
https://app.gitbook.com/o/cFcvhi6d0nlLyH2VzVgn/s/v1V0jIH7mb7Yb3jlNrgk/engineering/learnings/opportunity-endpoint-data-model#calculating-current-summary provides the context for this change.
In the simplest terms, the algorithm is: fetch every opportunity, determine what its current opportunity summary and opportunity status should be, and set them.
Note that there are a few other small pieces in the logic, mostly to handle logging, metrics, and performance. For performance and tracking reasons, if the summary and status already have their desired values, we don't "re-update" them; we just leave them alone. If they don't match, we update them accordingly, which can include removing the current opportunity summary entirely.
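As a rough sketch of that check-before-update behavior (simplified, hypothetical stand-ins for the real models; not the actual task code):

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class OpportunitySummary:  # hypothetical, simplified model
    opportunity_summary_id: int


@dataclass
class Opportunity:  # hypothetical, simplified model
    opportunity_id: int
    opportunity_status: Optional[str] = None
    current_opportunity_summary: Optional[OpportunitySummary] = None


def apply_current_values(
    opportunity: Opportunity,
    desired_summary: Optional[OpportunitySummary],
    desired_status: Optional[str],
) -> bool:
    """Set the derived summary/status, returning True only if anything changed."""
    # If the stored values already match the derived ones, leave the row
    # alone so we don't churn updates (and metrics only count real changes).
    if (
        opportunity.current_opportunity_summary == desired_summary
        and opportunity.opportunity_status == desired_status
    ):
        return False

    # Otherwise update the opportunity; desired_summary may be None, which
    # removes the current opportunity summary entirely.
    opportunity.current_opportunity_summary = desired_summary
    opportunity.opportunity_status = desired_status
    return True
```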
Additional information
There is more work we can do regarding logging and metrics, but this is a pretty solid start that will work well with our eventual integration with New Relic. We log information in two separate ways:
Aggregate metrics are task-level metrics that are largely counts and times: things like "number of opportunities processed" or "duration of process". These are all handled by the new Task class. Anywhere you see self.increment("something"), a counter on the class is updated that is used to track metrics. These are useful for high-level tracking and for spotting trends. Right now, these aggregate metrics get logged once at the end of the process.
Specific metrics are those attached to an individual log message (in this case, pertaining to a specific opportunity). These can be useful for investigating or debugging issues with an opportunity; the log messages allow us to tell exactly what happened to it. Different opportunities hit different scenarios and so end up with different fields attached.
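The field names built up in the diff above suggest how an individual message gets its context attached; a hypothetical illustration using Python's standard logging (the message text and values are made up):

```python
import logging

logger = logging.getLogger(__name__)

# Hypothetical per-opportunity fields; in the real task these are built up
# as the opportunity is processed (see log_extra in the diff above).
log_extra = {
    "opportunity_id": 42,
    "existing_opportunity_status": "posted",
    "new_opportunity_status": "closed",
}

# Attaching the fields via extra= makes them queryable once the logs are
# shipped to something like New Relic.
logger.info("Updated opportunity status", extra=log_extra)
```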