Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make --event-time-start and --event-time-end mutually required #10878

Open
wants to merge 5 commits into
base: main
Choose a base branch
from

Conversation

QMalcolm
Copy link
Contributor

Resolves #10874

Problem

--event-time-start and --event-time-end were not mutually required, which led to some weird/bad behavior if only one was specified

Solution

Make it so that one being specified necessitates the other.

Checklist

  • I have read the contributing guide and understand what's expected of me.
  • I have run this code in development, and it appears to resolve the stated issue.
  • This PR includes tests, or tests are not required or relevant for this PR.
  • This PR has no interface changes (e.g., macros, CLI, logs, JSON artifacts, config files, adapter interface, etc.) or this PR has already received feedback and approval from Product or DX.
  • This PR includes type annotations for new and modified functions.

In the next commit we'll be adding a validation that requires that `--event-time-start`
and `--event-time-end` are mutually required. That is, whenever one is specified,
the other is required. In that world, `--event-time-start` will never need to be compared
against the "current" time, because it'll never be run in conjunction with the "current"
time.
@QMalcolm QMalcolm requested a review from a team as a code owner October 17, 2024 21:57
@cla-bot cla-bot bot added the cla:yes label Oct 17, 2024
@@ -362,17 +361,26 @@ def _validate_event_time_configs(self) -> None:
getattr(self, "EVENT_TIME_END") if hasattr(self, "EVENT_TIME_END") else None
)

if event_time_start is not None and event_time_end is not None:
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's probably easier to see what changed if viewing commit by commit. Git didn't handle displaying the changes made across the 2 commits very gracefully.

Copy link

codecov bot commented Oct 17, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 89.20%. Comparing base (7920b0e) to head (917d8d8).

Additional details and impacted files
@@            Coverage Diff             @@
##             main   #10878      +/-   ##
==========================================
+ Coverage   89.15%   89.20%   +0.04%     
==========================================
  Files         183      183              
  Lines       23443    23441       -2     
==========================================
+ Hits        20901    20910       +9     
+ Misses       2542     2531      -11     
Flag Coverage Δ
integration 86.58% <100.00%> (+0.14%) ⬆️
unit 62.11% <20.00%> (-0.01%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

Components Coverage Δ
Unit Tests 62.11% <20.00%> (-0.01%) ⬇️
Integration Tests 86.58% <100.00%> (+0.14%) ⬆️

…art/end` reqs

We made it such that when `event_time_start` is specified, `event_time_end` must also
be specified (and vice versa). This broke numerous tests, in a few different ways:

1. There were tests that used `--event-time-start` without `--event-time-end` butg
were using event_time_start essentially as the `begin` time for models being initially
built or full refreshed. These tests could simply drop the `--event-time-start` and
instead rely on the `begin` value.

2. There was a test  that was trying to load a subset of the data _excluding_ some
data which would be captured by using `begin`. In this test we added an appropriate
`--event-time-end` as the `--event-time-start` was necessary to statisfy what the
test was testing

3. There was a test which was trying to ensure that two microbatch models would be
given the same "current" time. Because we wanted to ensure the "current" time code
path was used, we couldn't add `--event-time-end` to resolve the problem, thus we
needed to remove the `--event-time-start` that was being used. However, this led to
the test being incredibly slow. This was resolved by switching the relevant microbatch
models from having `batch_size`s of `day` to instead have `year`. This solution should
be good enough for roughly ~40 years? We'll figure out a better solution then, so see ya
in 2064. Assuming I haven't died before my 70th birthday, feel free to ping me to get
this taken care of.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[Feature] Make --event-end-time require --event-start-time and vice versa
1 participant