Speed up Datastore deletes by batch deletions with multithreading #2182
Signed-off-by: Pamela Toman <ptoman@paloaltonetworks.com>
Hi @ptoman-pa. Thanks for your PR. I'm waiting for a feast-dev member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test. Once the patch is verified, the new status will be reflected by the ok-to-test label. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
@@ -32,6 +35,9 @@
from feast.protos.feast.types.Value_pb2 import Value as ValueProto
from feast.repo_config import FeastConfigBaseModel, RepoConfig
from feast.usage import log_exceptions_and_usage, tracing_span
from feast.utils.generic_utils import AtomicCounter
We've generally tried to avoid adding utils modules since it's a kitchen sink/code smell. If this AtomicCounter is only used in this datastore class, maybe we can just define it here? Or rename feast.utils.generic_utils to feast.atomic or something?
I'm actually glad to hear this :) I wanted to use the patterns of the codebase. I'm going to move it into an inner class just for _delete_all_values, since it's only used for that logging statement (or we could remove the logging), and its scope can always be widened later if needed.
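For readers following along, here is a minimal sketch of what such a lock-guarded counter might look like. The `lock`, `value`, and `increment` names match the snippets quoted in this review; the constructor and the `count_in_parallel` helper are assumptions added for illustration, not code from the PR.

```python
import threading


class AtomicCounter:
    """Integer counter whose increments are safe across threads."""

    def __init__(self, start: int = 0) -> None:
        # Constructor shape is an assumption; the review only shows increment().
        self.value = start
        self.lock = threading.Lock()

    def increment(self) -> None:
        # The lock turns the read-modify-write of `value` into one atomic step.
        with self.lock:
            self.value += 1


def count_in_parallel(num_threads: int, per_thread: int) -> int:
    """Hypothetical helper: increment one shared counter from many threads."""
    counter = AtomicCounter()

    def work() -> None:
        for _ in range(per_thread):
            counter.increment()

    threads = [threading.Thread(target=work) for _ in range(num_threads)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return counter.value
```

Because every increment holds the lock, the final value equals the total number of `increment()` calls regardless of how the threads interleave.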
while True:
    client.delete_multi(deletion_queue.get())
    shared_counter.increment()
    LOGGER.info(
This should maybe be .debug?
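As a rough illustration of the work-queue pattern in the worker loop quoted above, the sketch below drains a queue of key batches with several daemon threads. `FakeClient` is a hypothetical stand-in that only records calls (a real Datastore client would issue network requests), and `delete_batches`, `num_threads`, and the error handling are assumptions, not the PR's actual implementation.

```python
import queue
import threading


class FakeClient:
    """Hypothetical stand-in for a Datastore client; it only records keys."""

    def __init__(self) -> None:
        self.deleted = []
        self._lock = threading.Lock()

    def delete_multi(self, keys) -> None:
        with self._lock:
            self.deleted.extend(keys)


def delete_batches(client, batches, num_threads: int = 4) -> None:
    """Send each batch to client.delete_multi using a pool of worker threads."""
    deletion_queue: queue.Queue = queue.Queue()
    errors = []

    def worker() -> None:
        while True:
            batch = deletion_queue.get()
            try:
                client.delete_multi(batch)
            except Exception as exc:
                # Record the failure but keep the worker alive for later batches.
                errors.append(exc)
            finally:
                # Always mark the item done so deletion_queue.join() can return.
                deletion_queue.task_done()

    # Daemon threads exit with the process, so the infinite loop needs no sentinel.
    for _ in range(num_threads):
        threading.Thread(target=worker, daemon=True).start()

    for batch in batches:
        deletion_queue.put(batch)

    # Block until every queued batch has been processed.
    deletion_queue.join()
    if errors:
        raise errors[0]
```

Calling `task_done()` in a `finally` block is a design choice made here so that a failed batch cannot leave `join()` waiting forever.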
Codecov Report
@@            Coverage Diff             @@
##           master    #2182      +/-   ##
==========================================
+ Coverage   84.59%   84.64%   +0.04%
==========================================
  Files         102      102
  Lines        8186     8230      +44
==========================================
+ Hits         6925     6966      +41
- Misses       1261     1264       +3
9679440 to 67c64cf
with self.lock:
    self.value += 1

BATCH_SIZE = 500  # Dec 2021: delete_multi has a max size of 500
Nit: is there a doc we can link to in case this changes in the future?
I just went looking and couldn't find an obvious place. If we try to query with more than 500, the server responds:
google.api_core.exceptions.InvalidArgument: 400 cannot write more than 500 entities in a single call
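To stay under that limit, keys can be split into groups of at most 500 before each call. The sketch below is one way to do that; `chunked` is a hypothetical helper name, and the value 500 is taken from the error message above rather than from any linked documentation.

```python
from typing import Iterator, List, Sequence, TypeVar

T = TypeVar("T")

# Limit taken from the server error quoted above; it may change in the future.
BATCH_SIZE = 500


def chunked(items: Sequence[T], size: int = BATCH_SIZE) -> Iterator[List[T]]:
    """Yield consecutive slices of `items`, each holding at most `size` elements."""
    if size < 1:
        raise ValueError("size must be a positive integer")
    for start in range(0, len(items), size):
        yield list(items[start:start + size])
```

Each yielded list could then be passed to a single `delete_multi` call, so no request exceeds the 500-entity cap.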
/ok-to-test
/lgtm
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: achals, ptoman-pa
What this PR does / why we need it:
Speeds up teardown in Datastore by 3-4 orders of magnitude via Datastore's delete_multi plus multithreading with a work queue. (Previous approach was one-request-per-element single-threaded deletions.)
Which issue(s) this PR fixes:
No issue
Does this PR introduce a user-facing change?: