-
-
Notifications
You must be signed in to change notification settings - Fork 17
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Improve user in-guild sync process #165
Conversation
05e7053
to
b8a6198
Compare
Shouldn't the condition be `if user.id _not_ in guild_member_ids`, since
`guild_member_ids` contains the users that are on the guild? From my
understanding this currently sets `in_guild=False` for all users that
are on the guild (and found in the queryset).
Also, nitpicking but it would be nice to remove the space before `%d` in
the `log.info` line.
|
c298d9a
to
e47b967
Compare
Previously we set all users in_guild to False, and relied on users being set back to in_guild when iterating through guild.members However, this caused two problems 1. For a short window a users in_guild status was incorrect 2. It required an update for all users in_guild to be sent to postgres to update in_guild back to True. This diff changes that, so instead only users who are not found in the guild have in_guild set to False. The bottleneck for this query is the number of users that are currently in_guild=False. Testing locally, with 360k users off guild, this took 7.4s to query out, and 0.5s to process & 15.1 s to commit.
e47b967
to
c2fe170
Compare
Yeah, should have marked this as draft as there were a few more things I wanted to do. I've pushed some new commits and updated the PR description with perf stats. |
This means the other, larger, columns do not need to be deserialised, bringing the query time down from 7.4s down to 3.5s. I couldn't simply do select(models.User.id) here, as we need the full User object in order to mutate and update it later inthe process.
c2fe170
to
db22e38
Compare
Thanks, this looks good now!
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Chris, this is fantastic, and changes everything. Thank you for everything you do.
Closes #157
Previously we set all users in_guild to False, and relied on users being set back to in_guild when iterating through guild.members
However, this caused two problems
This diff changes that, so instead only users who are not found in the guild have in_guild set to False.
The bottleneck for this process currently is the number of users that need to be committed back to the database.
Testing locally, with 380k users off guild, the perf logging outputs:
in_guild sync: total time 19.140096s, query 3.486625s, processing 0.540042s, commit 15.113428s
. The commit timing isn't reliabale, as it is dependent on how long it takes for control to be passed back after awaiting