Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Reduce number of queries required to read shares #7743

Merged
merged 11 commits into from
Aug 24, 2022
Merged

Reduce number of queries required to read shares #7743

merged 11 commits into from
Aug 24, 2022

Conversation

starypatyk
Copy link
Contributor

When diagnosing slow loading of chat messages in the Talk Android app, I have noticed that information about files (usually images) shared in the chat is read using a separate query for each file.

This is an initial proof-of-concept that reduces it to a single query.

Instead of executing a separate query in RoomShareProvider::getShareById for each share, we gather all share ids and execute a single query in a new method RoomShareProvider::getSharesByIds.

While performance improvements in the development environment are difficult to notice, on my family "production" environment this gives around 15% improvement of the ChatController::receiveMessages execution time (280 ms down from 330 ms).

This code is certainly not production-ready and requires refactoring/clean-up. However I would like to get some feedback, if this looks like a reasonable way to go forward.

Trying to read info about all shared items together also seems to be a pre-requisite to eventually optimize SystemMessage::getFileFromShare. Currently additional 3-4 database queries are performed for each shared file (by the Files app if I understand correctly).

Thoughts?

Signed-off-by: Dariusz Olszewski <starypatyk@users.noreply.github.com>
@nickvergessen
Copy link
Member

Can you give some rough numbers about your chats? Are they mostly files only? On our company instance there are roughly 7k files shared between 2m chat messages. So the cases where so many files are in a single chunk that duplicating the method is worth the improvements are rather limited

@starypatyk
Copy link
Contributor Author

Hi @nickvergessen,

My "production" Nextcloud instance is a family server used primarily to share photos, so images are very frequently shared in chats. The server runs for over 8 years now (it started as Owncloud), but we use Talk for only about 1.5 year.

The statistics are:

  • ~2200 total messages.
  • ~420 files shared (SELECT COUNT(*) FROM oc_share WHERE share_type = 10 -- SHARE_TYPE_ROOM) - but please note that many of these are shared in group chats.

In two sample chats I have 33 and 40 images in the last 100 messages.

On the other hand, in my day job (where we use Teams), I see that we also send a lot of images (primarily screenshots). As I understand the Handle pasting screenshots to the chat message form feature also uses files sharing. Same applies to Image keyboard support in chat that I had implemented at the beginning of this year in the Android client - to enliven the conversations. 😉

Maybe I am heavily biased, but for me images are an important part of chat conversations.

@nickvergessen
Copy link
Member

Maybe I am heavily biased, but for me images are an important part of chat conversations.

Totally fine, we only have a Meme channel with a that high rate.
Since we have a performance related workgroup this week your changes came just in time.
I'll have a look at it during the week I guess.

@nickvergessen nickvergessen self-assigned this Aug 15, 2022
@starypatyk
Copy link
Contributor Author

Thanks @nickvergessen!

Please note that the code is currently of "proof-of-concept" quality.

I have not been able to find a way to gather all share IDs using the existing infrastructure (MessageParser::parseMessage -> Listener::parseSystemMessage -> SystemMessage::parseMessage -> SystemMessage::getFileFromShare). That's why there is a hack at the top level method (ChatController::receiveMessages) going directly to the RoomShareProvider.

@nickvergessen
Copy link
Member

I deployed it on our testing copy for now

Before After
170 158

so yeah at the moment it is not helping too much.
Will look if we can bring the info collecting on a further level

@@ -631,6 +635,11 @@ public function getSharesBy($userId, $shareType, $node, $reshares, $limit, $offs
* @throws ShareNotFound
*/
public function getShareById($id, $recipientId = null): IShare {

if (isset($this->sharesByIdCache[$id])) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cache needs to be invalidated on delete/update function calls

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have added cache invalidation (cleaning actually 😉) in all update/delete calls that seemed related in 03f6476.

lib/Share/RoomShareProvider.php Outdated Show resolved Hide resolved
@nickvergessen
Copy link
Member

That's why there is a hack at the top level method (ChatController::receiveMessages) going directly to the RoomShareProvider.

I would say that is even okay, could even make the function name reflect that: preloadSharesByIds or something alike

lib/Controller/ChatController.php Outdated Show resolved Hide resolved
lib/Controller/ChatController.php Outdated Show resolved Hide resolved
lib/Share/RoomShareProvider.php Outdated Show resolved Hide resolved
lib/Share/RoomShareProvider.php Outdated Show resolved Hide resolved
@nickvergessen
Copy link
Member

With those 2 PRs in addition our Meme channel is down to 69 queries:

starypatyk and others added 3 commits August 17, 2022 20:57
Co-authored-by: Joas Schilling <213943+nickvergessen@users.noreply.github.com>
Signed-off-by: Dariusz Olszewski <8277636+starypatyk@users.noreply.github.com>
(Nuclear option - clean the cache entirely)

Signed-off-by: Dariusz Olszewski <starypatyk@users.noreply.github.com>
Signed-off-by: Dariusz Olszewski <starypatyk@users.noreply.github.com>
@starypatyk
Copy link
Contributor Author

@nickvergessen - thanks for your comments. I have already applied most of your suggestions.

Regarding the remaining suggestion:

That's why there is a hack at the top level method (ChatController::receiveMessages) going directly to the RoomShareProvider.

I would say that is even okay, could even make the function name reflect that: preloadSharesByIds or something alike

I would like to reduce code duplication between current getShareById and getSharesByIds methods. When looking at these more carefully, I have noticed that my code has two errors:

  • does not check for shares referring to deleted files - isAccessibleResult
  • may skip resolveSharesForRecipient

I am still working on this.

Also fix handling of inaccessible shares

Signed-off-by: Dariusz Olszewski <starypatyk@users.noreply.github.com>
@starypatyk
Copy link
Contributor Author

@nickvergessen - I have refactored getShareById/getSharesByIds in e8bd174.

I hope that I have fixed the issues with inaccessible shares and resolveSharesForRecipient.

Could you check this please? The code became a bit convoluted and I am far from saying that I have checked all possible scenarios.

@nickvergessen
Copy link
Member

Looks good.

If you have troubles satisfying the CI jobs we can also take over at the end.

Signed-off-by: Dariusz Olszewski <starypatyk@users.noreply.github.com>
Psalm uncovered a logic error regarding passing $shares array
to resolveSharesForRecipient :-)

Signed-off-by: Dariusz Olszewski <starypatyk@users.noreply.github.com>
Signed-off-by: Dariusz Olszewski <starypatyk@users.noreply.github.com>
@starypatyk
Copy link
Contributor Author

I have managed to fix all failing tests.

I am wondering, if it makes sense to write an additional test that would check that the oc_share table is only read once during execution of the ChatController::receiveMessages method. For sure, I do not understand the test/CI infrastructure well enough to do it myself. 😉

@starypatyk starypatyk marked this pull request as ready for review August 22, 2022 21:40
@nickvergessen
Copy link
Member

I am wondering, if it makes sense to write an additional test that would check that the oc_share table is only read once during execution of the ChatController::receiveMessages method. For sure, I do not understand the test/CI infrastructure well enough to do it myself. 😉

I kind of dislike unit tests on that level.
I would leave this as a todo for a follow up. My idea would be to add it to https://github.com/nextcloud/spreed/tree/master/tests/integration/features/scaling

Where we have some tests were we run locally with total query counting enabled and that would then reveal an overall picture.

Signed-off-by: Dariusz Olszewski <starypatyk@users.noreply.github.com>
Signed-off-by: Dariusz Olszewski <starypatyk@users.noreply.github.com>
@starypatyk
Copy link
Contributor Author

@nickvergessen - I do not plan any more changes - unless you have additional comments. Please feel free to merge.

Signed-off-by: Joas Schilling <coding@schilljs.com>
Copy link
Member

@nickvergessen nickvergessen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added some last doc fixes, now lets wait for CI and then merge

@nickvergessen nickvergessen merged commit d942431 into nextcloud:master Aug 24, 2022
@nickvergessen
Copy link
Member

Thanks a lot @starypatyk for the patch and the good analysis!

@starypatyk starypatyk deleted the enh/noid/optimize_share_reading branch September 8, 2022 19:19
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants