Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Reduce number of queries to read shares in a folder #34918

Closed
wants to merge 8 commits into from

Conversation

starypatyk
Copy link
Contributor

@starypatyk starypatyk commented Nov 1, 2022

While working on nextcloud/android#10783 I have found that the API call to retrieve shares for files in a folder (/ocs/v2.php/apps/files_sharing/api/v1/shares?path=xyz&reshares=true&subfiles=true) executes 6 database queries for each file.

I have modified implementation of the ShareAPIController::getSharesInDir to use OC\Share20\Manager::getSharesInDir rather than retrieving shares separately for each file in the folder.

This significantly reduces the number of database queries and improves performance. In my tests (331 files in a folder) I got the following results (most of the time is now spent in the initialization code):

_ Execution time [ms] # of DB queries
Before 980 2021
After 280 40

After implementing this change, I have noticed some changes in data returned from the API.

  • Permissions for room shares returned from RoomShareProvider::getSharesInFolder are incorrect. I have submitted a separate PR in nextcloud/spreed to fix the issue.
  • Original code handling this API call also returned information about incoming group shares. I do not know if this was an intended behaviour or an error in the original implementation. Update - after removing the uid check for re-shares mode in d0c52e5, the new code returns the same information in this case as the original one.

@starypatyk
Copy link
Contributor Author

I have been able to fix failing unit tests in 23d06ee, but I am not satisfied with this approach. 😞 I am afraid that I do not understand the approach to the unit tests well enough to do it better. Any suggestions are welcome.

On the other hand the failing integration tests seem to point to a real issue with my update. I will look into it.

@skjnldsv skjnldsv requested review from a team, icewind1991 and blizzz and removed request for a team November 2, 2022 08:09
@PVince81 PVince81 requested a review from artonge November 2, 2022 15:30
@starypatyk
Copy link
Contributor Author

starypatyk commented Nov 3, 2022

I have managed to repeat locally the first failing scenario of the integration tests.

At the moment I do not know how to fix the code. 😞

The problem is that:

A couple of questions:

  1. Should this be handled in ShareAPIController::getSharesInDir or rather in DefaultShareProvider::getSharesInFolder (and probably in other share providers as well)?
  2. Any suggestions how to efficiently retrieve required information from the database? Would it make sense to construct the database query like below? Starting from the oc_filecache table, column list simplified for clarity.
SELECT
  f.parent, f.fileid, f.path,
  s.id, s.uid_owner, s.uid_initiator, s.file_target
FROM
  oc_filecache f
  INNER JOIN oc_share s ON f.fileid = s.item_source
WHERE f.parent = folder-node-id-here;

Please advise.

Update When I looked at the code in DefaultShareProvider::getSharesInFolder right after posting the original version of this comment, I realized that the method already retrieves the data in a proper way. It just seems to filter a bit too much in "reshares" mode. I have committed experimental code in d0c52e5 to verify this. It seems however that the CI queue is quite long at the moment. Will need to wait... 😴

@starypatyk
Copy link
Contributor Author

It seems that the quick-and-dirty change in d0c52e5 fixed the failing integration tests. 😮

Do you think this approach is correct?

If yes, I will clean it up and make updates for the other review comments.

@starypatyk
Copy link
Contributor Author

I have committed small changes already requested by reviewers in c51e756

After removing the uid check for reshares mode in DefaultShareProvider::getSharesInFolder, the failing tests run OK. In the last run there is a failure in acceptance-app-files-sharing, but this one was OK just before clean-up, so I assume that this is a random error that I have seen already in the previous PR (#34471).

I do not plan any more changes unless there are additional suggestions from review.

@starypatyk starypatyk added 3. to review Waiting for reviews and removed 2. developing Work in progress labels Nov 6, 2022
@come-nc
Copy link
Contributor

come-nc commented Nov 7, 2022

I have committed small changes already requested by reviewers in c51e756

After removing the uid check for reshares mode in DefaultShareProvider::getSharesInFolder, the failing tests run OK. In the last run there is a failure in acceptance-app-files-sharing, but this one was OK just before clean-up, so I assume that this is a random error that I have seen already in the previous PR (#34471).

I do not plan any more changes unless there are additional suggestions from review.

How is the removal of uid check affecting the performance benchmark shown in your first post?
And how is this affecting the other places calling getSharesInFolder ?

@starypatyk
Copy link
Contributor Author

@come-nc

How is the removal of uid check affecting the performance benchmark shown in your first post?

Same results as previously - within error margin. The timings vary between 260 and 290 ms. Even the query count is not always the same, I see a slightly varying number of UPDATE oc_authtoken queries during the initialization sequence.

Average results from a few runs are:

_ Execution time [ms] # of DB queries
Before 980 2021
After 275 40

And how is this affecting the other places calling getSharesInFolder ?

This is a very good question. I have removed the uid check to fix the following integration tests:

--- Failed scenarios:

    /drone/src/build/integration/sharing_features/sharing-v1-part2.feature:306
    /drone/src/build/integration/sharing_features/sharing-v1-part2.feature:381
    /drone/src/build/integration/sharing_features/sharing-v1-part2.feature:406

This also fixed the issue that I have noticed at the beginning: original code handling this API call also returned information about incoming group shares. After removal of the uid check the new code behaves the same.

I suspect that there are some other scenarios when getSharesInFolder is going to return more data (shares in which specified user is neither owner, nor initiator). Unfortunately I am not able to assess when/how this is going to show.

nickvergessen added a commit to nextcloud/spreed that referenced this pull request Nov 8, 2022
Signed-off-by: Joas Schilling <coding@schilljs.com>
nickvergessen added a commit to nextcloud/spreed that referenced this pull request Nov 8, 2022
Signed-off-by: Joas Schilling <coding@schilljs.com>
@szaimen
Copy link
Contributor

szaimen commented Nov 14, 2022

Seems like the tests pass now...

@blizzz blizzz modified the milestones: Nextcloud 28, Nextcloud 29 Nov 23, 2023
@starypatyk
Copy link
Contributor Author

Hi @nickvergessen and @icewind1991,

I see a new PR #42638 touching the same area as this one - i.e. the DefaultShareProvider::getSharesInFolder method.

Could we use this as an opportunity to get back to this PR? This is pending for a long time and is quite important to improve performance of the Android app.

I will be glad to provide additional explanation and make further changes when suggested. At the moment I think that I had already responded to all questions/concerns and I have no idea what else I could do, to move this one forward.

@nickvergessen
Copy link
Member

I will bring it to the performance group and check for feedback the first week of february

@icewind1991 icewind1991 force-pushed the read_shares_in_folder branch from 0d09169 to ba32ac1 Compare February 9, 2024 12:38
Signed-off-by: Dariusz Olszewski <starypatyk@users.noreply.github.com>
Signed-off-by: Dariusz Olszewski <starypatyk@users.noreply.github.com>
Signed-off-by: Dariusz Olszewski <starypatyk@users.noreply.github.com>
Signed-off-by: Dariusz Olszewski <starypatyk@users.noreply.github.com>
Code clean-up

Signed-off-by: Dariusz Olszewski <starypatyk@users.noreply.github.com>
Signed-off-by: Dariusz Olszewski <starypatyk@users.noreply.github.com>
Signed-off-by: Dariusz Olszewski <starypatyk@users.noreply.github.com>
This reverts commit 1367fc1.

Signed-off-by: Dariusz Olszewski <starypatyk@users.noreply.github.com>
@starypatyk
Copy link
Contributor Author

@AndyScherzinger @icewind1991 - Is there anything I should/could do regarding this PR?

I see that one of the failing tests is 'block unconventional commits'. I can squash all the commits into one if this is going to help.

This was referenced Mar 12, 2024
This was referenced Mar 20, 2024
@skjnldsv skjnldsv mentioned this pull request Mar 28, 2024
81 tasks
@skjnldsv skjnldsv modified the milestones: Nextcloud 29, Nextcloud 30 Mar 28, 2024
@skjnldsv skjnldsv added 2. developing Work in progress stale Ticket or PR with no recent activity and removed 3. to review Waiting for reviews labels Jul 27, 2024
This was referenced Jul 30, 2024
@skjnldsv skjnldsv closed this Aug 3, 2024
@skjnldsv skjnldsv removed this from the Nextcloud 30 milestone Aug 14, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
2. developing Work in progress feature: sharing performance 🚀 stale Ticket or PR with no recent activity
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants