Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SystemTags endpoint to return tags used by a user with meta data #37961

Merged
merged 6 commits into from
May 11, 2023

Conversation

blizzz
Copy link
Member

@blizzz blizzz commented Apr 27, 2023

Summary

Target case is photos app: when visiting the tags category, all systemtags of the whole cloud are retrieved. In subsequent steps the next tag is requested until the browser view is filled with tag tiles (i.e. previews are requested just as well).

With this approach, we incorporate the dav search and look for user related tags that are used by them, and already returns the statistics (number of files tagged with the respective tag) as well as a file id for the purpose to load the preview. This defaults to the file with the highest id.

Call:

curl -s -u 'user:password' \
  'https://my.nc.srv/remote.php/dav/systemtags-assigned' \
  -X PROPFIND -H 'Accept: text/plain' \
  -H 'Accept-Language: en-US,en;q=0.5'  -H 'Depth: 1' \
  -H 'Content-Type: text/plain;charset=UTF-8' \
  --data @/home/doe/request-systemtag-props.xml

With request-systemtag-props.xml:

<?xml version="1.0" encoding="UTF-8"?>
<d:propfind xmlns:d="DAV:">
        <d:prop xmlns:oc="http://owncloud.org/ns" xmlns:nc="http://nextcloud.org/ns">
                <oc:id/>
                <oc:display-name/>
                <oc:user-visible/>
                <oc:user-assignable/>
                <oc:can-assign/>
                <nc:files-assigned/>
                <nc:reference-fileid/>
        </d:prop>
</d:propfind>

Example output:

  …
  <d:response>
    <d:href>/master/remote.php/dav/systemtags/84</d:href>
    <d:propstat>
      <d:prop>
        <oc:id>84</oc:id>
        <oc:display-name>Computer</oc:display-name>
        <oc:user-visible>true</oc:user-visible>
        <oc:user-assignable>true</oc:user-assignable>
        <oc:can-assign>true</oc:can-assign>
        <nc:files-assigned>42</nc:files-assigned>
        <nc:reference-fileid>924022</nc:reference-fileid>
      </d:prop>
      <d:status>HTTP/1.1 200 OK</d:status>
    </d:propstat>
  </d:response>
  <d:response>
    <d:href>/remote.php/dav/systemtags/97</d:href>
    <d:propstat>
      <d:prop>
        <oc:id>97</oc:id>
        <oc:display-name>Bear</oc:display-name>
        <oc:user-visible>true</oc:user-visible>
        <oc:user-assignable>true</oc:user-assignable>
        <oc:can-assign>true</oc:can-assign>
        <nc:files-assigned>1</nc:files-assigned>
        <nc:reference-fileid>923422</nc:reference-fileid>
      </d:prop>
      <d:status>HTTP/1.1 200 OK</d:status>
    </d:propstat>
  </d:response>
  …

Update 23/05/04

Added was support to retrieve tags for a specific media type. This works with the URI. So, while the systemtags-current root provides tag information across all files, querying /remote.php/dav/systemtags-current/image will return tag information only for all image-typed files.

In the first commit, the image-type condition was hard coded.

/Update

Update 23/05/04

The endpoint was renamed from systemtags-current to systemtags-assigned. The information at the top (before the updates) are adjusted.

/Update

TODO

  • Performance testing with a serious data set
  • Reconsider pragmatic architectual choices (quick PoC)

Checklist

apps/dav/lib/SystemTag/SystemTagPlugin.php Fixed Show fixed Hide fixed
apps/dav/lib/SystemTag/SystemTagPlugin.php Fixed Show fixed Hide fixed
apps/dav/lib/SystemTag/SystemTagPlugin.php Fixed Show fixed Hide fixed
apps/dav/lib/SystemTag/SystemTagsInUseCollection.php Fixed Show resolved Hide resolved
lib/private/Files/Node/Folder.php Fixed Show resolved Hide resolved
apps/dav/lib/SystemTag/SystemTagPlugin.php Fixed Show fixed Hide fixed
@blizzz
Copy link
Member Author

blizzz commented May 4, 2023

Added support for media types and updated the information in the summary.

Compared to the first commit, it's a little breaking, for filtering image files was hardcoded. To achive the same results run the query against /remote.php/dav/systemtags-current/image, otherwise tags relevant to any file type are returned.

@marcelklehr
Copy link
Member

Can we call the endpoint something more meaningful than systemtags-current?

Copy link
Member

@AndyScherzinger AndyScherzinger left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fine by me except other peoples comments of course. And would also love to have this in for 27 even if we might optimize further in the future or have a complete new solution for certain things regarding tags.

Also, given the time I think Beta2 would be fine.

@blizzz
Copy link
Member Author

blizzz commented May 4, 2023

Can we call the endpoint something more meaningful than systemtags-current?

We can, but do you have suggestions? Tbh, I am not super happy about it either, but was so far the best I could think of:

  • "-current" says it is about tags that are in use at this time

And i disregarded following ideas, because

  • "-used" could be appropriate, but it has a touch of the past, or could be misinterpreted as something trashed
  • "-recent" could suggest that valid, but older tags would not be considered
  • "-active" could be misleading (what is inactive?)
  • "-user" could be misleading to be user-only tags

@blizzz
Copy link
Member Author

blizzz commented May 4, 2023

fine by me except other peoples comments of course. And would also love to have this in for 27 even if we might optimize further in the future or have a complete new solution for certain things regarding tags.

Also, given the time I think Beta2 would be fine.

Beta2 is fine, just one thing that is missing is also that we extend Folder with public methods, and that should be reflected in the PHP API. We should refrain from touching \OCP\Files\Folder as it has at least 8 implementations, which it would break. So we'll have to add a public interface, but when we do it, we should be sure about it.

Thus said I am not one hundred percent convinced our additions should stay in the Folder implementation, however there is some of the logic we need. We could have a SytemTag specific collection that could wrap it and so have little changes there, but would have a hard dependency on the Folder implementation (if only on few specific parts).

Or we expose what is consolidated here as getCachesAndMountpointsForSearch as PHP API, that can be then consumed by new SystemTag specific classes and have better separation.

Would love to have @icewind1991​'s opinion on it.

@marcelklehr
Copy link
Member

We can, but do you have suggestions?

How about

  • systemtags-in-use
  • systemtags-assigned

@blizzz
Copy link
Member Author

blizzz commented May 4, 2023

We can, but do you have suggestions?

How about

* `systemtags-in-use`

* `systemtags-assigned`

systemtags-assigned sounds good to me, and yes, clearer than current.

Copy link
Member

@icewind1991 icewind1991 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall approach seems fine.

Not a big fan of adding it as a public method to folder but not sure if there is a better option

apps/dav/lib/SystemTag/SystemTagsInUseCollection.php Outdated Show resolved Hide resolved
lib/private/Files/Node/Folder.php Outdated Show resolved Hide resolved
// Currently query has to have exactly one search condition. If no media type is provided,
// we fall back to the presence of a systemtag.
if (empty($mediaType)) {
$query = $this->queryFromOperator(new SearchComparison(ISearchComparison::COMPARE_LIKE, 'systemtag', '%'), null, $limit, $offset);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

does adding this operator do anything? LIKE '%' seems to be always true to me

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's a shortcut to get a query to pass to $searchHelper->findUsedTagsInCaches(). Certainly this can be solved more elegantly. It was born out of the need to have something ready to bring Magenta back to a working state/environment.

Copy link
Member

@icewind1991 icewind1991 May 9, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suspect that this might interfere with picking an index so it would be better to have another way to flag that the query should join on tags.

Copy link
Member Author

@blizzz blizzz May 10, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The behaviour appears same, however (checking with Postgrest) at some we'll have an Index Scan using oc_systemtag_pkey on oc_systemtag systemtag (cost=0.14..13.54 rows=69 width=20) where the poential cost is slighlty lower when the comparison is not included. 13.54 instead of 13.72. May not be representative.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Resolved with latest commit

@blizzz
Copy link
Member Author

blizzz commented May 4, 2023

Not a big fan of adding it as a public method to folder but not sure if there is a better option

@icewind1991 yes feeling the same. One way to reduce the footprint would be to expose getCachesAndMountpointsForSearch only (new interface like ISearchAwareFolder or something like that), then the rest of the logic can be moved to classes withing the SystemTags namespace.

@blizzz
Copy link
Member Author

blizzz commented May 5, 2023

Not a big fan of adding it as a public method to folder but not sure if there is a better option

@icewind1991 yes feeling the same. One way to reduce the footprint would be to expose getCachesAndMountpointsForSearch only (new interface like ISearchAwareFolder or something like that), then the rest of the logic can be moved to classes withing the SystemTags namespace.

But of course that leaves the changes within OC\Files\Cache\. I have repressed it somehow, and from the hip not a quick solution for it. I've got a commit ready to do the former that I can push, if it makes a big difference.

Or we indeed keep it provide as it is for now, just to have it merged. And have time to rethink internal structure and the PHP API, and be it for 28. I am a bit concerned though that the follow up would compete with other important tasks.

This was referenced May 9, 2023
@blizzz blizzz force-pushed the poc/noid/systemtags-perf branch from e37f853 to ab3bba2 Compare May 9, 2023 19:00
blizzz added 5 commits May 9, 2023 23:51
Target case is photos app: when visiting the tags category, all systemtags
of the whole cloud are retrieved. In subequent steps the next tag is
requested until the browser view is filled with tag tiles (i.e. previews
are requested just as well).

With this approach, we incorpoate the dav search and look for user related
tags that are used by them, and already returns the statistics (number of
files tagged with the respective tag) as well as a file id for the purpose
to load the preview. This defaults to the file with the highest id.

Call:
curl -s -u 'user:password' \
  'https://my.nc.srv/remote.php/dav/systemtags-current' \
  -X PROPFIND -H 'Accept: text/plain' \
  -H 'Accept-Language: en-US,en;q=0.5'  -H 'Depth: 1' \
  -H 'Content-Type: text/plain;charset=UTF-8' \
  --data @/home/doe/request-systemtag-props.xml

With request-systemtag-props.xml:
<?xml version="1.0" encoding="UTF-8"?>
<d:propfind xmlns:d="DAV:">
        <d:prop xmlns:oc="http://owncloud.org/ns" xmlns:nc="http://nextcloud.org/ns">
                <oc:id/>
                <oc:display-name/>
                <oc:user-visible/>
                <oc:user-assignable/>
                <oc:can-assign/>
                <nc:files-assigned/>
                <nc:reference-fileid/>
        </d:prop>
</d:propfind>

Example output:
  …
  <d:response>
    <d:href>/master/remote.php/dav/systemtags/84</d:href>
    <d:propstat>
      <d:prop>
        <oc:id>84</oc:id>
        <oc:display-name>Computer</oc:display-name>
        <oc:user-visible>true</oc:user-visible>
        <oc:user-assignable>true</oc:user-assignable>
        <oc:can-assign>true</oc:can-assign>
        <nc:files-assigned>42</nc:files-assigned>
        <nc:reference-fileid>924022</nc:reference-fileid>
      </d:prop>
      <d:status>HTTP/1.1 200 OK</d:status>
    </d:propstat>
  </d:response>
  <d:response>
    <d:href>/remote.php/dav/systemtags/97</d:href>
    <d:propstat>
      <d:prop>
        <oc:id>97</oc:id>
        <oc:display-name>Bear</oc:display-name>
        <oc:user-visible>true</oc:user-visible>
        <oc:user-assignable>true</oc:user-assignable>
        <oc:can-assign>true</oc:can-assign>
        <nc:files-assigned>1</nc:files-assigned>
        <nc:reference-fileid>923422</nc:reference-fileid>
      </d:prop>
      <d:status>HTTP/1.1 200 OK</d:status>
    </d:propstat>
  </d:response>
  …

Signed-off-by: Arthur Schiwon <blizzz@arthur-schiwon.de>
- only the media part of the mime type can be search, but not the full
  mime type. It can be added, should it become necessary.
- thus fixes previously hardcoded selector for image/ types
- also fixes a return type hint
- adds a return type hint

Signed-off-by: Arthur Schiwon <blizzz@arthur-schiwon.de>
Signed-off-by: Arthur Schiwon <blizzz@arthur-schiwon.de>
- DI SystemTagManager
- add some comments and doc
- catch NoUserException
- add return type hints

Signed-off-by: Arthur Schiwon <blizzz@arthur-schiwon.de>
- adds OC\SystemTag\SystemTagsInFilesDetector where the search logic is
  moved to

Signed-off-by: Arthur Schiwon <blizzz@arthur-schiwon.de>
@blizzz blizzz force-pushed the poc/noid/systemtags-perf branch from ab3bba2 to dbfd2f9 Compare May 9, 2023 21:59
@blizzz blizzz added 3. to review Waiting for reviews and removed 2. developing Work in progress labels May 9, 2023
@blizzz blizzz requested a review from icewind1991 May 9, 2023 22:00
@blizzz
Copy link
Member Author

blizzz commented May 9, 2023

@icewind1991 moved it to the search helper as suggested

@blizzz blizzz changed the title PoC: SystemTags endpoint to return tags used by a user with meta data SystemTags endpoint to return tags used by a user with meta data May 9, 2023
- search constraints are now fully in control of
  SystemTagsInFilesDetector::detectAssignedSystemTagsIn(), avoids
  duplication of a WHERE statement

Signed-off-by: Arthur Schiwon <blizzz@arthur-schiwon.de>
@blizzz blizzz force-pushed the poc/noid/systemtags-perf branch from 1891d1e to df662f5 Compare May 10, 2023 16:48
@blizzz
Copy link
Member Author

blizzz commented May 11, 2023

failing drone CI tests unrelated

@blizzz
Copy link
Member Author

blizzz commented May 11, 2023

Will need manual backports to 26 and 25

… although 26 might work, trying.

@blizzz blizzz merged commit b6c034a into master May 11, 2023
@blizzz blizzz deleted the poc/noid/systemtags-perf branch May 11, 2023 08:16
@blizzz
Copy link
Member Author

blizzz commented May 11, 2023

/backport to stable26

@AndyScherzinger
Copy link
Member

/backport to stable25

@backportbot-nextcloud
Copy link

The backport to stable26 failed. Please do this backport manually.

# Switch to the target branch and update it
git checkout stable26
git pull origin/stable26

# Create the new backport branch
git checkout -b fix/foo-stable26

# Cherry pick the change from the commit sha1 of the change against the default branch
# This might cause conflicts. Resolve them.
git cherry-pick abc123

# Push the cherry pick commit to the remote repository and open a pull request
git push origin fix/foo-stable26

More info at https://docs.nextcloud.com/server/latest/developer_manual/getting_started/development_process.html#manual-backport

@backportbot-nextcloud
Copy link

The backport to stable25 failed. Please do this backport manually.

# Switch to the target branch and update it
git checkout stable25
git pull origin/stable25

# Create the new backport branch
git checkout -b fix/foo-stable25

# Cherry pick the change from the commit sha1 of the change against the default branch
# This might cause conflicts. Resolve them.
git cherry-pick abc123

# Push the cherry pick commit to the remote repository and open a pull request
git push origin fix/foo-stable25

More info at https://docs.nextcloud.com/server/latest/developer_manual/getting_started/development_process.html#manual-backport

@blizzz
Copy link
Member Author

blizzz commented May 16, 2023

backports created manually

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants