changed query on tag search for speedup related to issues #112 and #141 #142

Merged: 5 commits, Aug 15, 2021

Contributor

@ulixxe ulixxe commented Aug 14, 2021

I tested this change to speed up tag search.
BR

Fixes #112

@sissbruecker
Owner

sissbruecker commented Aug 14, 2021

Thanks, looks promising. There were test failures because the new bookmarks queryset used in the intersection was not filtered by owner. I also changed the code to apply the intersection only if the query actually contains tags.

I also added some new tests and fixed some existing tests that check whether the query result is empty when the query does not match any bookmarks. As part of that, one test started failing: https://github.com/sissbruecker/linkding/pull/142/files#diff-a7e2eb730dc93148ef51f07d9bd7f83f80e941b66e68c01552beee09f65bc4dfR373

It looks like the test also fails in master. I'll take a look at what the issue is.

BTW, you can run the tests with python manage.py test

@sissbruecker
Owner

Yeah, the solution is not correct (but neither was the previous one 🙂).

We have two query sets:

  • A: tags that have bookmarks that match all the search terms
  • B: tags that have bookmarks that match all the searched tags

We then take the intersection of A and B. The problem is that A might contain tags from bookmarks that don't match the searched tags, and B might contain tags from bookmarks that don't match the search terms. If both sets happen to contain the same tag, the query can return that tag even though no single bookmark matches the whole search query, terms and tags combined (there shouldn't be any tags if no bookmark matches the query).
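To make the failure case concrete, here is a minimal sketch in plain Python sets rather than Django querysets. The `Bookmark` class, the sample data, and the helper names are all hypothetical and only illustrate the reasoning above; they are not linkding code.

```python
from dataclasses import dataclass, field


@dataclass
class Bookmark:
    # Hypothetical stand-in for a bookmark row: a title plus a set of tag names.
    title: str
    tags: set = field(default_factory=set)


bookmarks = [
    Bookmark("python tips", {"dev"}),     # matches the term, not the searched tag
    Bookmark("cooking", {"dev", "web"}),  # matches the searched tag, not the term
]


def matches_terms(bookmark, terms):
    # A bookmark matches if every search term appears in its title.
    return all(term in bookmark.title for term in terms)


def matches_tags(bookmark, searched_tags):
    # A bookmark matches if it carries every searched tag.
    return searched_tags <= bookmark.tags


def tags_of(matching_bookmarks):
    # Collect the distinct tags of the given bookmarks.
    result = set()
    for bookmark in matching_bookmarks:
        result |= bookmark.tags
    return result


terms, searched_tags = ["python"], {"web"}

# A: tags of bookmarks matching all search terms
set_a = tags_of(b for b in bookmarks if matches_terms(b, terms))
# B: tags of bookmarks matching all searched tags
set_b = tags_of(b for b in bookmarks if matches_tags(b, searched_tags))
# Intersecting the two tag sets yields a tag although no bookmark matches both.
intersection = set_a & set_b

# Filtering bookmarks by the whole query first gives the expected empty result.
correct = tags_of(
    b for b in bookmarks
    if matches_terms(b, terms) and matches_tags(b, searched_tags)
)

print(intersection, correct)
```

In this sample, the tag "dev" sits on both bookmarks, so it survives the intersection even though neither bookmark satisfies the term and the tag at the same time.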

But your solution gave me an idea to re-use the bookmark query as an IN filter when querying the tags. The tag query functions then look like this:

def query_bookmark_tags(user: User, query_string: str) -> QuerySet:
    bookmarks_query = query_bookmarks(user, query_string)

    query_set = Tag.objects.filter(bookmark__in=bookmarks_query)

    return query_set.distinct()


def query_archived_bookmark_tags(user: User, query_string: str) -> QuerySet:
    bookmarks_query = query_archived_bookmarks(user, query_string)

    query_set = Tag.objects.filter(bookmark__in=bookmarks_query)

    return query_set.distinct()

That solution seems to be correct, and the performance also seems good. Could you give it a try and check how it performs in your case?
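The idea of reusing the bookmark query as an IN filter can be sketched with the standard-library `sqlite3` module. The table names, columns, and the `tags_for_query` helper are assumptions made for illustration; they are not linkding's actual schema, and the SQL that Django generates for `bookmark__in` may look different.

```python
import sqlite3

# Hypothetical, simplified schema: bookmarks, tags, and a many-to-many link table.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE bookmark (id INTEGER PRIMARY KEY, owner TEXT, title TEXT);
    CREATE TABLE tag (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE bookmark_tags (bookmark_id INTEGER, tag_id INTEGER);
    INSERT INTO bookmark VALUES
        (1, 'alice', 'python tips'),
        (2, 'alice', 'cooking'),
        (3, 'bob', 'python web');
    INSERT INTO tag VALUES (1, 'dev'), (2, 'web');
    INSERT INTO bookmark_tags VALUES (1, 1), (2, 1), (2, 2), (3, 2);
""")


def tags_for_query(owner, term, tag_name):
    # The inner SELECT plays the role of the bookmark query: it applies the
    # owner filter, the search term, and the searched tag together.
    # The outer SELECT returns the distinct tags of exactly those bookmarks.
    rows = conn.execute(
        """
        SELECT DISTINCT t.name
        FROM tag t
        JOIN bookmark_tags bt ON bt.tag_id = t.id
        WHERE bt.bookmark_id IN (
            SELECT b.id
            FROM bookmark b
            JOIN bookmark_tags bt2 ON bt2.bookmark_id = b.id
            JOIN tag t2 ON t2.id = bt2.tag_id
            WHERE b.owner = ? AND b.title LIKE ? AND t2.name = ?
        )
        ORDER BY t.name
        """,
        (owner, f"%{term}%", tag_name),
    ).fetchall()
    return [name for (name,) in rows]


print(tags_for_query("alice", "python", "web"))
print(tags_for_query("alice", "cooking", "web"))
```

Because the tags are derived only from bookmarks that already satisfy the whole query, a query that matches no bookmark cannot return any tag, and the owner filter is inherited from the bookmark query instead of being repeated.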

@ulixxe
Contributor Author

ulixxe commented Aug 14, 2021

Hi,
Yes, it works. The performance is very good!
Thank you!

@sissbruecker sissbruecker merged commit 048a8b1 into sissbruecker:master Aug 15, 2021
@ulixxe ulixxe deleted the fast_tag_query branch August 15, 2021 21:04
Development

Successfully merging this pull request may close these issues.

query with multiple hashtags very slow