Modify "Find It" feature to treat search terms as AND, not OR. #173

Open
eric-gilbertson opened this issue Jul 29, 2020 · 5 comments
@eric-gilbertson
Collaborator

Currently, Find It returns items that contain any of the words in the user-entered phrase. For example, searching for "Take off my dress" returns all items that contain the word "dress" instead of just the Ana Egge song of that name. This feature would be much more useful if it returned only items that contain all of the user-entered words.

@eric-gilbertson eric-gilbertson self-assigned this Jul 29, 2020
@RocketMan
Owner

Actually, looking at the code just now, it is already doing an AND, not an OR.

As proof, type White. See the results. Then White Horse. Notice you have fewer, not more results.
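
The "fewer, not more" argument can be sketched with a toy model. This is not Zookeeper's actual query code: the `titles` list and the `search_all` / `search_any` helpers are hypothetical, and matching here is plain word membership rather than a MySQL full-text query.

```python
# Toy illustration of AND vs. OR term matching (hypothetical data and helpers).

def words(text):
    """Lower-case a string and split it into a set of words."""
    return set(text.lower().split())

def search_all(titles, query):
    """AND semantics: keep titles that contain every query word."""
    terms = words(query)
    return [t for t in titles if terms <= words(t)]

def search_any(titles, query):
    """OR semantics: keep titles that contain at least one query word."""
    terms = words(query)
    return [t for t in titles if terms & words(t)]

titles = ["White Horse", "White Rabbit", "Dark Horse", "Blue Moon"]

one_term = search_all(titles, "white")
two_terms_and = search_all(titles, "white horse")
two_terms_or = search_any(titles, "white horse")

# Adding a term shrinks the AND result and grows the OR result.
print(len(one_term), len(two_terms_and), len(two_terms_or))  # prints: 2 1 3
```

Under AND, each extra term can only narrow the result set, which is the behaviour seen when going from White to White Horse.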

In your example, 'take off my dress', everything is a stopword except for 'dress'. If you put that phrase in double quotes, it will find just your playlists with 'take off my dress', but if you just type it as-is, you get a list of things which match 'dress'.
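
A rough sketch of that effect, assuming stopwords are dropped from the query before the remaining terms are ANDed. The `STOPWORDS` set holds only the three words from this example, and `effective_terms` and `matches` are hypothetical helpers, not part of Zookeeper or MySQL. Quoted-phrase handling is deliberately left out.

```python
# Hypothetical subset of the stopword list: just the words from this example.
STOPWORDS = {"take", "off", "my"}

def effective_terms(query, stopwords=STOPWORDS):
    """Return the query words that survive stopword removal, in order."""
    return [w for w in query.lower().split() if w not in stopwords]

def matches(title, query, stopwords=STOPWORDS):
    """AND semantics over the surviving terms only."""
    terms = effective_terms(query, stopwords)
    title_words = set(title.lower().split())
    # A query made up entirely of stopwords matches nothing in this sketch.
    return bool(terms) and all(term in title_words for term in terms)

print(effective_terms("Take off my dress"))       # prints: ['dress']
print(matches("Red Dress", "take off my dress"))  # prints: True
```

With only 'dress' left to match, the search is still an AND, but over a single term, so anything containing 'dress' qualifies.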

(I'm not sure why 'take' is being treated as a stopword; I am going to look into that further.)

So bottom line, Find It! already does AND not OR.

@RocketMan
Owner

(I'm not sure why 'take' is being treated as a stopword; I am going to look into that further.)

'Take' is, in fact, a stopword:
https://dev.mysql.com/doc/refman/8.0/en/fulltext-stopwords.html#fulltext-stopwords-stopwords-for-myisam-search-indexes

@RocketMan
Owner

Given that we already do AND and not OR, shall we close this?

Or do you want to keep it open and revisit the stopword list? I am gobsmacked by how many words are in there that, in our case, probably should not be. Zk is not a huge database, so I don't think indexing words like 'take' or 'besides', or any number of others of that sort, will cause us undue strain.

@RocketMan
Owner

I have prepared a significantly reduced stopword list and am tracking it in #176.

To minimize any possible disruption, I will defer restarting mysqld and rebuilding the indexes until tomorrow morning after midnight.
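
For readers unfamiliar with these ops steps, a hedged sketch follows. It assumes the tables use MyISAM full-text indexes (the stopword link earlier in the thread points at the MyISAM list). In that setup, MySQL reads a custom list from the file named by the `ft_stopword_file` system variable at server startup, and existing full-text indexes have to be rebuilt afterward, for example with `REPAIR TABLE ... QUICK`. The word list and the table name `example_table` are placeholders, not the contents of #176.

```python
# Placeholder list only; the real reduced list is the one tracked in #176.
REDUCED_STOPWORDS = ["a", "an", "and", "my", "of", "the"]

def render_stopword_file(stopwords):
    """Return file contents with one lower-cased stopword per line, de-duplicated."""
    unique = sorted({w.lower() for w in stopwords})
    return "\n".join(unique) + "\n"

def rebuild_statements(tables):
    """Build SQL statements that rebuild MyISAM full-text indexes for the given tables."""
    return [f"REPAIR TABLE {name} QUICK;" for name in tables]

# Contents for the file that ft_stopword_file would point at.
print(render_stopword_file(REDUCED_STOPWORDS), end="")

# To be run after mysqld has been restarted with the new stopword file.
for statement in rebuild_statements(["example_table"]):  # hypothetical table name
    print(statement)
```

Until the rebuild has run, existing indexes still reflect the old stopword list, so both steps are needed before search results change.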

@RocketMan
Owner

OK, after the ops changes of #176, you will find that your sample phrase, 'take off my dress', even WITHOUT the quotation marks now yields a much more reasonable result:

[image]

NOTE: Now, the only stopword in the phrase is 'my', so you do get some extra things like 'take off your dress'. (Previously, everything was a stopword except for 'dress'.)
