Wildcard doesn't match an empty string "" #370

Open
Benjamin-Dobell opened this issue Aug 18, 2018 · 3 comments
Benjamin-Dobell commented Aug 18, 2018

Wildcard presently does not match an empty (zero-length) string, whereas it's reasonable to expect that it does.

e.g.

Searching for blue* will not match an indexed document with the text "The sky is blue". However, searching for blue will correctly return a match.
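
To make the expectation concrete, here is a minimal sketch of the semantics being asked for, written without Lunr at all. The helper `matchesTrailingWildcard` is hypothetical (it is not part of Lunr's API and says nothing about how Lunr implements wildcards); it only shows a trailing `*` standing for zero or more characters.

```javascript
// Hypothetical helper, NOT part of Lunr: illustrates the expected semantics only.
// A trailing "*" stands for zero or more characters, so "blue*" should match "blue".
function matchesTrailingWildcard(pattern, token) {
	if (!pattern.endsWith("*")) {
		// No wildcard: require an exact match
		return pattern === token
	}
	// Drop the "*" and accept any token that starts with the remaining prefix,
	// including the prefix itself (the zero-length case)
	const prefix = pattern.slice(0, -1)
	return token.startsWith(prefix)
}

// Naive whitespace tokenisation, assumed here purely for illustration
const tokens = "The sky is blue".toLowerCase().split(" ")
console.log(tokens.some(token => matchesTrailingWildcard("blue*", token))) // true
```

Under these semantics a single wildcard term would cover both the exact word and any longer completion of it.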

This is problematic as it typically necessitates two queries where one ought to suffice, e.g.

const idSet = new Set()

const words = searchText.trim().split(/ |\-/)
const components = words.filter(w => w.length > 1).map(w => w.toLowerCase())

// We need to do what is *essentially* the same query twice, as Lunr's wildcard support is a little unusual

textIndex.query(query => {
	components.forEach((component, index) => {
		// Compare against components, not words: filtering may have shortened the list
		if (index === components.length - 1) {
			query.term(component, {wildcard: lunr.Query.wildcard.TRAILING, presence: lunr.Query.presence.REQUIRED})
		} else {
			query.term(component, {presence: lunr.Query.presence.REQUIRED})
		}
	})
}).forEach(match => idSet.add(Number(match.ref)))

textIndex.query(query => {
	components.forEach((component) => {
		query.term(component, {presence: lunr.Query.presence.REQUIRED})
	})
}).forEach(match => idSet.add(Number(match.ref)))

This also leads to the overhead of having a separate Set to remove duplicate match results.
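
The de-duplication step can at least be factored out of the calling code. The sketch below uses a hypothetical helper, `mergeRefs`, and stand-in result arrays; it assumes only that each result object carries a `ref` property, as in the snippet above.

```javascript
// Hypothetical helper: merges several result lists into one de-duplicated
// array of numeric ids. Assumes every result has a `ref` property.
function mergeRefs(...resultLists) {
	const ids = new Set()
	for (const results of resultLists) {
		for (const match of results) {
			ids.add(Number(match.ref))
		}
	}
	// A Set preserves insertion order, so ids appear in first-seen order
	return [...ids]
}

// Stand-in data; in practice these would be the return values of the
// two textIndex.query(...) calls
const wildcardMatches = [{ref: "1"}, {ref: "2"}]
const exactMatches = [{ref: "2"}, {ref: "3"}]
console.log(mergeRefs(wildcardMatches, exactMatches)) // [ 1, 2, 3 ]
```

This only tidies up the workaround; two queries are still needed.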

The problem is somewhat exacerbated by the lack of sub-query support (mentioned in the last comment in #264). If we had sub-queries (OR/AND), then the existing wildcard behaviour would be passable, as we could just do something like:

query.or(
	query.term(component, {presence: lunr.Query.presence.REQUIRED}),
	query.term(component, {wildcard: lunr.Query.wildcard.TRAILING, presence: lunr.Query.presence.REQUIRED})
)

As it stands, you instead need to perform an entirely separate query, with just the last search term altered, in order to correctly match user input as it's typed.

@olivernn
Owner

Wildcard matching an empty string seems entirely reasonable. The current behaviour is more likely an omission than a deliberate choice.

I need to investigate another bug in the code that handles wildcards, so I'll take a look at fixing this as well.

@olivernn
Owner

Hmm, this might not be a problem with lunr.TokenSet since there is a test specifically for matching zero or more characters.

@Majestic7979

Oh gosh this has been an issue since 2019 gasp
Bitwarden uses this library, and it's not possible to search for empty usernames in the Bitwarden database. Is there no timeline for a fix? Is there no alternative way to search for an empty string in a field? Thanks.
