Leading wildcards on certain search terms cause 100% CPU, freezing browser #368
Comments
It's entirely possible that the wildcard handling code in Lunr is sub-optimal. I think that is more likely the cause than there being special characters in that string. What are you trying to search for in the URL? You could try splitting it into more meaningful parts, e.g. domain, subdomain, path, query, etc.
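As a rough sketch of the "split it into parts" suggestion, assuming the field usually holds a parseable URL: the helper name `splitUrl`, the field names, and the sample URL below are all hypothetical and not from this thread. It relies only on the standard `URL` constructor and falls back to the raw string when parsing fails, since user input may not be a URL at all.

```javascript
// Hypothetical helper (not part of Lunr): split a raw URL string into
// smaller fields that could each be indexed separately.
function splitUrl(raw) {
  try {
    const u = new URL(raw); // throws a TypeError on unparseable input
    return {
      domain: u.hostname,                  // e.g. "www.amazon.com"
      path: u.pathname,                    // e.g. "/some/path"
      query: u.search.replace(/^\?/, ''),  // query string without the leading "?"
      raw: null,
    };
  } catch (err) {
    // Free-form user input: keep the original text so it can still be indexed.
    return { domain: '', path: '', query: '', raw: raw };
  }
}

// Made-up sample URL for illustration only.
const parts = splitUrl('https://www.amazon.com/some/path?ref=abc');
console.log(parts);
```

Each returned property could then be added to the document as its own field before indexing, so a wildcard term is matched against shorter tokens. Whether that avoids the slowdown in this case is untested.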
@olivernn The example jsfiddle I posted above shows what I am searching for: "am". Another thing I noticed is that if I just search for "a", the problem does not occur. Unfortunately, this field comes from user input and could have anything in it. It is not well structured. This URL was just an example I found during testing.
@olivernn Trying to debug this more today since we have more users reporting the problem now. The problem is definitely showing in This was searching for
Some more findings with this particular use case: The full value of the URL that causes the problem here is:
This takes ~30000 ms to complete the If I reduce the value of the URL to the following, the same
I have only removed the following from the tail of the URL:
So I found another term that isn't a crazy long URL that will also cause this same problem. It appears that the issue always occurs with a leading wildcard. Trailing wildcards do not cause a problem. Indexed value: Search query: Takes 47 seconds to complete on my powerful desktop machine.
Sorry for the delay in responding, and thanks for taking the time to dig a bit more into this issue. This seems similar to #270. I think this is a case of 'catastrophic backtracking'. Lunr isn't using a regex but the process of matching on wildcards is similar. I think the issue will be in
Indeed, it does seem to have something to do with repeating values. Here is an even simpler example:
Let me know if there is anything more I can do to help.
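For readers unfamiliar with the 'catastrophic backtracking' idea mentioned above, here is a toy sketch of the general effect. It is not Lunr's code and does not reproduce this bug: the function `countSteps` is hypothetical, and the pattern with several wildcards is an assumption chosen only to make the blow-up visible in a few lines. It counts how many recursive calls a naive wildcard matcher makes on a string of repeated letters, with and without remembering states it has already tried.

```javascript
// Toy illustration only: NOT Lunr's implementation.
// Matches `pattern` (where '*' means "any run of characters") against `text`
// and counts the recursive calls needed to reach an answer.
function countSteps(pattern, text, memoize) {
  let steps = 0;
  const seen = new Set(); // (pattern index, text index) pairs already explored

  function match(p, t) {
    steps++;
    if (memoize) {
      const key = p + ',' + t;
      // A revisited state must have failed before: a success would have
      // short-circuited all the way back up and ended the search.
      if (seen.has(key)) return false;
      seen.add(key);
    }
    if (p === pattern.length) return t === text.length;
    if (pattern[p] === '*') {
      // Either the wildcard matches nothing more, or it consumes one character.
      return match(p + 1, t) || (t < text.length && match(p, t + 1));
    }
    return t < text.length && pattern[p] === text[t] && match(p + 1, t + 1);
  }

  const matched = match(0, 0);
  return { matched: matched, steps: steps };
}

// Repeated letters give the matcher many equivalent ways to fail.
const text = 'a'.repeat(20);
const naive = countSteps('*a*a*a*b', text, false);
const memoized = countSteps('*a*a*a*b', text, true);
console.log('naive steps:', naive.steps, 'memoized steps:', memoized.steps);
```

The naive version re-explores the same (pattern position, text position) pairs through many different paths, while the memoized version visits each pair at most once. Growing the run of repeated letters widens the gap, which is at least consistent with the "repeating values" observation in this thread.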
So, I did a bit of digging and I found this. I cannot remember why it was there in the first place, plus the TODO comment seems to imply that at some point in the past I was also unsure about it. All the current tests pass without it, plus, the particular case you mention in this issue is no longer a problem. The performance for all wildcard matching also improves (slightly) without it. It seems like removing it should be fine, but I'm a bit cautious. I need to convince myself that it isn't serving any purpose. I'm going to write some more tests and try and trace through the implementation to try and remember what I was thinking 3 years ago when adding that code!
Just another observation: It seems that the more repeat letters that exist, the worse the problem becomes. For example with the previous test,
I've just published 2.3.3 which includes a fix for this issue. Let me know if it works for you, thanks.
Seems to be working well now.
See the following example: https://jsfiddle.net/8s0dzp2c/22/
Something about the Amazon URL being stored in this url field causes the browser to freeze and become unresponsive when searching by a wildcard query. If I trim the URL to something like https://www.amazon.com, there is no issue. See here: https://jsfiddle.net/8s0dzp2c/24/
Also, if I remove the wildcard altogether, it doesn't freeze anymore.
Is there some character in the URL that is causing things to bug out or something?