CompiledAutomaton not equal when NFARunAutomaton is non-null #13715
Comments
The NFARunAutomaton internally carries a cache so that it remembers all the states that have been determinized, so it does not make sense to implement a completely strict Or, maybe more strictly, if the |
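To make the point about the cache concrete, here is a minimal, self-contained sketch. `LazyRunner` is a hypothetical stand-in invented for illustration, not Lucene's NFARunAutomaton, and its `step` logic is a placeholder. It only shows the general idea of an `equals` that compares the immutable definition and deliberately ignores a cache that fills up as the object is used.

```java
import java.util.HashMap;
import java.util.Map;

final class CachedRunnerEquality {

    /** Hypothetical lazily caching runner; NOT Lucene's NFARunAutomaton. */
    static final class LazyRunner {
        // Immutable: what the runner was built from.
        private final String definition;
        // Mutable: grows as a side effect of running, so it differs between instances.
        private final Map<Integer, Integer> cache = new HashMap<>();

        LazyRunner(String definition) {
            this.definition = definition;
        }

        /** Placeholder for real work; its only purpose here is to populate the cache. */
        int step(int state) {
            return cache.computeIfAbsent(state, s -> s + definition.length());
        }

        int cachedStates() {
            return cache.size();
        }

        @Override
        public boolean equals(Object o) {
            if (this == o) return true;
            if (!(o instanceof LazyRunner)) return false;
            // The cache is deliberately left out of the comparison.
            return definition.equals(((LazyRunner) o).definition);
        }

        @Override
        public int hashCode() {
            return definition.hashCode();
        }
    }

    public static void main(String[] args) {
        LazyRunner used = new LazyRunner("ab*c");
        LazyRunner fresh = new LazyRunner("ab*c");
        used.step(0);
        used.step(1);
        // The caches now differ in size, yet the runners still compare equal.
        System.out.println(used.cachedStates() + " vs " + fresh.cachedStates());
        System.out.println(used.equals(fresh));
    }
}
```

If the cache were part of `equals`, two runners built from the same definition would stop being equal as soon as one of them was used, and the hash code would change while the object sat in a hash-based collection.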
I'm not sure if we can guarantee true equality check for arbitrary automata. Perhaps a "shallow" check meaning equivalence of states and transitions but not the true equality in terms of accepting the same language. |
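As a rough illustration of what such a "shallow" check means, the sketch below uses a hypothetical `TinyAutomaton` class (invented here, not Lucene's Automaton) whose `equals` compares states and transitions only. Under that assumption, two automata that accept the same language but are wired differently compare unequal.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

final class ShallowAutomatonEquality {

    /** Hypothetical minimal automaton; NOT Lucene's Automaton class. */
    static final class TinyAutomaton {
        // state -> (label -> destination states)
        private final Map<Integer, Map<Character, Set<Integer>>> transitions = new TreeMap<>();
        private final Set<Integer> acceptStates = new TreeSet<>();

        TinyAutomaton addTransition(int from, char label, int to) {
            transitions.computeIfAbsent(from, k -> new TreeMap<>())
                       .computeIfAbsent(label, k -> new TreeSet<>())
                       .add(to);
            return this;
        }

        TinyAutomaton accept(int state) {
            acceptStates.add(state);
            return this;
        }

        /** Shallow check: same states and transitions, not same accepted language. */
        @Override
        public boolean equals(Object o) {
            if (this == o) return true;
            if (!(o instanceof TinyAutomaton)) return false;
            TinyAutomaton other = (TinyAutomaton) o;
            return transitions.equals(other.transitions)
                && acceptStates.equals(other.acceptStates);
        }

        @Override
        public int hashCode() {
            return 31 * transitions.hashCode() + acceptStates.hashCode();
        }
    }

    public static void main(String[] args) {
        // All three automata accept exactly the string "a".
        TinyAutomaton x = new TinyAutomaton().addTransition(0, 'a', 1).accept(1);
        TinyAutomaton y = new TinyAutomaton().addTransition(0, 'a', 1).accept(1);
        TinyAutomaton z = new TinyAutomaton()
            .addTransition(0, 'a', 1).addTransition(0, 'a', 2).accept(1).accept(2);

        System.out.println(x.equals(y)); // same structure
        System.out.println(x.equals(z)); // same language, different structure
    }
}
```

The shallow check is cheap and never reports two different languages as equal, but it can report equivalent automata as different, including automata that differ only in how their states are numbered.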
Before considering possible fixes, do we agree that there is a problem worth fixing? I’m particularly thinking of the equality of IntervalQuery. |
I think it's a nice-to-have. Although for IntervalQuery specifically, I feel we could simply treat two queries as equal if the pattern is the same, without checking the automaton at all. |
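A small sketch of that pattern-only idea, using only the JDK: `RegexSource` is a hypothetical class invented for illustration (not Lucene's IntervalsSource), and `java.util.regex.Pattern` plays the role of the compiled form because it also inherits identity-based `equals` from `Object`.

```java
import java.util.Objects;
import java.util.regex.Pattern;

final class PatternKeyedEquality {

    /** Hypothetical regexp-backed source; NOT a Lucene class. */
    static final class RegexSource {
        private final String pattern;   // the source text: cheap and stable to compare
        private final Pattern compiled; // derived form: compares by identity only

        RegexSource(String pattern) {
            this.pattern = Objects.requireNonNull(pattern);
            this.compiled = Pattern.compile(pattern);
        }

        boolean matches(String input) {
            return compiled.matcher(input).matches();
        }

        @Override
        public boolean equals(Object o) {
            if (this == o) return true;
            if (!(o instanceof RegexSource)) return false;
            // Only the pattern text is compared; the compiled form is ignored.
            return pattern.equals(((RegexSource) o).pattern);
        }

        @Override
        public int hashCode() {
            return pattern.hashCode();
        }
    }

    public static void main(String[] args) {
        // Two separately compiled patterns are never equal, even for the same text.
        System.out.println(Pattern.compile("fo+").equals(Pattern.compile("fo+")));
        // The wrappers compare equal because only the pattern text is consulted.
        System.out.println(new RegexSource("fo+").equals(new RegexSource("fo+")));
    }
}
```

The trade-off is that this treats textually different but equivalent patterns as unequal, and it assumes that any other settings affecting compilation are compared as well.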
@ChrisHegarty I think it might be an oversight that IntervalQuery is getting an NFA; was that really intended? The following other methods on intervals are powered by automatons and get a DFA:
This happens because they reuse the parsing from the associated Query classes, but this Intervals.regexp() neglects to do that and just creates its own RegExp and converts it to an NFA. Given that the default of RegexpQuery is to determinize, I think we should determinize here to avoid surprises such as this? |
There are other related problems to fix here separately, just start with CompiledAutomaton:
|
Did we uncover a hornet's nest!? |
That's usually how it works, it is good stuff. To be practical, for now, I'd recommend just fixing Intervals.regex to I think it is enough for lucene 10 that the queries no longer |
NFARunAutomaton does not override equals, so it defaults to object identity, which means that classes like CompiledAutomaton that create internal instances of it may not appear equal when they in fact are. This issue has been filed to investigate the possibility of adding an NFARunAutomaton::equals override that does something sensible with its internal state.
I ran into this issue when comparing things like this with Lucene 10: