From what I can see, the tokenizer already runs in linear time: roughly 4\*O(n). This change brings that down to roughly 3\*O(n). It seems that dropping the capture group and using something other than `split` would be a bigger win; other than that, the changes were meager.

I used https://regex101.com/ (and pcre2) to evaluate the cost of the TOKENIZER, and verified with cruby 3.0.6:

```
/(%%\{[^\}]+\}|%\{[^\}]+\})/ =~ ('%{{'*9999)+'}'

/(%%\{[^\}]+\}|%\{[^\}]+\})/ ==> 129,990 steps
/(%?%\{[^\}]+\})/            ==> 129,990 steps
/(%%?\{[^\}]+\})/            ==>  99,992 steps (simple savings of 25%) <===
/(%%?\{[^%}{]+\})/           ==>  89,993 steps (limiting variable contents has minimal gains)
```

Also of note are the null/simple cases:

```
/x/ =~ ('%{{'*9999)+'}'

/x/       ==> 29,998 steps
/(x)/     ==> 59,996 steps
/%{x/     ==> 49,998 steps
/(%%?{x)/ ==> 89,993 steps
```

And a plain string doesn't fare much worse than the specially crafted string:

```
/x/ =~ ('abb'*9999)+'c'

/x/                          ==> 29,999
/(%%?{x)/                    ==> 59,998
/(%%?\{[^\}]+\})/            ==> 59,998
/(%%\{[^\}]+\}|%\{[^\}]+\})/ ==> 89,997
```

per #667
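A minimal sketch of how the two patterns could be compared in plain Ruby (the `Benchmark` harness, iteration count, and sanity check are my own additions, not from the commit). It first confirms the old alternation and the new optional-`%` pattern split the crafted input identically, then times each:

```ruby
require 'benchmark'

# Worst-case input from the commit message: many "%{{" prefixes, one closing brace.
input = ('%{{' * 9999) + '}'

OLD_RE = /(%%\{[^\}]+\}|%\{[^\}]+\})/ # alternation: re-scans the "%" for each branch
NEW_RE = /(%%?\{[^\}]+\})/            # single branch with an optional second "%"

# The rewrite must be behavior-preserving: both regexes should tokenize identically.
raise 'tokenization mismatch' unless input.split(OLD_RE) == input.split(NEW_RE)

Benchmark.bm(8) do |bm|
  bm.report('old:') { 100.times { input.split(OLD_RE) } }
  bm.report('new:') { 100.times { input.split(NEW_RE) } }
end
```

The equivalence check matters more than the timing: since `%%?` greedily prefers two percent signs, it matches `%%{...}` and `%{...}` exactly as the alternation did, just without backtracking to a second branch.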