Update substring search to use the Two Way algorithm #26327

Merged 4 commits on Jun 30, 2015

Commits on Jun 21, 2015

  1. StrSearcher: Update substring search to use the Two Way algorithm

    To improve our substring search performance, revive the two way searcher
    and adapt it to the Pattern API.
    
    Fixes rust-lang#25483, a performance bug: that particular case now completes
    faster in optimized Rust than in Ruby (though both are within the same order
    of magnitude).
    
    Many thanks to @gereeter, who helped me understand the reverse case
    better and wrote the comment explaining `next_back` in the code.
    
    I used quickcheck to fuzz test forward and reverse searching thoroughly.
    
    The two way searcher implements both forward and reverse search,
    but not double ended search. The forward and reverse parts of the two
    way searcher are completely independent.
    
    The two way searcher algorithm has very small, constant space overhead,
    requiring no dynamic allocation. Our implementation is relatively fast,
    especially due to the `byteset` addition to the algorithm, which speeds
    up many no-match cases.
    
    A bad case for the two way algorithm is:
    
    ```
    let haystack = (0..10_000).map(|_| "dac").collect::<String>();
    let needle = (0..100).map(|_| "bac").collect::<String>();
    ```
    
    For this particular case, two way is not much faster than the naive
    implementation it replaces.
    Ulrik Sverdrup committed Jun 21, 2015
    b890b7b
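
To illustrate the `byteset` idea mentioned in the commit message above, here is a minimal sketch. It is not the two way implementation from this PR: it bolts a 64-bit approximate byte set onto a plain naive search. The names `byteset_create`, `byteset_contains` and `find_with_byteset` are hypothetical, and testing the last byte of each window is an assumption made for illustration.

```rust
/// Build a 64-bit approximate set of the bytes in `needle`.
/// Each byte is mapped to one of 64 bits by its low six bits, so the set
/// can report false positives but never false negatives.
fn byteset_create(needle: &[u8]) -> u64 {
    needle.iter().fold(0u64, |set, &b| set | (1u64 << (b & 0x3f)))
}

/// Returns false only if `byte` definitely does not occur in the needle.
fn byteset_contains(set: u64, byte: u8) -> bool {
    (set >> (byte & 0x3f)) & 1 != 0
}

/// Naive forward search with a byteset skip (a stand-in, not two way).
fn find_with_byteset(haystack: &[u8], needle: &[u8]) -> Option<usize> {
    if needle.is_empty() {
        return Some(0);
    }
    let n = needle.len();
    let set = byteset_create(needle);
    let mut pos = 0;
    while pos + n <= haystack.len() {
        // If the last byte of the window is not in the needle at all, no
        // window that covers this byte can match, so skip a full needle length.
        if !byteset_contains(set, haystack[pos + n - 1]) {
            pos += n;
            continue;
        }
        if &haystack[pos..pos + n] == needle {
            return Some(pos);
        }
        pos += 1;
    }
    None
}
```

The set costs a single `u64`, so it keeps the constant space property: no allocation is needed. Because only the low six bits of each byte are used, distinct bytes can collide, which makes the check a filter rather than an exact membership test.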
  2. StrSearcher: Specialize is_prefix_of/is_suffix_of for &str

    Ulrik Sverdrup committed Jun 21, 2015
    a6dd203
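
The commit message above gives no detail on the specialization, so the following is only a guess at its general shape: for a `&str` needle, prefix and suffix checks can be answered with a direct byte comparison instead of constructing a searcher. The free functions `is_prefix_of` and `is_suffix_of` below are hypothetical stand-ins, not the methods changed in this PR.

```rust
/// Hypothetical stand-in: is `needle` a prefix of `haystack`?
/// Comparing bytes is sufficient because both inputs are valid UTF-8,
/// so a byte-wise prefix match always ends on a character boundary.
fn is_prefix_of(needle: &str, haystack: &str) -> bool {
    haystack.as_bytes().starts_with(needle.as_bytes())
}

/// Hypothetical stand-in: is `needle` a suffix of `haystack`?
fn is_suffix_of(needle: &str, haystack: &str) -> bool {
    haystack.as_bytes().ends_with(needle.as_bytes())
}
```

Both functions run in time proportional to the needle length and need no per-search setup.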
  3. StrSearcher: Use trait to specialize two way algorithm by case

    Use a trait to be able to implement both the fast search that skips to
    each match, and the slower search that emits `Reject` intervals
    regularly. The latter is important for uses of `next_reject`.
    Ulrik Sverdrup committed Jun 21, 2015
    71006bd
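
The trait-based split described in the commit message above can be sketched roughly as follows. This is a simplified illustration under stated assumptions: the inner loop is a naive scan rather than the two way algorithm, the needle is assumed non-empty, and all names (`Strategy`, `MatchOnly`, `RejectAndMatch`, `NaiveSearcher`, `SearchStep`) are chosen for this example rather than taken from the PR's diff.

```rust
#[derive(Debug, PartialEq, Clone, Copy)]
enum SearchStep {
    Match(usize, usize),
    Reject(usize, usize),
    Done,
}

/// Chooses what the shared search loop reports and how eagerly.
trait Strategy {
    type Output;
    fn use_early_reject() -> bool;
    fn rejecting(a: usize, b: usize) -> Self::Output;
    fn matching(a: usize, b: usize) -> Self::Output;
}

/// Fast case: skip straight to each match.
struct MatchOnly;
impl Strategy for MatchOnly {
    type Output = Option<(usize, usize)>;
    fn use_early_reject() -> bool { false }
    fn rejecting(_a: usize, _b: usize) -> Self::Output { None }
    fn matching(a: usize, b: usize) -> Self::Output { Some((a, b)) }
}

/// Slower case: report `Reject` intervals as soon as the position advances.
struct RejectAndMatch;
impl Strategy for RejectAndMatch {
    type Output = SearchStep;
    fn use_early_reject() -> bool { true }
    fn rejecting(a: usize, b: usize) -> Self::Output { SearchStep::Reject(a, b) }
    fn matching(a: usize, b: usize) -> Self::Output { SearchStep::Match(a, b) }
}

struct NaiveSearcher<'a> {
    haystack: &'a [u8],
    needle: &'a [u8], // assumed non-empty in this sketch
    position: usize,
}

impl<'a> NaiveSearcher<'a> {
    fn new(haystack: &'a [u8], needle: &'a [u8]) -> Self {
        NaiveSearcher { haystack, needle, position: 0 }
    }

    /// One loop body, compiled once per strategy.
    fn next<S: Strategy>(&mut self) -> S::Output {
        let n = self.needle.len();
        let old_pos = self.position;
        loop {
            if S::use_early_reject() && old_pos != self.position {
                return S::rejecting(old_pos, self.position);
            }
            // Too little haystack left for a match: reject the tail.
            if self.position + n > self.haystack.len() {
                self.position = self.haystack.len();
                return S::rejecting(old_pos, self.position);
            }
            if &self.haystack[self.position..self.position + n] == self.needle {
                let start = self.position;
                self.position += n;
                return S::matching(start, start + n);
            }
            self.position += 1;
        }
    }

    fn next_match(&mut self) -> Option<(usize, usize)> {
        self.next::<MatchOnly>()
    }

    fn next_step(&mut self) -> SearchStep {
        if self.position == self.haystack.len() {
            return SearchStep::Done;
        }
        self.next::<RejectAndMatch>()
    }
}
```

Because the strategy is a type parameter, the `use_early_reject()` check is resolved at compile time in each instantiation, so the match-only path carries no cost for reject reporting.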

Commits on Jun 24, 2015

  1. StrSearcher: Explicitly separate the long and short cases

    This is needed to avoid a performance regression after the trait-based changes.
    Force separate versions of the next method to be generated for the short
    and long period cases.
    Ulrik Sverdrup committed Jun 24, 2015
    274bb24
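
One way to encourage separate code for the two cases, shown here as a toy sketch, is to branch once on the runtime flag and call the shared routine with a literal `true` or `false`, so that each call site can be inlined and specialized. Whether this matches the exact mechanism used in the commit is an assumption; `ToySearcher`, `advance` and the `memory` bookkeeping are placeholders, not the PR's code.

```rust
/// Toy state: `memory` stands in for bookkeeping that only one case needs.
struct ToySearcher {
    position: usize,
    memory: usize,
    long_period: bool,
}

impl ToySearcher {
    /// Shared body. With `long_period` passed as a literal and the call
    /// inlined, the optimizer can drop the branch in each copy.
    #[inline(always)]
    fn advance(&mut self, shift: usize, long_period: bool) -> usize {
        self.position += shift;
        if !long_period {
            // In this toy, only the short period case maintains `memory`.
            self.memory = shift;
        }
        self.position
    }

    /// Dispatch once on the runtime flag, spelling out both constant cases.
    fn next(&mut self, shift: usize) -> usize {
        if self.long_period {
            self.advance(shift, true)
        } else {
            self.advance(shift, false)
        }
    }
}
```

The two branches in `next` look redundant in source form; the point is that each call passes a compile-time constant, which gives the compiler the opportunity to generate two tight versions of the body instead of one version that re-checks the flag in its inner loop.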