A header-only C++ implementation of the Aho-Corasick algorithm. It aims to be reasonably fast and thread-safe.
Construct a match trie in one of the following ways. New patterns cannot be added to a trie once it has been built.
corsicana::trie_builder my_trie_builder;
auto my_trie = my_trie_builder.insert("pattern one")
                              .insert("pattern two")
                              .build();
corsicana::trie_builder my_trie_builder;
auto my_trie = my_trie_builder.insert(container.begin(), container.end()).build();
corsicana::trie_builder my_trie_builder(container.begin(), container.end());
auto my_trie = my_trie_builder.build();
corsicana::trie_builder my_trie_builder = { "one", "two", "three" };
auto my_trie = my_trie_builder.build();
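To illustrate why a built trie is frozen, here is a minimal, self-contained sketch of the builder/trie split in an Aho-Corasick matcher. It is not corsicana's implementation: the `sketch` namespace and the names `node`, `match`, `builder`, `frozen_trie`, and `find_all` are all hypothetical, and it assumes byte-wise matching over `std::string`. The point it shows is that `build()` computes failure links across the whole pattern set, so adding a pattern afterwards would invalidate them.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <queue>
#include <string>
#include <utility>
#include <vector>

// Hypothetical names throughout; this is not corsicana's implementation.
namespace sketch {

struct node {
    std::array<int, 256> next;     // transition per byte, -1 while unset
    int fail = 0;                  // longest proper suffix that is also a trie path
    std::vector<std::size_t> out;  // ids of patterns recognised in this state
    node() { next.fill(-1); }
};

struct match {
    std::size_t pattern;  // index of the pattern in insertion order
    std::size_t end;      // offset one past the last matched byte
};

class frozen_trie {
public:
    explicit frozen_trie(std::vector<node> nodes) : nodes_(std::move(nodes)) {}

    // Const and free of shared mutable state, so concurrent searches are safe.
    std::vector<match> find_all(const std::string& text) const {
        std::vector<match> found;
        int state = 0;
        for (std::size_t i = 0; i < text.size(); ++i) {
            state = nodes_[state].next[static_cast<unsigned char>(text[i])];
            assert(state >= 0);  // build() filled in every transition
            for (std::size_t id : nodes_[state].out) found.push_back({id, i + 1});
        }
        return found;
    }

private:
    std::vector<node> nodes_;
};

class builder {
public:
    builder() : nodes_(1) {}  // node 0 is the root

    builder& insert(const std::string& pattern) {
        int cur = 0;
        for (unsigned char c : pattern) {
            if (nodes_[cur].next[c] < 0) {
                nodes_[cur].next[c] = static_cast<int>(nodes_.size());
                nodes_.emplace_back();
            }
            cur = nodes_[cur].next[c];
        }
        nodes_[cur].out.push_back(count_++);
        return *this;
    }

    // Breadth-first pass: set failure links and fill in missing transitions.
    frozen_trie build() const {
        std::vector<node> nodes = nodes_;  // copy, so the builder stays usable
        std::queue<int> pending;
        for (int c = 0; c < 256; ++c) {
            int& child = nodes[0].next[c];
            if (child < 0) child = 0;  // unmatched bytes stay at the root
            else pending.push(child);  // depth-1 nodes fail to the root
        }
        while (!pending.empty()) {
            int u = pending.front();
            pending.pop();
            // Patterns ending at the failure target also end here.
            const std::vector<std::size_t>& inherited = nodes[nodes[u].fail].out;
            nodes[u].out.insert(nodes[u].out.end(), inherited.begin(), inherited.end());
            for (int c = 0; c < 256; ++c) {
                int v = nodes[u].next[c];
                int via_fail = nodes[nodes[u].fail].next[c];
                if (v < 0) nodes[u].next[c] = via_fail;
                else { nodes[v].fail = via_fail; pending.push(v); }
            }
        }
        return frozen_trie(std::move(nodes));
    }

private:
    std::vector<node> nodes_;
    std::size_t count_ = 0;
};

}  // namespace sketch
```

In this sketch, `sketch::builder().insert("he").insert("she").insert("hers").build()` yields a trie whose `find_all("ushers")` reports all three patterns in a single pass over the input. Because `frozen_trie` exposes only const operations, one instance can be shared across threads without locking.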
There are several ways to search a frozen trie:
auto match = my_trie.match("Input Text");
// get all matches at once
std::vector<corsicana::result> all = match.all();
auto match = my_trie.match("Input Text");
// get the count of matches
int total = match.count();
auto match = my_trie.match("Input Text");
// return true if we can find any matches
bool any_there = match.any();
auto match = my_trie.match("Input Text");
// or iterate over them one at a time
for (auto const& m : match) {
    // iteration will search one at a time and can be stopped at any time
}
Tests are written using Catch2 and can be executed by running ctest after building.
Testing can be disabled by setting BUILD_TESTING to OFF in CMake.
Benchmarks and a harness for writing additional benchmarks are included in the benchmark directory.
Benchmarks can be disabled by setting BUILD_BENCHMARK to OFF in CMake.