Use PCRE2 instead of PCRE #153

Merged
merged 3 commits into from
Oct 23, 2023

Collaborator

@bjosv bjosv commented Sep 5, 2023

PCRE is now at end of life and is no longer actively maintained.
This PR raises the required version to PCRE2, which is still maintained, so that security vulnerability fixes remain available.

There are some differences in behaviour, which are described in this list, but in general PCRE2 seems to have a more explicit pattern interpreter, so invalid patterns are checked more aggressively.
There is no need to change the existing test cases, which is a good sign.

This PR also removes the PCRE study option since:
The new API ... was simplified by abolishing the separate "study" optimizing function; in PCRE2, patterns are automatically optimized where possible. (link)

Fixes #112

PCRE is now at end of life, and is no longer being actively maintained.
Lift the required version to the major version PCRE2.

Removed the pcre study option since:
"The new API ... was simplified by abolishing the separate "study" optimizing
function; in PCRE2, patterns are automatically optimized where possible."

If asprintf() fails, the contents of the 'strp' variable are undefined.
Let's check the return value and return NULL upon error.
https://man7.org/linux/man-pages/man3/asprintf.3.html
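
The check described in this commit message can be sketched as follows. This is a minimal illustration, not the code from the PR: `join_path` is a hypothetical helper, and the only library call assumed is glibc's `asprintf()`, which returns -1 on failure.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE /* asprintf() is a GNU/BSD extension, not ISO C */
#endif
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper (not from r3): builds "<prefix>/<name>" on the heap.
 * Returns NULL on failure; the caller frees the result with free(). */
char *join_path(const char *prefix, const char *name)
{
    assert(prefix != NULL && name != NULL);

    char *out = NULL;
    if (asprintf(&out, "%s/%s", prefix, name) < 0) {
        /* On failure the value of 'out' is undefined, so it must not be
         * read, returned, or freed. */
        return NULL;
    }
    return out;
}
```

Returning NULL gives callers a single, well-defined failure value instead of a pointer whose contents cannot be trusted.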

Pattern and subject can straightforwardly be cast to PCRE2_SPTR since we only work
with 8-bit code units.
Collaborator Author

bjosv commented Oct 18, 2023

For reference, here are some benchmark results for r3 with and without this PR, using different versions of PCRE2.
The latest PCRE2 version is required to get slightly better performance.

Results from benchmarking of "pcre_dispatch"

| PCRE version | Min | Median | Max |
|---|---|---|---|
| pcre 8.39 | 1.418703 seconds (10573038.67 i/sec) | 1.511755 seconds (9922242.76 i/sec) | 1.572873 seconds (9536687.89 i/sec) |
| pcre2 10.42 | 1.399972 seconds (10714500.30 i/sec) | 1.488088 seconds (10080049.76 i/sec) | 1.551644 seconds (9667165.38 i/sec) |
| pcre2 10.41 | 1.442291 seconds (10400120.21 i/sec) | 1.500147 seconds (9999020.99 i/sec) | 1.608765 seconds (9323923.04 i/sec) |
| pcre2 10.39 | 1.467327 seconds (10222671.04 i/sec) | 1.535990 seconds (9765688.58 i/sec) | 1.667526 seconds (8995361.96 i/sec) |
| pcre2 10.37 | 1.504090 seconds (9972807.01 i/sec) | 1.540420 seconds (9737603.68 i/sec) | 1.691264 seconds (8869106.63 i/sec) |

Results taken from:
./run-benchmark (10 iterations of running 'bench', 3 runs of pcre_dispatch, 5000000 iterations each run)

gcc (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0
Intel(R) Core(TM) i9-8950HK CPU @ 2.90GHz
JIT enabled on PCRE/PCRE2
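
The i/sec figures in the table appear to follow from the run parameters: 3 runs of 5000000 iterations gives 15000000 iterations per invocation, and 15000000 / 1.418703 s is about 10573038 i/sec. This relationship is inferred from the numbers, not stated in the benchmark script. A small sketch of that arithmetic, plus a min/median/max summary, is below; the function names are hypothetical, and averaging the two middle samples for an even count is an assumption about how the median was taken.

```c
#include <assert.h>
#include <stdlib.h>

/* qsort comparator for doubles, ascending. */
static int cmp_double(const void *a, const void *b)
{
    double x = *(const double *)a;
    double y = *(const double *)b;
    return (x > y) - (x < y);
}

struct bench_summary {
    double min;
    double median;
    double max;
};

/* Sorts 'secs' in place and returns the min, median and max timings.
 * Assumption: for an even count the median is the mean of the two
 * middle samples. */
struct bench_summary summarize(double *secs, size_t n)
{
    assert(n > 0);
    qsort(secs, n, sizeof *secs, cmp_double);

    struct bench_summary s;
    s.min = secs[0];
    s.max = secs[n - 1];
    s.median = (n % 2) ? secs[n / 2]
                       : (secs[n / 2 - 1] + secs[n / 2]) / 2.0;
    return s;
}

/* Iterations per second for one benchmark invocation. */
double iterations_per_second(double seconds, long runs, long iters_per_run)
{
    assert(seconds > 0.0);
    return (double)runs * (double)iters_per_run / seconds;
}
```

Because the timings are printed with six decimals, recomputed throughput values agree with the table only to within a few iterations per second.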

Owner

c9s commented Oct 19, 2023

wow! this is just too awesome!

@bjosv bjosv merged commit c105117 into c9s:2.0 Oct 23, 2023
10 of 11 checks passed
@bjosv bjosv deleted the pcre2 branch October 23, 2023 10:38