SEGV hs_scan with HS_FLAG_SOM_LEFTMOST #350

noboruma · 2022-03-09T01:34:33Z

Hello there,

I was experimenting with scratch buffer allocation and noticed that one specific pattern would cause a crash. The code I used is as follow, I have 2 similar patterns and only allocate the scratch buffer for pattern1 but not pattern2:

#include <hs/hs_compile.h>
#include <stdio.h>
#include <string.h>
#include <hs/hs.h>

static int eventHandler(unsigned int id, unsigned long long from,
                        unsigned long long to, unsigned int flags, void *ctx) {
    printf("Match for pattern \"%s\" at offset %llu\n", (char *)ctx, to);
    return 0;
}

int main(int argc, char *argv[]) {
    const char *pattern1 = "^(ftps?|https?).*?\\?+$";
    const char *pattern2 = "^(ftps?|https?)://.*$";

    hs_compile_error_t *compile_err;
    hs_database_t *database1;
    hs_compile(pattern1, HS_FLAG_SOM_LEFTMOST, HS_MODE_BLOCK, NULL, &database1,
                   &compile_err);

    hs_database_t *database2;
    hs_compile(pattern2, HS_FLAG_SOM_LEFTMOST, HS_MODE_BLOCK, NULL, &database2,
                   &compile_err);

    const char *inputData = "https://foo.bar/";
    hs_scratch_t *scratch = NULL;
    hs_alloc_scratch(database1, &scratch);
    hs_scan(database1, inputData, strlen(inputData), 0, scratch, eventHandler,
                (void*)pattern1);
    hs_scan(database2, inputData, strlen(inputData), 0, scratch, eventHandler,
                (void*)pattern2);

    return 0;
}

Expected results: the second hs_scan should return an error, since the scratch buffer was not allocated properly for that pattern
Actual results: the second hs_scan is SEGV, here is the stack trace:

#0  0x00007ffff7c3b5cd in memset (__len=32, __ch=0, __dest=0x0)
    at /usr/include/x86_64-linux-gnu/bits/string_fortified.h:71
#1  fatbit_clear (bits=0x0) at ./src/util/fatbit.h:60
#2  recordAnchoredLiteralMatch (end=8, anch_id=0, scratch=0x7fffcc00f240, t=<optimized out>)
    at ./src/rose/program_runtime.c:101
#3  roseRunProgram (t=<optimized out>, scratch=<optimized out>, programOffset=<optimized out>, som=0,
    end=<optimized out>, prog_flags=<optimized out>) at ./src/rose/program_runtime.c:2278
#4  0x00007ffff7c22fd0 in roseAnchoredCallback (start=start@entry=0, end=<optimized out>,
    id=id@entry=1888, ctx=ctx@entry=0x7fffcc00f240) at ./src/rose/match.c:215
#5  0x00007ffff7bd5a04 in doComplexReport (cached_accept_id=<synthetic pointer>,
    cached_accept_state=<synthetic pointer>, eod=0 '\000', loc=8, s=21, m=<optimized out>,
    ctxt=0x7fffcc00f240, cb=0x7ffff7c22f70 <roseAnchoredCallback>) at ./src/nfa/mcclellan.c:74
#6  mcclellanExec8_i (mode=CALLBACK_OUTPUT, c_final=0x0, single=0 '\000', ctxt=0x7fffcc00f240,
    cb=0x7ffff7c22f70 <roseAnchoredCallback>, offAdj=0, len=8, buf=0x7fffd4014580 "https://foo.bar/",
    state=<synthetic pointer>, m=<optimized out>) at ./src/nfa/mcclellan.c:502
#7  nfaExecMcClellan8_Bi (single=0 '\000', context=0x7fffcc00f240,
    length=8, buffer=0x7fffd4014580 "https://foo.bar/",
    offset=0, n=<optimized out>) at ./src/nfa/mcclellan.c:922
#8  nfaExecMcClellan8_B (n=<optimized out>, offset=0, buffer=0x7fffd4014580 "https://foo.bar/", length=8,
    cb=0x7ffff7c22f70 <roseAnchoredCallback>, context=0x7fffcc00f240) at ./src/nfa/mcclellan.c:945
#9  0x00007ffff7c0dbe1 in runAnchoredTableBlock (scratch=0x7fffcc00f240, atable=<optimized out>,
    t=0x555557cfde00) at ./src/rose/block.c:63
#10 roseBlockAnchored (scratch=0x7fffcc00f240, t=0x555557cfde00) at ./src/rose/block.c:212
#11 roseBlockExec (t=<optimized out>, scratch=<optimized out>) at ./src/rose/block.c:395
#12 0x00007ffff7b44f4f in rawBlockExec (scratch=0x7fffcc00f240, rose=0x555557cfde00)
    at ./src/runtime.c:188
#13 hs_scan (db=<optimized out>, data=<optimized out>, length=15, flags=<optimized out>,
    scratch=0x7fffcc00f240, onEvent=<optimized out>, userCtx=<optimized out>) at ./src/runtime.c:419

Note: there is no crash if I remove the HS_FLAG_SOM_LEFTMOST flag.
Workaround: reallocating the scratch buffer with the second pattern solves the issue.

It sounds to me that some sanity checks are missing inside hs_scan.

The text was updated successfully, but these errors were encountered:

Fixes intel#350

hongyang7 · 2022-05-31T17:41:37Z

Thanks for reporting this. This is indeed a historical issue left. We'll consider to add suitable quick check for scratch validity later.

fix github issue #350

hongyang7 · 2022-10-27T18:18:29Z

Please refer to latest develop branch.
Commit id: 7c65cfc

fix github issue #350

jlucovsky · 2023-03-05T15:26:33Z

This document recommends the allocation of one scratch space per scanning context.

Suricata relies on this performance technique. The commit for this change has caused this Suricata issue. I investigated and discovered that the validity check comparing the scratch and database crc values was detecting a mismatch since the scratch area is per-Suricata thread context but a per-database scratch area is expected.

Here's the performance tip from the documentation link:
Tip
A scratch space can be allocated so that it can be used with any one of a number of databases. Each concurrent scan operation (such as a thread) needs its own scratch space.

The hs_alloc_scratch() function can accept an existing scratch space and “grow” it to support scanning with another pattern database. This means that instead of allocating one scratch space for every database used by an application, one can call hs_alloc_scratch() with a pointer to the same hs_scratch_t and it will be sized appropriately for use with any of the given databases.

hongyang7 · 2023-03-05T18:11:53Z

@jlucovsky
Yes you're right. We already notice this issue in Snort integration and Rspamd.
A quick roll back patch for this particular issue will be available soon.
Sorry for any inconvenience.

jlucovsky · 2023-03-05T21:29:07Z

@jlucovsky Yes you're right. We already notice this issue in Snort integration and Rspamd. A quick roll back patch for this particular issue will be available soon. Sorry for any inconvenience.

Thanks for addressing the issue so quickly!

Roll back fix for github issue #350 About Scratch Usage: For compile time, scratch space is strongly recommended to be allocated immediately after database generation. For runtime, besides using scratch for corresponding database, Hyperscan also allows user to use larger scratch space allocated for another database. When multiple concurrent threads need to use the same databases and a new scratch space is required, cloning the largest one is always safe. This is realized based on API hs_scratch_size() and hs_clone_scratch(). Behaviors beyond above are discouraged and results are undefined.

hongyang7 · 2023-03-06T08:35:25Z

@jlucovsky Please refre to latest master. fix id: 0bf86a7
Any questions, let us know.

jlucovsky · 2023-03-06T13:32:20Z

Thanks. I've verified proper operation on the Suricata LTS and current (master) releases.

Thanks!

jlucovsky · 2023-03-08T13:11:20Z

@hongyang7 Will there be a new release with the update? Many of our users only used "released" software. Thus, could the change be in 5.4.2?

hongyang7 · 2023-04-19T09:42:33Z

@jlucovsky Please check for latest update.

fix github issue intel#350

Roll back fix for github issue intel#350 About Scratch Usage: For compile time, scratch space is strongly recommended to be allocated immediately after database generation. For runtime, besides using scratch for corresponding database, Hyperscan also allows user to use larger scratch space allocated for another database. When multiple concurrent threads need to use the same databases and a new scratch space is required, cloning the largest one is always safe. This is realized based on API hs_scratch_size() and hs_clone_scratch(). Behaviors beyond above are discouraged and results are undefined.

alessandrod added a commit to deepfence/hyperscan that referenced this issue Mar 13, 2022

runtime: add missing checks to validScratch

8c408a0

Fixes intel#350

fatchanghao pushed a commit that referenced this issue Oct 27, 2022

scratch: add quick validity check

7c65cfc

fix github issue #350

fatchanghao closed this as completed Oct 28, 2022

fatchanghao pushed a commit that referenced this issue Feb 15, 2023

scratch: add quick validity check

277fc40

fix github issue #350

hongyang7 mentioned this issue Mar 5, 2023

Make a public function to verify sanity of the database #389

Open

jmtaylor90 mentioned this issue Mar 13, 2023

New Hyperscan Release to Address Validity Check Bug #392

Open

tbaggaley mentioned this issue Mar 13, 2023

hyperscan: 5.4 -> 5.4.1 NixOS/nixpkgs#221028

Closed

12 tasks

markos pushed a commit to VectorCamp/vectorscan that referenced this issue Sep 5, 2023

scratch: add quick validity check

7f2f7d2

fix github issue intel#350

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

SEGV hs_scan with HS_FLAG_SOM_LEFTMOST #350

SEGV hs_scan with HS_FLAG_SOM_LEFTMOST #350

noboruma commented Mar 9, 2022 •

edited

Loading

hongyang7 commented May 31, 2022

hongyang7 commented Oct 27, 2022

jlucovsky commented Mar 5, 2023

hongyang7 commented Mar 5, 2023 •

edited

Loading

jlucovsky commented Mar 5, 2023

hongyang7 commented Mar 6, 2023

jlucovsky commented Mar 6, 2023

jlucovsky commented Mar 8, 2023

hongyang7 commented Apr 19, 2023

SEGV hs_scan with HS_FLAG_SOM_LEFTMOST #350

SEGV hs_scan with HS_FLAG_SOM_LEFTMOST #350

Comments

noboruma commented Mar 9, 2022 • edited Loading

hongyang7 commented May 31, 2022

hongyang7 commented Oct 27, 2022

jlucovsky commented Mar 5, 2023

hongyang7 commented Mar 5, 2023 • edited Loading

jlucovsky commented Mar 5, 2023

hongyang7 commented Mar 6, 2023

jlucovsky commented Mar 6, 2023

jlucovsky commented Mar 8, 2023

hongyang7 commented Apr 19, 2023

noboruma commented Mar 9, 2022 •

edited

Loading

hongyang7 commented Mar 5, 2023 •

edited

Loading