Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Download and patch Vectorscan sources instead of vendoring them #151

Merged
merged 9 commits into from
Mar 25, 2024

Conversation

seqre
Copy link
Contributor

@seqre seqre commented Mar 14, 2024

As we discussed internally, I implemented a solution of downloading Vectorscan sources when missing and applying a patch file with our custom changes instead of permanently storing the patched version in the repository. Implements #71.

It works by downloading Vectorscan sources from GitHub archives, decompressing, and unarchiving to disk. Then, it applies our custom .patch file. At the end of each phase, it writes relevant data into a .version file to keep track of completed changes so they don't need to be repeated on each build. I'm aware it's pretty simple validation, but it works pretty well.

crates/vectorscan-sys/build.rs Fixed Show fixed Hide fixed
crates/vectorscan-sys/build.rs Fixed Show fixed Hide fixed
crates/vectorscan-sys/build.rs Fixed Show fixed Hide fixed
@seqre
Copy link
Contributor Author

seqre commented Mar 15, 2024

The Docker image builds just fine on Debian, but I have no idea why the Alpine version stopped working out of the blue.

  = note: /usr/lib/gcc/x86_64-alpine-linux-musl/12.2.1/../../../../x86_64-alpine-linux-musl/bin/ld: cannot find -lssl: No such file or directory
          /usr/lib/gcc/x86_64-alpine-linux-musl/12.2.1/../../../../x86_64-alpine-linux-musl/bin/ld: cannot find -lcrypto: No such file or directory
          /usr/lib/gcc/x86_64-alpine-linux-musl/12.2.1/../../../../x86_64-alpine-linux-musl/bin/ld: cannot find -lz: No such file or directory
          collect2: error: ld returned 1 exit status

My best bet by looking at the error is the assumption that libraries responsible for handling .tar.gz (flate2 and tar) might require some additional system libraries, but their documentation doesn't mention anything. I tried installing a bunch of different dev packages in the container, but it didn't change anything.

@bradlarsen
Copy link
Collaborator

I'll take a look at it later. In my experience, that sort of failure is business as usual when dealing with C++ dependencies...

@seqre
Copy link
Contributor Author

seqre commented Mar 16, 2024

Just as a sanity check, I changed the code to download the ZIP file and extract it using zip_extract to see if there would be any difference (it's actually cleaner solution tbh, it doesn't require renaming extracted directory).

let response = reqwest::blocking::get(format!("https://github.com/VectorCamp/vectorscan/archive/refs/tags/vectorscan/{VERSION}.zip")).expect("Could not download Vectorscan source files");
let bytes = response.bytes().expect("Could not read Vectorscan zip file");
zip_extract::extract(Cursor::new(bytes), &vectorscan_src_dir, true).expect("Could not extract Vectorscan source files");

Unfortunately, it ended up being even worse as it couldn't find more libraries (with me attempting to download those):

36.00   = note: /usr/lib/gcc/x86_64-alpine-linux-musl/13.2.1/../../../../x86_64-alpine-linux-musl/bin/ld: cannot find -lbz2: No such file or directory
36.00           /usr/lib/gcc/x86_64-alpine-linux-musl/13.2.1/../../../../x86_64-alpine-linux-musl/bin/ld: cannot find -lssl: No such file or directory
36.00           /usr/lib/gcc/x86_64-alpine-linux-musl/13.2.1/../../../../x86_64-alpine-linux-musl/bin/ld: cannot find -lcrypto: No such file or directory
36.00           /usr/lib/gcc/x86_64-alpine-linux-musl/13.2.1/../../../../x86_64-alpine-linux-musl/bin/ld: cannot find -lz: No such file or directory
36.00           collect2: error: ld returned 1 exit status

@bradlarsen
Copy link
Collaborator

Yes, this behavior is odd. One would expect that extracting a tarball and applying the patches would give the same behavior as the current vendored approach, as the vectorscan source directory trees should be the same in either case.

I will take a closer look at this PR this week.

@seqre seqre force-pushed the patched-vectorscan branch 2 times, most recently from 2d1fafc to 60da0d0 Compare March 20, 2024 01:38
seqre and others added 8 commits March 25, 2024 11:44
- Re-create the vectorscan.patch patchfile using `diff -ruN`
- In build.rs, use the `patch` program instead of the `git2` dependency
- In build.rs, use a checked-in copy of the vectorscan tarball instead
  of downloading it from GitHub
- In build.rs, extract the tarball to `OUT_DIR` instead of
  `CARGO_MANIFEST_DIR`
@bradlarsen bradlarsen merged commit 693c17f into praetorian-inc:main Mar 25, 2024
9 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants