.decode does not handle null bytes #1

Open
lsb opened this issue Nov 9, 2023 · 2 comments

lsb commented Nov 9, 2023

When I call Ascii85Native.decode("0etOA2)[BQ3A:Ff"), I expect to see "1234567890\x001". Instead I get "1234567890": the output is truncated at the null byte.

This is on the latest release, 1.0.3.
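
To make the expected result concrete, here is a minimal reference decoder in plain Ruby. It is only a sketch for checking the arithmetic, not the gem's implementation: the method name `reference_ascii85_decode` is hypothetical, and it assumes undelimited input with no whitespace and no "z" shorthand.

```ruby
# Reference Ascii85 decoder in plain Ruby (illustrative; not the gem's code).
# Assumes undelimited input with no whitespace and no "z" shorthand.
def reference_ascii85_decode(text)
  bytes = []
  text.bytes.each_slice(5) do |group|
    pad = 5 - group.length
    # Pad a short final group with "u" (digit value 84).
    digits = group.map { |b| b - 33 } + [84] * pad
    # Interpret the five digits as one base-85 number.
    value = digits.reduce(0) { |acc, d| acc * 85 + d }
    # Split the number into four big-endian bytes.
    chunk = [value].pack("N").bytes
    # Drop as many trailing bytes as padding characters were added.
    bytes.concat(chunk.first(4 - pad))
  end
  bytes.pack("C*")
end

decoded = reference_ascii85_decode("0etOA2)[BQ3A:Ff")
puts decoded.bytesize                    # 12
puts decoded == "1234567890\x001".b      # true
```

The input is 15 characters, so it decodes to three full 4-byte groups. The last group, "3A:Ff", carries the bytes "9", "0", "\x00", "1", which is why a length-aware decoder returns 12 bytes rather than 10.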


lsb commented Nov 9, 2023

When I built and ran the test suite in the latest ruby 3.2.2 container, I got:

% docker run -it ruby bash -c 'gem install rake-compiler ffi ; git clone https://github.com/AnomalousBit/ascii85_native.git ; cd ascii85_native ; rake compile ; cd spec/lib ; ruby ascii85_native_spec.rb'
Fetching rake-compiler-1.2.5.gem
Successfully installed rake-compiler-1.2.5
Fetching ffi-1.16.3.gem
Building native extensions. This could take a while...
Successfully installed ffi-1.16.3
2 gems installed

A new release of RubyGems is available: 3.4.10 → 3.4.22!
Run `gem update --system 3.4.22` to update your installation.

Cloning into 'ascii85_native'...
remote: Enumerating objects: 105, done.
remote: Counting objects: 100% (105/105), done.
remote: Compressing objects: 100% (60/60), done.
remote: Total 105 (delta 47), reused 81 (delta 29), pack-reused 0
Receiving objects: 100% (105/105), 26.00 KiB | 5.20 MiB/s, done.
Resolving deltas: 100% (47/47), done.
mkdir -p tmp/x86_64-linux/ascii85_native/3.2.2
cd tmp/x86_64-linux/ascii85_native/3.2.2
/usr/local/bin/ruby -I. ../../../../ext/ascii85_native/extconf.rb
creating Makefile
cd -
cd tmp/x86_64-linux/ascii85_native/3.2.2
/usr/bin/gmake
compiling ../../../../ext/ascii85_native/ascii85_native.c
linking shared-object ascii85_native.so
cd -
mkdir -p tmp/x86_64-linux/stage/lib
/usr/bin/gmake install sitearchdir=../../../../lib sitelibdir=../../../../lib target_prefix=
/usr/bin/install -c -m 0755 ascii85_native.so ../../../../lib
cp tmp/x86_64-linux/ascii85_native/3.2.2/ascii85_native.so tmp/x86_64-linux/stage/lib/ascii85_native.so
Run options: --seed 17069

# Running:

F.F.F..F

Finished in 0.040260s, 198.7064 runs/s, 2508.6684 assertions/s.

  1) Failure:
Ascii85Native::#encode#test_0001_should encode all specified test-cases correctly [ascii85_native_spec.rb:59]:
Expected: "<~~>"
  Actual: ""

  2) Failure:
Ascii85Native::#decode#test_0001_should decode all specified test-cases correctly [ascii85_native_spec.rb:118]:
--- expected
+++ actual
@@ -1,3 +1,3 @@
 # encoding: ASCII-8BIT
 #    valid: true
-""
+"\x00"


  3) Failure:
Ascii85Native::#decode#test_0003_should only process data within delimiters [ascii85_native_spec.rb:155]:
Expected # encoding: ASCII-8BIT
#    valid: true
"o\xC8\xAB\x14\x15\xBCC\xE2\x04\x9E\xCA\x7F\xD6Z\x17\xFF\x04\xC5F" to be empty.

  4) Failure:
Ascii85Native#test_0001_#decode should be the inverse of #encode [ascii85_native_spec.rb:53]:
--- expected
+++ actual
@@ -1,3 +1,3 @@
 # encoding: ASCII-8BIT
 #    valid: true
-"\xFD\xCA\xDB\xBA=\xF6\xA9\x82TP\x04\x9Fe,`k\x9F\xCB\x1E\xFD^\xF4\x04\xDF2\xAD{\x04!K\xD7*\xF9D\"\xAB\xE9\xDC\x1A(\xF2\x14\x90\xEA^\xE2\xA6\xE8eR\x01%\xA1L\x97"
+"\xFD\xCA\xDB\xBA=\xF6\xA9\x82TP\x04\x9Fe,`k\x9F\xCB\x1E\xFD^\xF4\x04\xDF2\xAD{\x04!K\xD7*\xF9D\"\xAB\xE9\xDC\x1A(\xF2\x14\x90\xEA^\xE2\xA6\xE8eR\x01%\xA1L\x97\x00\xCF1W\xDE\xB5\x03(/\xF1\xB8\x98<\x04\xAD%B\xAD\xCAs\xC6J\x95r\xA7t\xC1T\e!\xCB\xE3\x01\x12\xABU\xF5\xCE(\xF8\xEC\xB00\xAF\x06d\x97\x83\x82}7\xF9]\xF1\xA9Q\xDFd\xEF\xCB:\xAFF\xD6Z\xAD\x7F\xE11\xED\x1F\xBA9\xD51\xA1\xA3%\xBC\x87\e\xD6+U\xCB"


8 runs, 101 assertions, 4 failures, 0 errors, 0 skips

The delimiters are not too much of an issue for me, but the truncation in that fourth test seems like the problem. Perhaps I need to set some environment flags somehow?

@AnomalousBit
Owner

Hi @lsb

I'm not aware of any specifications where Ascii85 should be null terminated. This gem does support the familiar <~ and ~> delimiters, though.

I've checked several other libraries, and even this in-browser converter, and haven't seen a null-terminated string show up anywhere.

I'm guessing that you're trying to use the output of decode() as an argument to another C function that expects a null terminated char* array? In this case, I would just append the null character to the end of the output of decode() before passing it along to your next C function.
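
For readers following along, the difference between a trailing terminator and an embedded null byte can be sketched in plain Ruby. This is an illustration only and makes no claim about how the gem reads its buffers: `String#unpack1("Z*")` is used here as a stand-in for C code that walks a char* until the first NUL.

```ruby
payload = "1234567890\x001".b   # 12 bytes with an embedded NUL

# "Z*" reads up to the first NUL, much like a C function walking a char*.
as_c_string = payload.unpack1("Z*")

# Appending a terminator, as suggested above, for a consumer that needs one.
terminated = payload + "\x00".b

puts payload.bytesize      # 12
puts as_c_string.bytesize  # 10
puts terminated.bytesize   # 13
```

Appending "\x00" helps a consumer that wants a terminator, but it does not recover bytes that were already cut off at an embedded NUL; preserving those requires passing the length alongside the pointer.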

Many of the tests for Ascii85Native are borrowed from the Ascii85 gem, which is intentional because I wanted a performance-focused drop-in replacement for that particular gem. There are a lot of tests that check the behavior of invalid Ascii85 encoded streams. This library isn't doing those sanity checks because some of them require you to scan the entire stream before you even try to decode it, making the decode process take significantly longer.

I felt it would be disingenuous to remove those failing tests because they make it clear to anyone who's gotten this far that this library is really just about the encoding and decoding process. If you need something with more validity checks and unique options like zero-value-compression (conversion of !!!!! to z), I would recommend you do one of the following:

  • Write your own sanity checks before calling Ascii85Native (would be awesome if you shared a PR with an optional argument added to .decode() to turn on validation! Maybe one of these days I'll get to that.)
  • Hop on over to using the Ascii85 gem
  • Compile one of the other FOSS C implementations such as C-implementation of Ascii85 and then call out to that executable using one of the various Ruby native methods, like system() or backticks as you can see here.
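
As a rough idea of what the first option could look like, here is a hedged sketch of a pre-check in plain Ruby. The method name `plausible_ascii85?` and the exact rules are assumptions made for illustration and are not part of Ascii85Native. It checks the alphabet, the placement of "z", and the length of the final group; it does not check whether a group overflows 32 bits.

```ruby
# Hypothetical pre-check; name and rules are illustrative, not part of the gem.
def plausible_ascii85?(text)
  body = text.gsub(/\s+/, "")
  # Strip the optional <~ ~> delimiters when both are present.
  if body.length >= 4 && body.start_with?("<~") && body.end_with?("~>")
    body = body[2..-3]
  end
  # Only "!".."u" plus the "z" shorthand are allowed.
  return false unless body.match?(/\A[!-uz]*\z/)

  in_group = 0
  body.each_char do |ch|
    if ch == "z"
      # "z" stands for a whole zero group, so it may not appear mid-group.
      return false unless in_group.zero?
    else
      in_group = (in_group + 1) % 5
    end
  end
  # A final group of exactly one character cannot encode any byte.
  in_group != 1
end

puts plausible_ascii85?("<~9jqo^~>")  # true
puts plausible_ascii85?("9jzqo")      # false
```

A caller could run this check first and only hand the input to Ascii85Native.decode when it returns true. Note that this scans the whole input once, which is the extra cost described above.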

That being said, I've used this in production for a couple of years now on thousands of PDFs and haven't had a single problem yet. It's been a huge performance gain for me, but YMMV. Time permitting, I will try to revisit some of the low-hanging fruit on the failing tests soon; several of them are quick fixes. I'm open to any PRs you might want to share to "spruce up the place!"

Hope this helps, good luck!
