.decode does not handle null bytes #1

lsb · 2023-11-09T00:06:25Z

When I call Ascii85Native.decode("0etOA2)[BQ3A:Ff"), I expect to get "1234567890\x001". Instead I get "1234567890": the output is truncated at the null byte.

This is on the latest release, 1.0.3.
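
For comparison, here is a minimal pure-Ruby sketch of what the decode should produce for that input. `reference_decode` is a hypothetical helper written for this report, not part of Ascii85Native; it assumes undelimited input made only of complete five-character groups, with no `z` shorthand and no whitespace.

```ruby
# Hypothetical reference decoder (not part of the gem).
# Assumption: input is undelimited and consists of full 5-character groups.
def reference_decode(text)
  raise ArgumentError, "length must be a multiple of 5" unless (text.bytesize % 5).zero?

  text.bytes.each_slice(5).map do |group|
    # Each character is a base-85 digit offset by 33 ('!').
    value = group.reduce(0) { |acc, byte| acc * 85 + (byte - 33) }
    # Five digits carry one 32-bit big-endian word, i.e. four output bytes.
    [value].pack("N")
  end.join.b
end

decoded = reference_decode("0etOA2)[BQ3A:Ff")
p decoded          # "1234567890\x001"
p decoded.bytesize # 12
```

The third group, "3A:Ff", decodes to the bytes "90\x001", so a correct decoder has to carry the embedded null byte through to the output.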

lsb · 2023-11-09T00:12:02Z

When I built and ran the test suite in the latest ruby 3.2.2 container, I got

% docker run -it ruby bash -c 'gem install rake-compiler ffi ; git clone https://github.com/AnomalousBit/ascii85_native.git ; cd ascii85_native ; rake compile ; cd spec/lib ; ruby ascii85_native_spec.rb'
Fetching rake-compiler-1.2.5.gem
Successfully installed rake-compiler-1.2.5
Fetching ffi-1.16.3.gem
Building native extensions. This could take a while...
Successfully installed ffi-1.16.3
2 gems installed

A new release of RubyGems is available: 3.4.10 → 3.4.22!
Run `gem update --system 3.4.22` to update your installation.

Cloning into 'ascii85_native'...
remote: Enumerating objects: 105, done.
remote: Counting objects: 100% (105/105), done.
remote: Compressing objects: 100% (60/60), done.
remote: Total 105 (delta 47), reused 81 (delta 29), pack-reused 0
Receiving objects: 100% (105/105), 26.00 KiB | 5.20 MiB/s, done.
Resolving deltas: 100% (47/47), done.
mkdir -p tmp/x86_64-linux/ascii85_native/3.2.2
cd tmp/x86_64-linux/ascii85_native/3.2.2
/usr/local/bin/ruby -I. ../../../../ext/ascii85_native/extconf.rb
creating Makefile
cd -
cd tmp/x86_64-linux/ascii85_native/3.2.2
/usr/bin/gmake
compiling ../../../../ext/ascii85_native/ascii85_native.c
linking shared-object ascii85_native.so
cd -
mkdir -p tmp/x86_64-linux/stage/lib
/usr/bin/gmake install sitearchdir=../../../../lib sitelibdir=../../../../lib target_prefix=
/usr/bin/install -c -m 0755 ascii85_native.so ../../../../lib
cp tmp/x86_64-linux/ascii85_native/3.2.2/ascii85_native.so tmp/x86_64-linux/stage/lib/ascii85_native.so
Run options: --seed 17069

# Running:

F.F.F..F

Finished in 0.040260s, 198.7064 runs/s, 2508.6684 assertions/s.

  1) Failure:
Ascii85Native::#encode#test_0001_should encode all specified test-cases correctly [ascii85_native_spec.rb:59]:
Expected: "<~~>"
  Actual: ""

  2) Failure:
Ascii85Native::#decode#test_0001_should decode all specified test-cases correctly [ascii85_native_spec.rb:118]:
--- expected
+++ actual
@@ -1,3 +1,3 @@
 # encoding: ASCII-8BIT
 #    valid: true
-""
+"\x00"


  3) Failure:
Ascii85Native::#decode#test_0003_should only process data within delimiters [ascii85_native_spec.rb:155]:
Expected # encoding: ASCII-8BIT
#    valid: true
"o\xC8\xAB\x14\x15\xBCC\xE2\x04\x9E\xCA\x7F\xD6Z\x17\xFF\x04\xC5F" to be empty.

  4) Failure:
Ascii85Native#test_0001_#decode should be the inverse of #encode [ascii85_native_spec.rb:53]:
--- expected
+++ actual
@@ -1,3 +1,3 @@
 # encoding: ASCII-8BIT
 #    valid: true
-"\xFD\xCA\xDB\xBA=\xF6\xA9\x82TP\x04\x9Fe,`k\x9F\xCB\x1E\xFD^\xF4\x04\xDF2\xAD{\x04!K\xD7*\xF9D\"\xAB\xE9\xDC\x1A(\xF2\x14\x90\xEA^\xE2\xA6\xE8eR\x01%\xA1L\x97"
+"\xFD\xCA\xDB\xBA=\xF6\xA9\x82TP\x04\x9Fe,`k\x9F\xCB\x1E\xFD^\xF4\x04\xDF2\xAD{\x04!K\xD7*\xF9D\"\xAB\xE9\xDC\x1A(\xF2\x14\x90\xEA^\xE2\xA6\xE8eR\x01%\xA1L\x97\x00\xCF1W\xDE\xB5\x03(/\xF1\xB8\x98<\x04\xAD%B\xAD\xCAs\xC6J\x95r\xA7t\xC1T\e!\xCB\xE3\x01\x12\xABU\xF5\xCE(\xF8\xEC\xB00\xAF\x06d\x97\x83\x82}7\xF9]\xF1\xA9Q\xDFd\xEF\xCB:\xAFF\xD6Z\xAD\x7F\xE11\xED\x1F\xBA9\xD51\xA1\xA3%\xBC\x87\e\xD6+U\xCB"


8 runs, 101 assertions, 4 failures, 0 errors, 0 skips

The delimiters are not too much of an issue for me, but the truncation in the fourth test seems like the problem. Perhaps I need to set some environment flags somehow?

AnomalousBit · 2023-11-10T04:13:10Z

Hi @lsb

I'm not aware of any specifications where Ascii85 should be null terminated. This gem does support the familiar <~ and ~> delimiters, though.

I've checked several other libraries, and even this in-browser converter, and haven't seen a null-terminated string show up anywhere.

I'm guessing that you're trying to use the output of decode() as an argument to another C function that expects a null-terminated char* array? In that case, I would just append the null character to the end of the output of decode() before passing it along to your next C function.
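
The suggestion above can be sketched in plain Ruby. This is only an illustration: the payload is the one from the report, and `unpack1("Z*")` is used here as a stand-in for how a C function that expects a null-terminated string would read the buffer. Whether anything in the gem's extension reads data this way is not established in this thread.

```ruby
# Payload from the report: the bytes the reporter expects back from decode.
decoded = "1234567890\x001".b

# Append an explicit terminator before handing the buffer to a C callee.
c_buffer = decoded + "\x00"

# A NUL-terminated reader stops at the first NUL byte it meets.
# String#unpack1 with "Z*" mimics that behavior.
as_c_string = c_buffer.unpack1("Z*")

p c_buffer.bytesize # 13
p as_c_string       # "1234567890"
```

Note that appending a terminator does not help when the payload itself contains a null byte: a reader that relies on the terminator still stops at the first one, so binary data needs to travel with an explicit length.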

Many of the tests for Ascii85Native are borrowed from the Ascii85 gem. That is intentional, because I wanted a performance-focused drop-in replacement for that particular gem. There are a lot of tests that check the behavior on invalid Ascii85-encoded streams. This library isn't doing those sanity checks because some of them require you to scan the entire stream before you even try to decode it, making the decode process take significantly longer.

I felt it would be disingenuous to remove those failing tests, because they make clear to anyone who has gotten this far that this library is really just about the encoding and decoding process. If you need something with more validity checks and unique options like zero-value compression (conversion of !!!!! to z), I would recommend you do one of the following:

Write your own sanity checks before calling Ascii85Native (would be awesome if you shared a PR with an optional argument added to .decode() to turn on validation! Maybe one of these days I'll get to that.)
Hop on over to using the Ascii85 gem
Compile one of the other FOSS C implementations such as C-implementation of Ascii85 and then call out to that executable using one of the various Ruby native methods, like system() or backticks as you can see here.
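
As a rough sketch of the first option, a pre-check could look something like this. `plausible_ascii85?` is a hypothetical helper, not part of Ascii85Native or the Ascii85 gem, and it only covers a few necessary conditions (allowed characters, optional delimiters, no lone trailing character). It does not verify that `z` appears only on group boundaries or that each group fits in 32 bits.

```ruby
# Hypothetical pre-check (not part of any gem); deliberately incomplete.
def plausible_ascii85?(text)
  body = text.strip

  # Strip the optional <~ ... ~> delimiters when both are present.
  if body.length >= 4 && body.start_with?("<~") && body.end_with?("~>")
    body = body[2..-3]
  end

  # Whitespace inside the stream is ignored.
  body = body.gsub(/\s+/, "")

  # Only '!'..'u' and the 'z' shorthand are allowed.
  return false unless body.match?(/\A[!-uz]*\z/)

  # Setting 'z' aside, a final group of exactly one character is invalid.
  body.delete("z").length % 5 != 1
end

p plausible_ascii85?("0etOA2)[BQ3A:Ff") # true
p plausible_ascii85?("<~0etOA~>")       # true
p plausible_ascii85?("0etOAv")          # false ('v' is out of range)
p plausible_ascii85?("0etOA2")          # false (lone trailing character)
```

This scans the whole input before decoding, which is exactly the cost described above, so it makes sense only where correctness on untrusted input matters more than throughput.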

That being said, I've used this in production for a couple of years now on thousands of PDFs and haven't had a single problem so far. It's been a huge performance gain for me, but YMMV. Time permitting, I will try to revisit some of the low-hanging fruit among the failing tests soon; several of them are quick fixes. I'm open to any PRs you might want to share to "spruce up the place!"

Hope this helps, good luck!
