Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Can't unzip .zip files created by OS X 10.12's native zipper #63

Closed
coolaj86 opened this issue Oct 26, 2017 · 10 comments
Closed

Can't unzip .zip files created by OS X 10.12's native zipper #63

coolaj86 opened this issue Oct 26, 2017 · 10 comments

Comments

@coolaj86
Copy link

coolaj86 commented Oct 26, 2017

Error

On Mac OS X 10.12.6 the native zip utility (right-click "compress folder") creates a zip file that gives this error:

extra field length exceeds extra field buffer size

Not that I don't believe it's possible for Apple to make a mistake, but my guess is that the underlying zip utility is probably the same Unix / Darwin zip that has been in use for a very long time and that it's statistically more likely to be a problem with this library.

For reference, here is the failing zip file (just a couple of test files for a website):

Example Zip File

I've attached fails-to-parse.zip for the test case.

The contents are as follows:

.
├── css
│   ├── custom.css
│   ├── plugins
│   │   └── bootstrap.css.min
│   └── styles.css
├── index-three.html
├── index-two.html
├── index.html
└── js
    ├── app.js
    ├── config.js
    └── custom.js

I have also zipped files that do unzip just fine. My guess is that there's some sort of off-by-one byte alignment issue.

Test Case

And a test case (based on the example):

mkdir -p /tmp/yauzl-test
pushd /tmp/yauzl-test
npm install yauzl@latest
touch test.js
wget -c https://github.com/thejoshwolfe/yauzl/files/1420073/fails-to-parse.zip
# => copy code below into test.js <= #
node test.js

test.js:

'use strict';

var yauzl = require('yauzl');

yauzl.open('./fails-to-parse.zip', function (err, zipfile) {
  if (err) throw err;

  zipfile.readEntry();
  zipfile.on('entry', function (entry) {
    if (/\/$/.test(entry.fileName)) {
      zipfile.readEntry();
      return;
    }

    zipfile.openReadStream(entry, function(err, readStream) {
      if (err) throw err;

      readStream.on("end", function() {
        zipfile.readEntry();
      });

      readStream.on("data", function (data) {
        console.log("data of length", data.length);
      });
    });
  });
  zipfile.on('error', function (err) {
    throw err;
  });
  zipfile.on('end', function () {
    console.log("all entries read and processed");
  });
});
@thejoshwolfe
Copy link
Owner

Thanks for the report. i'll take a look and get back to you.

@ffxsam
Copy link

ffxsam commented Oct 27, 2017

I'm getting this same error on a file created with command line zip on macOS.

@ffxsam
Copy link

ffxsam commented Oct 27, 2017

Tried zipping using BetterZip. Different error:

Error: unexpected EOF

So nothing works for me yet. The weird thing is, this error is being thrown and it's not coming through the callback. AFAIK, zipfile.on('entry', ...) callback function only has entry as an argument, no error object.

Side note: is there no easy way to just unzip and write the files to a folder? Without having to mess with streams, readEntry, etc?

@coolaj86
Copy link
Author

coolaj86 commented Oct 27, 2017

I tried with a few of the other node.js libs namely jszip, unzipper (and improved fork of unzip), and node-stream-zip and they work.

unzipper and node-stream-zip seem to be the best for me.

If you don't want streaming (i.e. an entire 1gb file would have to load into memory) then adm-zip is probably the most loved and respected. If you need browser support too then jszip may be the right fit.

@ffxsam
Copy link

ffxsam commented Oct 28, 2017

I went with one called decompress I believe. Simple, works well. I was on a deadline, so couldn't wait to hear back on this issue.

@thejoshwolfe
Copy link
Owner

I am unable to reproduce your issue. The zipfile you linked seems to parse just fine for me. I can explain what that particular error means, but it doesn't seem to be relevant to your zipfile.

Also, I believe decompress is implemented by using yauzl behind the scenes, so I'm not sure how switching to that library resolved this issue for you. Perhaps there are different versions of yauzl at play here.

I tried with a few of the other node.js libs ...

jszip - i didn't find much API documentation, but it looks like it operates on ram buffers, which means it's unsuitable for large zipfiles.

unzipper and node-stream-zip attempt to parse zipfiles from start to finish, which means they don't support some well-formed zipfiles, specifically those using general purpose bit 3. see No Streaming Unzip API in yauzl's readme.

Here's a fairly harmless looking way to create a zipfile using Info-ZIP (which is the zip command line program in most unix-like environments), which cannot be parsed with a start-to-finish streaming parser:

$ echo -ne '\x50\x4b\x07\x08\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00' > file.txt
$ zip -q0 - file.txt | cat > out.zip

My experimentation confirms my theory that unzipper cannot parse that zipfile.

@coolaj86
Copy link
Author

Here's me showing it fail in about 72 seconds: https://youtu.be/0rKj9TZkPD4

Here are my versions:

yauzl-test@1.0.0 /private/tmp/yauzl-test
└─┬ yauzl@2.8.0
  ├── buffer-crc32@0.2.13
  └─┬ fd-slicer@1.0.1
    └── pend@1.2.0

node --version
7.8.0

npm --version
4.2.0

@coolaj86
Copy link
Author

coolaj86 commented Oct 28, 2017

Just updated to latest node, npm, and retried after rm -rf node_modules.

aj@bowie /p/t/yauzl-test> npm install --save yauzl
npm notice created a lockfile as package-lock.json. You should commit this file.
npm WARN yauzl-test@1.0.0 No description
npm WARN yauzl-test@1.0.0 No repository field.

+ yauzl@2.8.0
added 4 packages in 4.303s
aj@bowie /p/t/yauzl-test> node test.js
/private/tmp/yauzl-test/test.js:28
    throw err;
    ^

Error: extra field length exceeds extra field buffer size
    at /private/tmp/yauzl-test/node_modules/yauzl/index.js:305:83
    at /private/tmp/yauzl-test/node_modules/yauzl/index.js:618:5
    at /private/tmp/yauzl-test/node_modules/fd-slicer/index.js:32:7
    at FSReqWrap.wrapper [as oncomplete] (fs.js:660:17)
aj@bowie /p/t/yauzl-test> node --version
v8.8.1

aj@bowie /p/t/yauzl-test> npm --version
5.4.2

I also tried rm -rf node_modules package-lock.json; npm install --save yauzl to make sure I got the latest versions.

@thejoshwolfe
Copy link
Owner

Ah. Sorry about that. I was so focused on the zip file that I completely neglected to look at your sample code. My bad.

I am able to reproduce the issue now. The video helped!

The problem is that you're using both lazyEntries:false and readEntry(). You probably forgot to put {lazyEntries:true} in your yauzl.open() call.

This is such a easy mistake to make that I'm adding a check for it to throw an exception rather than this "undefined behavior". This change should make this pitfall much more friendly to navigate. Here's an example of what you get with this change:

$ node asdf.js 
/home/josh/dev/yauzl/index.js:231
  if (!this.lazyEntries) throw new Error("readEntry() called without lazyEntries:true");
                         ^

Error: readEntry() called without lazyEntries:true
    at ZipFile.readEntry (/home/josh/dev/yauzl/index.js:231:32)
    at /home/josh/dev/yauzl/asdf.js:8:11
    at /home/josh/dev/yauzl/index.js:36:7
    at /home/josh/dev/yauzl/index.js:137:16
    at /home/josh/dev/yauzl/index.js:623:5
    at /home/josh/dev/yauzl/node_modules/fd-slicer/index.js:32:7
    at FSReqWrap.wrapper [as oncomplete] (fs.js:629:17)

And just for reference, the {lazyEntries:true} way is the recommended way, and the only reason it's not the default is to maintain compatibility with earlier versions of yauzl.

@coolaj86
Copy link
Author

Thank you very much. I can confirm that several files that threw errors before now parse as expected...

I'm not sure how I didn't add the lazyEntries: true before, but I'm just grateful for your help.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants