Get the correct fileName from extra field when decodeStrings is false #113

Closed
fpsqdb wants to merge 2 commits

@fpsqdb fpsqdb commented Nov 15, 2019

This commit added the decodeStrings option, which adds support for returning fileName or comment as a Buffer and fixes issue #42, but fileName is not a Buffer when decodeStrings is false.
This PR makes fileName a Buffer when decodeStrings is false.

@thejoshwolfe
Owner

Sorry for the delayed response. I'm not sure I understand the purpose or intended effect of this PR. Are you trying to bypass the security validation but still support reading the Info-ZIP Unicode Path Extra Field? If that's the case, what part of the validation is causing issues for you?

@fpsqdb
Author

fpsqdb commented Feb 18, 2024

@thejoshwolfe Sorry, the commit link and related issue were wrong; I have modified my comment.

@thejoshwolfe
Owner

Why do you want an undecoded buffer for the file name?

@fpsqdb
Author

fpsqdb commented Feb 19, 2024

I set decodeStrings to false so that I can decode the buffer myself.
Also, the code implementation does not match the documentation:
https://github.com/thejoshwolfe/yauzl#filename

> If decodeStrings is false (see open()), this field is the undecoded Buffer instead of a decoded String.
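
As a rough illustration of what "decoding the buffer myself" could look like, assuming the caller already holds the file name as a raw Buffer (as the documentation quoted above describes), here is a minimal sketch. `decodeRawFileName` is a hypothetical helper, not part of yauzl, and the latin1 fallback is an arbitrary stand-in for whatever legacy encoding the archive actually uses.

```javascript
// Hypothetical helper (not part of yauzl): decode a raw file name Buffer.
// Assumption: names are usually UTF-8; anything that is not valid UTF-8 is
// treated as latin1 here purely for illustration.
function decodeRawFileName(raw) {
  try {
    // fatal: true makes decode() throw on invalid UTF-8 instead of
    // silently substituting U+FFFD.
    return new TextDecoder("utf-8", { fatal: true }).decode(raw);
  } catch {
    return raw.toString("latin1");
  }
}

console.log(decodeRawFileName(Buffer.from("文件.txt", "utf8")));
console.log(decodeRawFileName(Buffer.from([0x63, 0x61, 0x66, 0xe9])));
```

A real caller would presumably replace the fallback with a decoder for the encoding it expects (for example via a code page library), which is the kind of control that an undecoded Buffer makes possible.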

@thejoshwolfe
Owner

I've just released yauzl 3.1.0, which includes support for decoding file names in UTF-8 without the safety validation. But it sounds like that's not actually what you're looking for.

It sounds like what you're looking for is:

  1. Ignore General Purpose Bit 11.
  2. Support finding the Info-ZIP Unicode Path Extra Field in the extra fields, and perform the version check and crc32 verification as required, but don't convert the Buffer into a string using UTF-8.
  3. Return either the basic fileName as a Buffer or the override filename from the Info-ZIP Unicode Path Extra Field as a Buffer if present.

Is that what you want? If so, ... I'm very curious why. Have you found zip files using the Info-ZIP Unicode Path Extra Field that use an encoding other than UTF-8? Or are you curious what the bytes were before the UTF-8 decoding? If that's all you want, you should be able to simply re-encode the value into UTF-8 (UTF-8 is bijective for non-error code points).
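
The re-encoding point can be checked with a small sketch using only Node's built-in Buffer. The sample names are made up for illustration; the second half shows the caveat for bytes that are not valid UTF-8.

```javascript
// Valid UTF-8 survives a decode/encode round trip byte for byte.
const originalBytes = Buffer.from("docs/文件.txt", "utf8");
const decodedName = originalBytes.toString("utf8");
const reencoded = Buffer.from(decodedName, "utf8");
console.log(reencoded.equals(originalBytes)); // true

// Invalid bytes are replaced by U+FFFD while decoding, so the original
// bytes cannot be recovered from the decoded string.
const invalidBytes = Buffer.from([0x61, 0xff, 0x62]);
const lossy = Buffer.from(invalidBytes.toString("utf8"), "utf8");
console.log(lossy.equals(invalidBytes)); // false
```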

In any case, what you're looking for can be accomplished by copying the logic in yauzl, which is now located in getFileNameLowLevel(). It's only about 30 lines of code.
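
For readers who want a starting point, the three steps listed above could be sketched roughly as follows. This is not yauzl's getFileNameLowLevel() and does not use yauzl's API; `crc32` and `getRawFileName` are hypothetical names, and the inputs are assumed to be Buffers holding the entry's basic file name and its raw extra field area. The field layout (header ID 0x7075, a version byte, a CRC32 of the basic name, then the name bytes) follows the Info-ZIP Unicode Path Extra Field as described in the ZIP APPNOTE.

```javascript
// Standard CRC-32 lookup table (reflected polynomial 0xEDB88320).
const CRC_TABLE = new Uint32Array(256);
for (let n = 0; n < 256; n++) {
  let c = n;
  for (let k = 0; k < 8; k++) c = (c & 1) ? (0xedb88320 ^ (c >>> 1)) : (c >>> 1);
  CRC_TABLE[n] = c >>> 0;
}

function crc32(buf) {
  let crc = 0xffffffff;
  for (const byte of buf) crc = CRC_TABLE[(crc ^ byte) & 0xff] ^ (crc >>> 8);
  return (crc ^ 0xffffffff) >>> 0;
}

// Hypothetical helper: returns the override name from the Unicode Path
// extra field as a Buffer when it is present and valid, otherwise the
// basic file name Buffer. General purpose bit 11 is never consulted.
function getRawFileName(rawFileName, rawExtraFields) {
  let offset = 0;
  while (offset + 4 <= rawExtraFields.length) {
    const id = rawExtraFields.readUInt16LE(offset);
    const size = rawExtraFields.readUInt16LE(offset + 2);
    const start = offset + 4;
    const end = start + size;
    if (end > rawExtraFields.length) break; // truncated field, stop scanning
    if (id === 0x7075 && size >= 5) {
      const version = rawExtraFields.readUInt8(start);
      const nameCrc = rawExtraFields.readUInt32LE(start + 1);
      // The stored CRC must match the basic name, otherwise the field is stale.
      if (version === 1 && nameCrc === crc32(rawFileName)) {
        return rawExtraFields.subarray(start + 5, end); // no UTF-8 decoding
      }
    }
    offset = end;
  }
  return rawFileName;
}
```

The returned value stays a Buffer in both branches, so the caller decides how (or whether) to decode it.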

Unless I can understand the use case for this PR, I can't properly support it.

@thejoshwolfe
Owner

> And the code implementation does not match the document description.

What's the discrepancy that you're seeing? If you're talking about how the undecoded Buffer is always the basic name and never the one from the Info-ZIP Unicode Path Extra Field, that's mentioned in the very next sentence in the docs. Maybe that could be communicated more clearly.

@fpsqdb
Author

fpsqdb commented Feb 19, 2024

The latest version has fixed this problem.

@fpsqdb fpsqdb closed this Feb 19, 2024