Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

System.IO.Packaging: Avoid setting the general purpose bitflag due to enforced UTF8 encoding on ZipArchive creation #87883

Merged
merged 1 commit into from
Jun 21, 2023

Conversation

carlossanlop
Copy link
Member

@carlossanlop carlossanlop commented Jun 21, 2023

Fixes #87658

The ZipArchive constructor is being called by ZipPackage with the entryNameEncoding argument set to Encoding.UTF8. This is not needed by default, particularly because it's not respecting the Office XML standard which does not expect the general purpose bit flag 11 to be set. The entry name can still be stored using UTF8 if the archive encoding is unset or null, and the general purpose bit flag for entryname and comment encoding can still be changed later if the caller changes the entry name or the comment to a string with an encoding other than ASCII.

This change is relatively safe to make since ZipArchive is capable of handling archives without specifying a particular encoding.

This was verified with a manual test with @ericstj in a video call: we confirmed with reflection that the _generalPurposeBitFlag was now left unchanged by default, and an inspection of the generated file using a hex editor showed that the relevant bytes were unset.

We will add tests as a followup of the original issue. #87882

@ghost
Copy link

ghost commented Jun 21, 2023

Tagging subscribers to this area: @dotnet/area-system-io-compression
See info in area-owners.md if you want to be subscribed.

Issue Details

Fixes #87658

The ZipArchive constructor is being called by ZipPackage with the entryNameEncoding argument set to Encoding.UTF8. This is not needed by default, particularly because it's not respecting the Office XML standard which does not expect the general purpose bit flag 11 to be set. The entry name can still be stored using UTF8 if the archive encoding is unset or null, and the general purpose bit flag for entryname and comment encoding can still be changed later if the caller changes the entry name or the comment to a string with an encoding other than ASCII.

This change is relatively safe to make since ZipArchive is capable of handling archives without specifying a particular encoding.

This was verified with a manual test with @ericstj in a video call: we confirmed with reflection that the _generalPurposeBitFlag was now left unchanged by default, and an inspection of the generated file using a hex editor showed that the relevant bytes were unset.

We will add tests as a followup of the original issue.

Author: carlossanlop
Assignees: carlossanlop
Labels:

area-System.IO.Compression

Milestone: -

Copy link
Member

@ericstj ericstj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM based on manual testing. Approving this minimal change in order to get it out in P6 so that we can find out if it helps solve the issue.

@carlossanlop carlossanlop merged commit 499bdd9 into dotnet:main Jun 21, 2023
@carlossanlop carlossanlop deleted the PackagingArchiveEncoding branch June 21, 2023 21:52
@tomjebo
Copy link

tomjebo commented Jun 21, 2023

Thanks @carlossanlop and @ericstj!

@carlossanlop carlossanlop restored the PackagingArchiveEncoding branch June 22, 2023 20:31
@ghost ghost locked as resolved and limited conversation to collaborators Jul 23, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Issue with zip headers general purpose flags bit 11 for OPC packages.
3 participants