Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Prevent an infinite loop in XRef_fetchUncompressed for encrypted PDF files with indirect objects in the /Encrypt dictionary (issue 7665) #7668

Merged
merged 2 commits into from
Oct 15, 2016

Conversation

Snuffleupagus
Copy link
Collaborator

Note that prior to commit 623fa29, which was part of PR #4971, the PDF file "worked" since the error was caught. However, we've never had code that attempts to handle the (fortunately rare) underlying problem of indirect objects in /Encrypt dictionaries, which this patch attempts to address.

This is a somewhat tentative patch, which fixes #7665.

@Snuffleupagus
Copy link
Collaborator Author

/botio-linux preview

@pdfjsbot
Copy link

From: Bot.io (Linux)


Received

Command cmd_preview from @Snuffleupagus received. Current queue size: 0

Live output at: http://107.21.233.14:8877/7021b14ce999e59/output.txt

@pdfjsbot
Copy link

From: Bot.io (Linux)


Success

Full output at http://107.21.233.14:8877/7021b14ce999e59/output.txt

Total script time: 1.02 mins

Published

@@ -2013,7 +2014,15 @@ var CipherTransformFactory = (function CipherTransformFactoryClosure() {
this.encryptionKey = encryptionKey;

if (algorithm >= 4) {
this.cf = dict.get('CF');
var cf = dict.get('CF');
if (isDict(cf)) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm a bit confused about how this check indicates that the CF dictionary is encrypted, as the comment below states. We're just checking if it's a dictionary, right, not that it's encrypted, or am I missing something here?

Copy link
Collaborator Author

@Snuffleupagus Snuffleupagus Sep 23, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It might help to open the PDF file from the issue with http://brendandahl.github.io/pdf.js.utils/browser/ and have that as a reference when reading my comment below :-)

Note that the Encrypt dict (when it's present) is a top-level item, contained in the Trailer dict, of a PDF file. The entries inside of the Encrypt dict should not be encrypted, which thus explains the changes this patch makes in src/core/obj.js.
Now depending on the type of encryption used by the PDF file, the Encrypt dict may (as is the case here) have a CF entry (which is itself a dict). If you trace through the code, you'll see that the dict variable in the code you commented on above is referring to the Encrypt dict.
Hence we already know that we're inside of the Encrypt dict here, and the isDict check above is only necessary in order to prevent an error in the case where a particular encryption has no CF entry.

It's actually the CF dict that is the main source of the (current) problem in the code, since the lookup in https://github.com/mozilla/pdf.js/blob/master/src/core/crypto.js#L2044 is triggering an infinite loop. Hence why my patch tries to ensure that the Encrypt dict, and the CF dict within, isn't trying to needlessly decrypt something that the PDF spec says should not be encrypted in the first place.

Note that the reason this hasn't been an issue before is that this particular PDF file has indirect objects in the Encrypt dict, something the PDF spec seem to suggest should not actually happen.

A bit long-winded perhaps, but I hope it makes sense!

@yurydelendik yurydelendik self-assigned this Sep 28, 2016
@Snuffleupagus
Copy link
Collaborator Author

/botio test

@pdfjsbot
Copy link

From: Bot.io (Windows)


Received

Command cmd_test from @Snuffleupagus received. Current queue size: 0

Live output at: http://107.22.172.223:8877/8b3eab2f13e66a8/output.txt

@pdfjsbot
Copy link

From: Bot.io (Linux)


Received

Command cmd_test from @Snuffleupagus received. Current queue size: 0

Live output at: http://107.21.233.14:8877/85a2e29d73a7840/output.txt

@pdfjsbot
Copy link

From: Bot.io (Windows)


Failed

Full output at http://107.22.172.223:8877/8b3eab2f13e66a8/output.txt

Total script time: 25.67 mins

  • Font tests: Passed
  • Unit tests: Passed
  • Regression tests: FAILED

Image differences available at: http://107.22.172.223:8877/8b3eab2f13e66a8/reftest-analyzer.html#web=eq.log

@pdfjsbot
Copy link

From: Bot.io (Linux)


Failed

Full output at http://107.21.233.14:8877/85a2e29d73a7840/output.txt

Total script time: 29.61 mins

  • Font tests: Passed
  • Unit tests: Passed
  • Regression tests: FAILED

Image differences available at: http://107.21.233.14:8877/85a2e29d73a7840/reftest-analyzer.html#web=eq.log

@yurydelendik yurydelendik merged commit ea5949f into mozilla:master Oct 15, 2016
@yurydelendik
Copy link
Contributor

Thank you for the patch

@Snuffleupagus Snuffleupagus deleted the issue-7665 branch October 15, 2016 16:28
movsb pushed a commit to movsb/pdf.js that referenced this pull request Jul 14, 2018
Prevent an infinite loop in `XRef_fetchUncompressed` for encrypted PDF files with indirect objects in the /Encrypt dictionary (issue 7665)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Maximum call stack size exceeded
4 participants