-
Notifications
You must be signed in to change notification settings - Fork 10k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Prevent an infinite loop in XRef_fetchUncompressed
for encrypted PDF files with indirect objects in the /Encrypt dictionary (issue 7665)
#7668
Conversation
/botio-linux preview |
From: Bot.io (Linux)ReceivedCommand cmd_preview from @Snuffleupagus received. Current queue size: 0 Live output at: http://107.21.233.14:8877/7021b14ce999e59/output.txt |
From: Bot.io (Linux)SuccessFull output at http://107.21.233.14:8877/7021b14ce999e59/output.txt Total script time: 1.02 mins Published |
@@ -2013,7 +2014,15 @@ var CipherTransformFactory = (function CipherTransformFactoryClosure() { | |||
this.encryptionKey = encryptionKey; | |||
|
|||
if (algorithm >= 4) { | |||
this.cf = dict.get('CF'); | |||
var cf = dict.get('CF'); | |||
if (isDict(cf)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm a bit confused about how this check indicates that the CF
dictionary is encrypted, as the comment below states. We're just checking if it's a dictionary, right, not that it's encrypted, or am I missing something here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It might help to open the PDF file from the issue with http://brendandahl.github.io/pdf.js.utils/browser/ and have that as a reference when reading my comment below :-)
Note that the Encrypt
dict (when it's present) is a top-level item, contained in the Trailer
dict, of a PDF file. The entries inside of the Encrypt
dict should not be encrypted, which thus explains the changes this patch makes in src/core/obj.js
.
Now depending on the type of encryption used by the PDF file, the Encrypt
dict may (as is the case here) have a CF
entry (which is itself a dict). If you trace through the code, you'll see that the dict
variable in the code you commented on above is referring to the Encrypt
dict.
Hence we already know that we're inside of the Encrypt
dict here, and the isDict
check above is only necessary in order to prevent an error in the case where a particular encryption has no CF
entry.
It's actually the CF
dict that is the main source of the (current) problem in the code, since the lookup in https://github.com/mozilla/pdf.js/blob/master/src/core/crypto.js#L2044 is triggering an infinite loop. Hence why my patch tries to ensure that the Encrypt
dict, and the CF
dict within, isn't trying to needlessly decrypt something that the PDF spec says should not be encrypted in the first place.
Note that the reason this hasn't been an issue before is that this particular PDF file has indirect objects in the Encrypt
dict, something the PDF spec seem to suggest should not actually happen.
A bit long-winded perhaps, but I hope it makes sense!
…F files with indirect objects in the /Encrypt dictionary (issue 7665)
/botio test |
From: Bot.io (Windows)ReceivedCommand cmd_test from @Snuffleupagus received. Current queue size: 0 Live output at: http://107.22.172.223:8877/8b3eab2f13e66a8/output.txt |
From: Bot.io (Linux)ReceivedCommand cmd_test from @Snuffleupagus received. Current queue size: 0 Live output at: http://107.21.233.14:8877/85a2e29d73a7840/output.txt |
From: Bot.io (Windows)FailedFull output at http://107.22.172.223:8877/8b3eab2f13e66a8/output.txt Total script time: 25.67 mins
Image differences available at: http://107.22.172.223:8877/8b3eab2f13e66a8/reftest-analyzer.html#web=eq.log |
From: Bot.io (Linux)FailedFull output at http://107.21.233.14:8877/85a2e29d73a7840/output.txt Total script time: 29.61 mins
Image differences available at: http://107.21.233.14:8877/85a2e29d73a7840/reftest-analyzer.html#web=eq.log |
Thank you for the patch |
Prevent an infinite loop in `XRef_fetchUncompressed` for encrypted PDF files with indirect objects in the /Encrypt dictionary (issue 7665)
Note that prior to commit 623fa29, which was part of PR #4971, the PDF file "worked" since the error was caught. However, we've never had code that attempts to handle the (fortunately rare) underlying problem of indirect objects in /Encrypt dictionaries, which this patch attempts to address.
This is a somewhat tentative patch, which fixes #7665.