- PDF Format Background
- Encryption Dictionaries
- Vulnerability Details
- ASLR and DEP Bypass
- Environment Details
- Trigger
- Author
PDF is a file format used to represent documents. A pdf is made of multiple data objects
-
Simple Primitive Objects Integer, Number, Boolean, Null
-
Complex Objects
Format Name [.*] Array (.*) String <<.*>> Dictionary <.*> Hex String /.* Name stream.*endstream Stream
These objects define how a pdf looks and what it contains. Structures in pdf are present in 2 types of objects - Direct and Indirect. An indirect object start with Object number and Generation number followed by the actual object. Indirect Objects can be directly referenced in other objects as n m R
where n and m are object and generation numbers respectively.
Dictionary objects are basic building blocks for the document. There are some general dictionary objects which are needed to form of page or the document itself. Most important is the Root
dictionary which defines links to all other Pages
, Metadata
, Names
etc. each of which can be some other object.
Stream objects contain the most binary data such as fonts, pictures or compressed/encrypted data.
A PDF document can be encrypted to protect its contents from unauthorized access. Encryption
applies to all strings and streams in the document's PDF file, with some exceptions such as the Encrypt
dictionary itself. Encryption mostly applies to stream objects. Encryption is not applied to other object types such as integers and boolean values, which are used primarily to convey information about the document's structure rather than its contents.
Encryption-related information shall be stored in a document’s encryption dictionary, which shall be the value of the "Encrypt" entry in the document’s trailer dictionary.
CPDF_Parser::StartParse
sets m_pCryptoHandler
for indirect objects of a pdf which are encrypted. m_pCryptoHandler
should be nulled out when CPDF_Parser::ReleaseEncryptHandler
is completed. Instead CPDF_Parser::ReleaseEncryptHandler
does not remove the reference to CryptoHandler in CPDF_Parser
and is dangling.
Later when the parser starts to parse the objects referenced in the Root dictionary, m_pCryptoHandler+8
is called to decrypt the data.
A similar bug was patched in pdfium in commit 741c362fb75fd8acd2ed2059c6e3e716a63a7ac8. See https://bugs.chromium.org/p/chromium/issues/detail?id=726503
PDFs allow embedding JS in the document which can be executed automatically if entered in OpenAction
of Catalog
type dictionary. Once we have JS execution we can spray objects in the process space so that we get to a predictable address where we'll write our ROP chain.
When a PDF document is signed in Foxit Reader, it uses plugins\jrsys\x86\jrsysMSCryptoDll.dll
from the installation directory to read the signed information which loads jrsysCryptoDll.dll on a static address of 0x10000000. This dll imports VirtualAlloc which makes it easier to execute payload. The attached exploit uses heap spraying to get a predictable memory layout and uses a rop chain for allocating an RWX page, copying and executing the payload.
This exploit was tested using Foxit Reader 9.0.1.1049 x86 running on MS Windows 7 Enterprise Build 7601 SP1 x86. The exploit requires the heap to be in a specific state, if the exploit fails, please try again. Please refer to the video demo. This vulnerability is also present in Foxit PDF Reader and Converter for Android too.
bitcoins.pdf is the crafted pdf that does the re-allocation of the freed memory and triggers the core bug. If you want to reproduce the crash in debugger, please enable Page Heaps for FoxitReader.exe and open bitcoins.pdf.
This crash was found by Cloudfuzz - A fuzzing platform developed at Payatu. Further analysis and exploitation was done by Sudhakar