DocumentFormatException: Expected name as dictionary key, instead got: Collaborative #791

Closed
kirk-marple opened this issue Mar 9, 2024 · 1 comment
Labels
document-reading Related to reading documents


kirk-marple commented Mar 9, 2024

I found a file that throws a PdfDocumentFormatException when opened. It comes from a set of sample documents and is known to cause issues, but I thought it might be worth fixing, since the exception is thrown even though I have lenient parsing enabled.

Got the error:
Expected name as dictionary key, instead got: Collaborative

Using Nuget:

Here's the file which doesn't open correctly:
invalid-pdf-structure-pdfminer-entire-doc.pdf

UglyToad.PdfPig.Core.PdfDocumentFormatException:
   at UglyToad.PdfPig.Tokenization.DictionaryTokenizer.ConvertToDictionary (UglyToad.PdfPig.Tokenization, Version=0.1.9.0, Culture=neutral, PublicKeyToken=605d367334e74123)
   at UglyToad.PdfPig.Tokenization.DictionaryTokenizer.TryTokenizeInternal (UglyToad.PdfPig.Tokenization, Version=0.1.9.0, Culture=neutral, PublicKeyToken=605d367334e74123)
   at UglyToad.PdfPig.Tokenization.DictionaryTokenizer.TryTokenize (UglyToad.PdfPig.Tokenization, Version=0.1.9.0, Culture=neutral, PublicKeyToken=605d367334e74123)
   at UglyToad.PdfPig.Tokenization.Scanner.CoreTokenScanner.MoveNext (UglyToad.PdfPig.Tokenization, Version=0.1.9.0, Culture=neutral, PublicKeyToken=605d367334e74123)
   at UglyToad.PdfPig.Tokenization.Scanner.PdfTokenScanner.MoveNext (UglyToad.PdfPig, Version=0.1.9.0, Culture=neutral, PublicKeyToken=605d367334e74123)
   at UglyToad.PdfPig.Tokenization.Scanner.PdfTokenScanner.Get (UglyToad.PdfPig, Version=0.1.9.0, Culture=neutral, PublicKeyToken=605d367334e74123)
   at UglyToad.PdfPig.Parser.Parts.DirectObjectFinder.Get (UglyToad.PdfPig, Version=0.1.9.0, Culture=neutral, PublicKeyToken=605d367334e74123)
   at UglyToad.PdfPig.Parser.DocumentInformationFactory.Create (UglyToad.PdfPig, Version=0.1.9.0, Culture=neutral, PublicKeyToken=605d367334e74123)
   at UglyToad.PdfPig.Parser.PdfDocumentFactory.OpenDocument (UglyToad.PdfPig, Version=0.1.9.0, Culture=neutral, PublicKeyToken=605d367334e74123)
   at UglyToad.PdfPig.Parser.PdfDocumentFactory.Open (UglyToad.PdfPig, Version=0.1.9.0, Culture=neutral, PublicKeyToken=605d367334e74123)
   at UglyToad.PdfPig.Parser.PdfDocumentFactory.Open (UglyToad.PdfPig, Version=0.1.9.0, Culture=neutral, PublicKeyToken=605d367334e74123)
   at UglyToad.PdfPig.PdfDocument.Open (UglyToad.PdfPig, Version=0.1.9.0, Culture=neutral, PublicKeyToken=605d367334e74123)
BobLd added a commit to BobLd/PdfPig that referenced this issue Mar 10, 2024
Collaborator

BobLd commented Mar 10, 2024

@kirk-marple thanks a lot for providing the sample PDF. I've created a PR with the fix: it avoids throwing this exception when lenient parsing is enabled.
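
For anyone on a build that predates the fix, here is a minimal sketch of opening a file with lenient parsing while still guarding against this exception. The class and method names (`LenientOpenExample`, `TryCountPages`) are hypothetical, and it assumes `ParsingOptions.UseLenientParsing` and `PdfDocumentFormatException` behave as in PdfPig 0.1.9, the version in the stack trace above.

```csharp
using System;
using UglyToad.PdfPig;
using UglyToad.PdfPig.Core;

// Hypothetical helper, for illustration only.
public static class LenientOpenExample
{
    // Returns the page count, or -1 if the document could not be parsed.
    public static int TryCountPages(string path)
    {
        // Ask PdfPig to tolerate recoverable format errors where it can.
        var options = new ParsingOptions { UseLenientParsing = true };

        try
        {
            using (PdfDocument document = PdfDocument.Open(path, options))
            {
                return document.NumberOfPages;
            }
        }
        catch (PdfDocumentFormatException ex)
        {
            // Malformed files can still surface here on versions without the fix.
            Console.Error.WriteLine($"Could not parse '{path}': {ex.Message}");
            return -1;
        }
    }
}
```

Calling `LenientOpenExample.TryCountPages("invalid-pdf-structure-pdfminer-entire-doc.pdf")` lets a batch job skip a malformed file instead of aborting.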

BobLd self-assigned this Mar 10, 2024
BobLd added the document-reading label Mar 10, 2024
BobLd closed this as completed in acfe8b5 Mar 11, 2024