Streams are cut off at Length which extracts incomplete files #809
Comments
Hi all, I have the same issue and came to the same conclusion. Any hope this can be fixed?
sbruyere added a commit to sbruyere/PdfPig that referenced this issue on May 21, 2024
…gth cutting off Streams
- Fix of Stream invalid Length issue causing stream data being cut off: fix UglyToad#809
- Improve Stream Token read performance by:
  - simplifying TryReadStream(), avoiding use of MemoryStream, with the benefit of the already existing Memory Span of "inputBytes"
  - removing the unnecessary List<>
BobLd pushed a commit that referenced this issue on May 31, 2024
…gth cutting off Streams (#838)
* Improve TryReadStream with simplification & fix of Stream Invalid Length cutting off Streams
  - Fix of Stream invalid Length issue causing stream data being cut off: fix #809
  - Improve Stream Token read performance by:
    - simplifying TryReadStream(), avoiding use of MemoryStream, with the benefit of the already existing Memory Span of "inputBytes"
    - removing the unnecessary List<>
* Add Stream with Invalid Length unit test
* Use of Memory<> instead of direct Span to avoid byte array allocation .ToArray. Suggestion from (https://github.com/UglyToad/PdfPig/pull/838/files/4153e4a1b421aee6158799175ced081c9f533a13#r1619509165)
Hi,
I stumbled upon an issue while using PdfPig for extracting attachments from a PDF file. I have attached a sample PDF to reproduce the error:
0_ZUGFeRD.pdf
After debugging through the code, I found that the /Length attribute of the stream is set to an incorrect value of 11417 bytes: there is no endstream at the expected position. Instead, more bytes follow, and only then comes endstream.
PdfPig currently seems to cut off all bytes beyond /Length, which seems reasonable at first. However, other PDF libraries we tested can handle this PDF file just fine, and so can Adobe Acrobat Reader itself.
Therefore, this seems to be an issue with PdfPig in my opinion.
The fix for me was to remove the if check in PdfTokenScanner.cs, line 437, and just use the else block, which tries to find endobj or endstream and cuts the stream off there. This works with the PDF I attached.
A possible cause of the wrong value is the special characters in the attachment, which might have led to this /Length value.
Would this be a valid fix in your opinion?
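To make the idea concrete, here is a minimal sketch of a less drastic variant: trust /Length only when endstream really follows at that offset, and otherwise fall back to scanning. It is written in Python for brevity and is not PdfPig's actual code; the function name `read_stream_data` and its parameters are made up for illustration, and unlike the else block it only searches for endstream, not endobj.

```python
ENDSTREAM = b"endstream"


def read_stream_data(buf: bytes, start: int, declared_length: int) -> bytes:
    """Return the stream bytes that begin at `start`.

    `start` is assumed to be the offset of the first byte after the
    `stream` keyword and its end-of-line marker (hypothetical helper,
    not part of PdfPig).
    """
    end = start + declared_length

    # Trust /Length only if `endstream` follows it, allowing for an
    # optional CR/LF between the data and the keyword.
    if declared_length >= 0 and end <= len(buf):
        tail = buf[end:end + len(ENDSTREAM) + 2]
        if tail.lstrip(b"\r\n").startswith(ENDSTREAM):
            return buf[start:end]

    # /Length is wrong: fall back to scanning for the next `endstream`.
    pos = buf.find(ENDSTREAM, start)
    if pos == -1:
        raise ValueError("no endstream keyword found")

    data = buf[start:pos]
    # Drop the single end-of-line marker that precedes `endstream`.
    if data.endswith(b"\r\n"):
        data = data[:-2]
    elif data.endswith((b"\n", b"\r")):
        data = data[:-1]
    return data
```

The caveat with the fallback is that binary stream data could contain the bytes `endstream` by coincidence, so keeping the /Length check as the first attempt and scanning only when it fails might be safer than removing the check entirely.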