-
Notifications
You must be signed in to change notification settings - Fork 1.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
PERF: Use bytearray instead of b"" in encode_pdfdocencoding #2325
Conversation
Since b"" is not mutable it causes python to allocate and deallocate memory repeatedly in the for loop which cause hang/long runtime when handle very large string. For example when using add_js to to add a very big javascript code.
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## main #2325 +/- ##
=======================================
Coverage 94.37% 94.37%
=======================================
Files 43 43
Lines 7660 7660
Branches 1515 1515
=======================================
Hits 7229 7229
Misses 267 267
Partials 164 164 ☔ View full report in Codecov by Sentry. |
Could you please update the title to use the recommended naming scheme? https://pypdf.readthedocs.io/en/latest/dev/intro.html#commit-messages |
@zuypt Do you have an example that shows the difference? (It could be a toy-example - I'm just curious :-) ) |
I'm too lazy if some one have permission please help |
just create a PdfWriter then call add_js with a super large string you will see. This is a pretty common python programming error. |
I've already adjusted the title |
shows:
|
@zuypt Thanks for your contribution! If you want, I can add you to https://pypdf.readthedocs.io/en/latest/meta/CONTRIBUTORS.html |
It will be part of the next release on Sunday. |
sure. Thanks for the recognition |
## What's new ### Bug Fixes (BUG) - Cope with deflated images with CMYK Black Only (#2322) by @pubpub-zz - Handle indirect objects as parameters for CCITTFaxDecode (#2307) by @stefan6419846 - check words length in _cmap type1_alternative function (#2310) by @Takher ### Robustness (ROB) - Relax flate decoding for too many lookup values (#2331) by @stefan6419846 - Let _build_destination skip in case of missing /D key (#2018) by @nickryand ### Documentation (DOC) - Note in reading form data (#2338) by @MartinThoma - Pull Request prefixes and size by @MartinThoma - Add https://github.com/zuypt for #2325 as a contributor by @MartinThoma - Fix docstring for RunLengthDecode.decode (#2302) by @stefan6419846 ### Maintenance (MAINT) - Enable `disallow_any_generics` and add missing generics (#2278) by @nilehmann ### Testing (TST) - Centralize file downloads (#2324) by @MartinThoma ### Code Style (STY) - Fix typo "steam" \xe2\x86\x92 "stream" (#2327) by @stefan6419846 - Run black by @MartinThoma - Make Traceback in bug report template uppercase (#2304) by @stefan6419846 [Full Changelog](3.17.1...3.17.2)
Since b"" is not mutable it causes python to allocate and deallocate memory repeatedly in the for loop which cause hang/long runtime when handle very large string. For example when using add_js to to add a very big javascript code.