Library not available: "Cannot close object, library is destroyed..." #281

Closed
2 tasks done
sidharthrajaram opened this issue Nov 23, 2023 · 4 comments · Fixed by #282

@sidharthrajaram

Package origin

  • I confirm to be using an official package of pypdfium2 from PyPI or GitHub/pypdfium2-team.

Description

I'm using Nougat OCR, which makes use of pypdfium2. However, when Nougat calls into pypdfium2, a warning message is printed, followed by a crash. The warning message reads:

-> Cannot close object, library is destroyed. This may cause a memory leak!

In the pypdfium2 code, this message seems to be printed when the library is not available. The possible memory leak it mentions is probably what leads to the crash of Nougat OCR.

So, under what circumstances is this message printed? What does it mean for the library to not be available?

Install Info

pypdfium2 4.24.0
pdfium 121.0.6110.0

Python 3.10.13 (main, Aug 25 2023, 13:20:03) [GCC 9.4.0]

Linux-5.15.0-67-generic-x86_64-with-glibc2.31

Name: pypdfium2
Version: 4.24.0
Summary: Python bindings to PDFium
Home-page: https://github.com/pypdfium2-team/pypdfium2
Author: pypdfium2-team
Author-email: geisserml@gmail.com
License: (Apache-2.0 OR BSD-3-Clause) AND LicenseRef-PdfiumThirdParty
Location: /home/ubuntu/extraction/frontieralpha/lib/python3.10/site-packages
Requires: 
Required-by: nougat-ocr

Validity

  • I confirm that I ran all commands, and pasted the whole output.
@mara004
Member

mara004 commented Nov 23, 2023

See facebookresearch/nougat#162 (comment) and facebookresearch/nougat#110 (comment)

> In the pypdfium2 code, this message seems to be printed when the library is not available.

pdfium needs to be initialized before using any of its APIs, and destroyed when finished. This error means that pdfium was, for some reason, destroyed while there were still live objects. The crash might then be due to a use-after-free, or just memory overload or something similar.

The odd thing is that pdfium is destroyed by an exit handler (atexit.register()), which normally should be called only after all objects have been finalized and the program is about to terminate.
I believe this might have some obscure multiprocessing reasons, perhaps related to the fork context?
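
To make the ordering problem concrete, here is a toy model of the lifecycle described above. It is only a sketch: `ToyLibrary`, `ToyHandle`, and `run` are hypothetical names invented for illustration, and this is not how pypdfium2 is actually implemented. It just shows why closing an object after the library has been destroyed can only produce a warning.

```python
class ToyLibrary:
    """Hypothetical stand-in for a native library with explicit init/destroy."""

    def __init__(self):
        self.alive = False
        self.log = []  # collects messages instead of printing them

    def init(self):
        self.alive = True

    def destroy(self):
        # A real binding would typically register this with atexit.register().
        self.alive = False


class ToyHandle:
    """Hypothetical object whose close() needs the library to still be alive."""

    def __init__(self, lib, name):
        if not lib.alive:
            raise RuntimeError("library must be initialized first")
        self.lib = lib
        self.name = name
        self.closed = False

    def close(self):
        if self.closed:
            return
        if not self.lib.alive:
            # Too late: the native close function can no longer be called safely.
            self.lib.log.append(f"Cannot close {self.name}, library is destroyed.")
            return
        self.lib.log.append(f"Closed {self.name}.")
        self.closed = True


def run(destroy_first):
    """Return the messages produced for one of the two possible orderings."""
    lib = ToyLibrary()
    lib.init()
    handle = ToyHandle(lib, "document")
    if destroy_first:
        lib.destroy()   # library torn down while the handle is still live
        handle.close()
    else:
        handle.close()  # expected order: objects first, library last
        lib.destroy()
    return lib.log


if __name__ == "__main__":
    print(run(destroy_first=False))
    print(run(destroy_first=True))
```

In the expected order the handle is closed cleanly; in the reversed order the handle can only report that it was leaked.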


Anyway, as said in the other threads, nougat should just use linear rendering, then everything should work fine.

@mara004
Member

mara004 commented Nov 23, 2023

What worsens the problem is that nougat is doing PNG saving, which is slow, so lots of transferred bitmaps produced by the pool implicitly queue up in memory over time, which can cause enormously high memory loads.
Again, the solution is to avoid pypdfium2's deprecated multi-page renderer.

FWIW I think this warning itself may not be that significant (I doubt the leaked object handles are the main issue here); it's just a symptom of a big conceptual mistake rooted in the deprecated API.
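
For reference, linear (page-by-page) rendering could look roughly like the sketch below. It assumes pypdfium2 v4 and Pillow are installed. The function names `render_pages_linearly` and `page_filename`, the file naming scheme, and the default `scale` are assumptions made for illustration, not code taken from nougat or pypdfium2. Only one bitmap is alive at a time, so slow PNG saving cannot cause rendered pages to pile up in memory.

```python
from pathlib import Path


def page_filename(index):
    """Hypothetical naming scheme: zero-padded, 1-based page numbers."""
    return f"page_{index + 1:04d}.png"


def render_pages_linearly(pdf_path, out_dir, scale=2.0):
    """Render each page in turn and save it before moving on to the next."""
    # Imported here so the helper above can be used without pypdfium2 installed.
    import pypdfium2 as pdfium

    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)

    pdf = pdfium.PdfDocument(str(pdf_path))
    written = []
    try:
        for index in range(len(pdf)):
            page = pdf[index]
            try:
                bitmap = page.render(scale=scale)
                image = bitmap.to_pil()  # requires Pillow
                target = out_dir / page_filename(index)
                image.save(target)  # the slow step, done before rendering more
                written.append(target)
            finally:
                page.close()
    finally:
        pdf.close()
    return written
```

A call such as `render_pages_linearly("input.pdf", "pages")` would write one PNG per page into the `pages` directory, where both paths are placeholders.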

mara004 linked a pull request on Nov 24, 2023 that will close this issue
@mara004
Member

mara004 commented Dec 7, 2023

@sidharthrajaram I just merged #282 via 6024f15, which replaces PdfDocument.render() with a linear version, to do our share to contain the damage. This means that updating to the next release (scheduled for 2023-12-10) should resolve the issue for v4-compatible dependants such as nougat, even without downstream action.

See also the upcoming changelog for a writeup outlining the problem.

@sidharthrajaram
Author

Hey @mara004, thanks for the help and the great notes on the issues across this repo and on the nougat repo. Page-level linear rendering has been working fine since the initial change, but it's good to know this fix was merged.
