Max 1GB PDF output size re 1GB nodejs buffer limit #778

Closed
jakeg opened this issue Jun 7, 2016 · 12 comments

Comments

@jakeg
Contributor

jakeg commented Jun 7, 2016

We're creating PDFs with node-canvas and I've found the following error when generating particularly large files:

FATAL ERROR: v8::Object::SetIndexedPropertiesToExternalArrayData() length exceeds max acceptable value
Aborted (core dumped)

I think this is because the PDF buffer generated from canvas.toBuffer() exceeds 1GB as per:
http://stackoverflow.com/questions/8974375/whats-the-maximum-size-of-a-node-js-buffer

This seems quite likely, as it's a 120 page PDF with multiple high-resolution images in it!

If this is the problem, is there a way to stream the PDF output instead, increase node's 1GB limit, or any other potential solutions?
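
As a rough way to confirm the suspicion above, the sketch below reads the Buffer limit of whichever Node.js runtime it runs on and compares an estimated output size against it. `fitsInSingleBuffer` and the 10 MiB-per-page figure are assumptions made for illustration; neither comes from node-canvas. `require('buffer').kMaxLength` is only exposed on newer Node.js releases, so this would not run on v0.10.

```javascript
// Sketch: compare an estimated output size with this runtime's Buffer limit.
const { kMaxLength } = require('buffer');

// Hypothetical helper (not part of node-canvas): true when a payload of
// `estimatedBytes` could fit in one Buffer on the current runtime.
function fitsInSingleBuffer(estimatedBytes) {
  return Number.isFinite(estimatedBytes) &&
    estimatedBytes >= 0 &&
    estimatedBytes <= kMaxLength;
}

// Assumed figure for illustration only: 120 pages at roughly 10 MiB each.
const estimate = 120 * 10 * 1024 * 1024;

console.log('Buffer limit (bytes):', kMaxLength);
console.log('Estimate fits in one Buffer:', fitsInSingleBuffer(estimate));
```

The check only says whether a single Buffer could hold the result; it says nothing about whether the process has enough memory overall.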

@zbjornson
Collaborator

The 1 GB limit was thankfully removed quite a while ago. Which version of node are you using? Are you able to try a newer version?

@jakeg
Contributor Author

jakeg commented Jun 7, 2016

Ahh, was this removed from nodejs or node-canvas (I'm presuming the former)? Do you know which version got the bump? I'm on an old version of both as an upgrade right now is rather risky in our business calendar :/

@jakeg
Contributor Author

jakeg commented Jun 7, 2016

v0.10.40 at present unfortunately

@zbjornson
Collaborator

zbjornson commented Jun 7, 2016

It was removed from node (v8 rather) maybe around 0.12. In a bit I can find the exact version. (Roughly one year ago.)

I haven't used the PDF part of this lib, but I don't remember seeing a streaming API for PDFs. Worst case perhaps you could fork a child process to run newer node just for PDF rendering?
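
A minimal sketch of the child-process idea, assuming a second Node.js binary is installed somewhere on the machine. `runWithNode` is a hypothetical helper, and `process.execPath` merely stands in for the path to the newer binary so the example can run anywhere.

```javascript
// Sketch: run a script under a different Node.js binary and collect its stdout.
const { spawnSync } = require('child_process');

// Hypothetical helper: launch `nodeBinary` with `scriptArgs`, wait for it to
// exit, and return what it wrote to stdout.
function runWithNode(nodeBinary, scriptArgs) {
  const result = spawnSync(nodeBinary, scriptArgs, { encoding: 'utf8' });
  if (result.error) {
    throw result.error; // e.g. the binary was not found
  }
  if (result.status !== 0) {
    throw new Error('child exited with status ' + result.status + ': ' + result.stderr);
  }
  return result.stdout;
}

// process.execPath is a placeholder for the path to a newer Node.js install.
const version = runWithNode(process.execPath, ['-e', 'process.stdout.write(process.version)']);
console.log('child ran on', version);
```

In a setup like this the child would presumably write the PDF to disk itself and report only a short status line on stdout, since piping a multi-gigabyte file back through the parent would run into the same size limits.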

@zbjornson
Collaborator

Looks like it was in v3.0 actually, when v8 4.4 was introduced. https://github.com/nodejs/node/blob/v3.0.0/CHANGELOG.md

@jakeg
Contributor Author

jakeg commented Jun 7, 2016

Thanks Zach, just saw that myself too. I'm going the whole hog and upgrading to node v4.4.5 and am currently going through all dependencies etc and getting them up to date in my project. Quite a process! Fingers crossed this actually solves my original problem.

@jakeg
Contributor Author

jakeg commented Jun 8, 2016

Managed to update to v4.4.5. Still getting an error, but wording is slightly different now:

node: ../node_modules/nan/nan.h:695: Nan::MaybeLocal<v8::Object> Nan::CopyBuffer(const char*, uint32_t): Assertion `size <= imp::kMaxLength && "too large buffer"' failed.
Aborted (core dumped)

@zbjornson
Collaborator

Hrm, that's sort of a bug in nan. For recent versions of v8 on x64 platforms it should be 0x7FFF FFFF (i.e. approx 2 GB). (I just asked about a possible solution: nodejs/nan#573.) You could tweak that line in your nan.h file and rebuild, if 2 GB is sufficient.

Also ... my apologies, I mixed up the total node memory limit, ArrayBuffers and Buffers. The total memory limit and ArrayBuffers can now be quite large; Buffers and TypedArrays are still limited to ~2 GB. (v8 bug, v8 bug) Sorry to send you on a wild goose chase. :(
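
For reference, the arithmetic behind the two figures can be checked as below. `0x7FFFFFFF` is the value quoted above; `0x3FFFFFFF` is an assumption about the older limit, chosen because it matches the roughly 1 GB figure discussed earlier in the thread.

```javascript
// Sketch: convert the hexadecimal limits to byte counts and GiB.
const GIB = Math.pow(2, 30);

const LIMIT_1GB = 0x3FFFFFFF; // assumed older limit: 2^30 - 1 bytes
const LIMIT_2GB = 0x7FFFFFFF; // value quoted above: 2^31 - 1 bytes

// Convert a byte count to GiB (2^30 bytes).
function toGiB(bytes) {
  return bytes / GIB;
}

console.log(LIMIT_1GB, 'bytes is just under', Math.ceil(toGiB(LIMIT_1GB)), 'GiB');
console.log(LIMIT_2GB, 'bytes is just under', Math.ceil(toGiB(LIMIT_2GB)), 'GiB');
```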

Could you manually "stream" this by doing a page at a time?

@jakeg
Contributor Author

jakeg commented Jun 9, 2016

Could you manually "stream" this by doing a page at a time?

Ideally not. Say we have 120 pages that all use the same fonts and the same background image. If we did 10 pages at a time into separate PDFs, each one would have to embed the fonts and the background image, whereas a single PDF would include them only once for the whole document. It's possible that a program joining the PDFs back together would intelligently remove the duplication; I'm not sure. I presume this is what you meant by manually "stream"?
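
The batching described here could be planned with something like the sketch below. `pageBatches` is a hypothetical helper that only computes page ranges; rendering each range to its own PDF and merging the results are left out, and the duplicated fonts and images mentioned above would still apply to each part.

```javascript
// Sketch: split pages 1..pageCount into consecutive batches of batchSize.
function pageBatches(pageCount, batchSize) {
  if (!Number.isInteger(pageCount) || pageCount < 0) {
    throw new RangeError('pageCount must be a non-negative integer');
  }
  if (!Number.isInteger(batchSize) || batchSize < 1) {
    throw new RangeError('batchSize must be a positive integer');
  }
  const batches = [];
  for (let first = 1; first <= pageCount; first += batchSize) {
    // The final batch may be shorter than batchSize.
    batches.push({ first: first, last: Math.min(first + batchSize - 1, pageCount) });
  }
  return batches;
}

// 120 pages in batches of 10, as in the example above.
const batches = pageBatches(120, 10);
console.log(batches.length, 'batches; last covers pages',
  batches[batches.length - 1].first, 'to', batches[batches.length - 1].last);
```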

Looking at nodejs/nan#573 it looks like @kkoopa sees the problem with node-canvas rather than nan. I confess to knowing almost nothing about streaming, Buffers or TypedArrays but is there a way node-canvas can be changed for PDFs so that a stream is available? I'm happy to pay for such work but don't have the skills or know-how to attempt it myself. Each year we have at least one or two books over 1GB so would be very thankful for a solution to this (yesterday I ended up outputting the book in two halves manually instead).

@LinusU
Collaborator

LinusU commented Jun 9, 2016

It would be awesome if we could stream it! I don't know what APIs we have to work with though, it could be very easy to implement, or it could require us to change pdf rendering engine... I'll have a look when I get off work 👍

@kkoopa
Contributor

kkoopa commented Jun 9, 2016

https://cairographics.org/manual/cairo-PDF-Surfaces.html#cairo-pdf-surface-create-for-stream


@kkoopa
Contributor

kkoopa commented Jun 9, 2016

I made a crude reference implementation.
kkoopa@5353045

It works, but it still keeps the entire PDF in memory, so it is less than ideal; refining that would require more drastic rewrites of the library. However, the data is not all kept in a single Buffer, so it avoids that problem. It also avoids unnecessary copying, unlike the other streams.
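
To illustrate why chunked output sidesteps the Buffer limit, here is a rough sketch of the consuming side. It is not the code from the commit above: `writeChunksToFile` and `fakePdfChunks` are hypothetical, and the generator merely stands in for whatever would deliver pieces of the PDF as cairo's stream write callback produces them.

```javascript
// Sketch: append chunks to a file one at a time, never joining them in memory.
const fs = require('fs');
const os = require('os');
const path = require('path');

// Hypothetical helper: write each chunk to `filePath` as it arrives and
// return the total number of bytes written.
function writeChunksToFile(filePath, chunks) {
  const fd = fs.openSync(filePath, 'w');
  let total = 0;
  try {
    for (const chunk of chunks) {
      let offset = 0;
      // writeSync may write fewer bytes than requested, so loop until done.
      while (offset < chunk.length) {
        offset += fs.writeSync(fd, chunk, offset, chunk.length - offset);
      }
      total += chunk.length;
    }
  } finally {
    fs.closeSync(fd);
  }
  return total;
}

// Stand-in chunk source; not real PDF data.
function* fakePdfChunks() {
  yield Buffer.from('%PDF-1.4\n');
  yield Buffer.from('placeholder page data\n');
  yield Buffer.from('%%EOF\n');
}

const target = path.join(os.tmpdir(), 'chunked-output-' + process.pid + '.pdf');
console.log('bytes written:', writeChunksToFile(target, fakePdfChunks()));
fs.unlinkSync(target); // clean up the demo file
```

Each chunk stays well under the per-Buffer limit, so only the file on disk ever reaches the full size.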
