[gatsby-plugin-sharp] 'Generating image thumbnails' takes a long time. Option to skip transformations #25827
Comments
We already get image compression and resizing via sharp; layering another alternative into the mix doesn't bring any value. In fact, we recently fixed a bug with webp, where someone had slipped in an additional compression pass and the quality setting would compound.
They're only smaller on disk; images need to be decompressed into memory for actual image operations AFAIK, so the pixel count determines memory use, the same as for images whose colour/quality has been reduced via quantization and similar techniques. It's better to reduce the pixel count. I had 150 MB for ~20 images from Unsplash; after reducing them to at most 2k, along with image compression, the on-disk size was down to 6 MB. That reads into memory faster, and the processing is also faster because the images are 25% or less of the original size in pixels (which uncompressed is also ~40 MB).
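The memory math above can be sketched quickly: a decoded image occupies roughly width × height × 4 bytes (RGBA), regardless of how well it was compressed on disk. The dimensions below are illustrative assumptions, not figures from the thread:

```javascript
// Approximate memory needed once an image is decoded (RGBA = 4 bytes per pixel).
// On-disk compression (JPEG/WebP quality) does NOT reduce this; only fewer pixels do.
function decodedBytes(width, height) {
  return width * height * 4;
}

// Hypothetical example: a large source photo vs. the same photo capped at 2000px wide.
const originalMB = decodedBytes(6000, 4000) / 1e6; // 96 MB decoded
const resizedMB = decodedBytes(2000, 1333) / 1e6;  // ~10.7 MB decoded
```

This is why reducing pixel count helps both load and processing time, while quality settings mostly help transfer and disk size.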
A warning about imgix: Unsplash apparently uses them and refers to their API for image transformations. When I used that (via Unsplash) I noticed images would sometimes lose tone mapping (or it looked that way), in that warm tones were swapped for cold ones; I guess there was either a bug or a setup issue that stripped away some of the image's information. I did not encounter that when I rewrote the transformations as a small sharp JS script to pre-process locally.
Probably a good way to go, and IIRC Gatsby Cloud offloads sharp processing to Google Cloud Functions, presumably with some sort of caching system. Not sure what performance issues you're referring to; maintenance can be less of a burden if it's a community-maintained open-source project. I'm sure something must exist out there; it might just need a Gatsby plugin for a smooth build process. Personally, while…
Seems like a good approach to me, but I'm not sure you can just return the src. What about base64 placeholders (or others, like SVG?) and image sizes? You might want to choose a better name, like…
I think this is pretty much the same problem discussed in that issue. Perhaps chime in there with your suggestion to just return the image src and skip processing.
@polarathene thank you for your thoughtful feedback on this. Much appreciated!
What I mean is to take the original image assets and compress them through the tool. Gatsby does compression and resizing through `sharp`.
That's a fair point. I think if an image is, let's say, 10 MB, then I'd resize it rather than try to compress it.
I'm with you. I think by saying to compress the image I had in mind both the pixel count as well as its weight. Bad choice of terms on my part.
I think the overall benefit of taking image processing to an external service would outweigh an occasional bug, such as this. For them it's their bread & butter, so I hope I'm not naive in thinking that imgix would be on top of such issues.
What I had in mind is caching and load speed. However, this may well be unfounded. I'd have to spike something like this to understand the upsides and downsides of this approach.
That's certainly something to investigate, but I wonder whether the overhead of this sort of implementation would be equal to other alternatives, such as imgix or serverless functions for image sourcing and transformation.
I haven't considered this for every use case. I went through the main script file, which does a lot, and I think one would have to choose an if/else escape hatch carefully, so as not to have to place it in 20 locations. I feel that this flag should just return something like:
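A minimal sketch of the shape such a flag could return for a fluid query. This is purely hypothetical; the field names mirror what gatsby-image expects, but the function and its behaviour are invented for illustration:

```javascript
// Hypothetical: what a "skip processing" flag might return for a fluid query.
// Every field points at the unprocessed source file; no sharp work happens.
function mockFluid(srcPath, width, height) {
  return {
    aspectRatio: width / height,
    src: srcPath,                   // the original, unresized image
    srcSet: `${srcPath} ${width}w`, // a single entry, no generated variants
    sizes: '100vw',
    base64: undefined,              // no placeholder is generated
  };
}
```

Consumers querying `fluid` would still get a valid-looking object, just without any of the resized variants.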
And that's it. So no fancy loading or anything like that. It's just a way to have your application spin up fast. It doesn't seem like a big price to pay compared to the alternative, which would need a bunch of GraphQL query changes. You kind of want this flag to do the magic for you and work out of the box. And I certainly wouldn't recommend it as a default, as this only becomes a problem once you get to larger-scale projects.
Naming is hard, and something like…
This is definitely a more elegant suggestion: a totally different cache for images. I like this even better than…
Thanks for the link, I'll do that 👍
Yes, but this would suffer the same issue. What you want to do is pre-process your images; there is a discussion about documenting this for the community as a guide here, although it seems to have stalled. I have an example script here, with my source images committed to the repo via git-lfs, so I can tweak the script and always have the original source available in case I want to change things such as crop or dimensions. The output I also commit via git-lfs, in another directory that my Gatsby project uses. That reduced ~150 MB of image data down to 6 MB.

Netlify and Vercel cannot use git-lfs, and AFAIK Netlify's Large Media alternative isn't available during builds, so I have Cloudinary as an image host that a separate branch of that project uses with a remote-images plugin.

Absolutely, you should pre-process the original images to optimize them. I think the bigger issue you're wanting to work around, though, is the unnecessary cache flushing, which leads to…
They're problems that need improvements, I agree :)
That depends on where it's being used. On a local machine for development, you wouldn't have to worry about network traffic, which can be problematic for some people due to poor connections: not only slow, but prone to failed transfers. With Gatsby's current handling, that seems to result in clearing the cache and starting again just because one of N images failed, yet was considered successful because the partial download was still a processable image, even if it's missing the bottom half of its pixels.

Serverless functions are nice, but in some situations they may run over free-tier limits and incur costs; if there's no cache layer, you also end up paying for processing you shouldn't need to. Image API services are good if the network connection is reliable and the service meets your needs, but they have similar drawbacks to serverless functions. Few offer compatibility with gatsby-image, requiring you to do more DIY work or collaborate on a new plugin to build the fluid/fixed data objects that gatsby-image expects; otherwise you're just downloading remote images and still processing with `sharp` locally.
Well, if it's effectively an img element instead of the proper gatsby-image component, that's going to create potential surprises during development/deploy unless another step is used to test it; users already raise issues about the disparity between development mode and SSR at deploy causing React hydration surprises. I would suggest resolving the cache issue instead; then everything works well, provided an initial processing stage is acceptable. In the linked related issue, that's not acceptable for some users, so a better solution for them would probably be to make the image-handling difference more explicit by providing some mock image data, or, if that is a problem, just supplying the same URI of the source image for each size plus some generic base64 placeholder (or omitting that entirely).
I don't see … Alternatively, you could wrap the …
Thanks for the issue and discussion so far, but this is a duplicate of #24822 and thus I'll close this. Please leave your comments there, thanks!
This is pretty interesting. I wasn't even aware of Git Large File Storage.
Absolutely agree with you 👍
That's certainly a drawback. This solution would create divergence between production and development.
This would be ideal - a separate caching layer for images that are this expensive to produce. Having said that, people with tens of thousands of images that take hours to resize would not benefit, as the cache would still occasionally need to be flushed, and waiting for hours on end is a non-starter.
Not only that, but this would require changing GraphQL queries as well.
I'm looking for ways to improve the build times of my e-commerce store. Currently it takes around 35 mins for my build to complete on an AWS general1.medium (7 GB memory, 4 vCPU). The largest chunk of time is 'Generating image thumbnails'.
Summary
For a while I've been working on a project that has a lot of images.
One of the biggest pain points is seeing 'Generating image thumbnails' in my terminal.
It takes 12 minutes for a build on my project to complete when Gatsby flushes the cache.
The cache is flushed:
I bumped Gatsby version in the project to take advantage of Jobs API V2, which runs 'Generating image thumbnails' in parallel with page queries. That barely makes a dent in build times.
I considered other options:

1. Moving image resizing out to a third-party service, such as imgix. Then I could remove `gatsby-plugin-sharp`, append transformations to my image URLs and get transformed images that way. The downside: additional costs and complexity in setup.
2. Using `serverless-sharp`. Again, I could remove `gatsby-plugin-sharp` and then rely on AWS Lambda to handle image requests, run transformations on them and cache them. It's kind of a do-it-yourself imgix. The downside is all the problems associated with performance, maintenance and a bunch of other things I can't even think of.
3. Using a CMS that has imgix integration. However, that's one of those high-effort, high-impact options. Maybe one day.
4. Adding another dependency, ImageOptim-CLI (see https://github.com/JamieMason/ImageOptim-CLI), to squeeze any unresized/uncompressed original images, running it once for all images, and adding a background task to resize and compress any newly added images. This way the committed images would get squashed, so every time an image has to go through the resizing process, the number of bytes passing through the `sharp` library would be smaller. The downside is that this is an improvement rather than a solution to the problem.
5. Conditionally setting a different `gatsby-config`, so that in `NODE_ENV=development` I don't have `gatsby-plugin-sharp`, whereas in `NODE_ENV=production` I do. The downside is that this would require changes to every GraphQL query: each query would have to return just `src` in `development` and the current version in `production`. This could be done via GraphQL fragments and conditional queries, but it's quite ugly and introduces divergence in the code. And then every place that consumes it, e.g. `<Img fluid={data.something.childImageSharp.fluid} />`, would have to be handled by some sort of function that returns `src` in `development` and the current implementation in `production`. This seems like a lot of overhead just to bypass time-consuming image resizing.
6. An option for `gatsby-plugin-sharp` to return the original image src without doing the time-consuming resizing when in `development`. This way my earlier-mentioned GraphQL would still work, but every value would be the same unresized image src. It's a sort of bypass without breaking the application. It could look something like this:

I'd happily invest some of my time to work on option 6, but first I'd like to hear Gatsby contributors' thoughts on this issue, as I don't know: …

I look forward to hearing your thoughts on this.
Basic example
Motivation
To speed up the developer experience when running Gatsby locally.