
[feature request / bug] Resolution for masked generations #294

Closed
Pirog17000 opened this issue Jan 2, 2024 · 4 comments
Pirog17000 commented Jan 2, 2024

[image: 1 - generation in masked area with 60% intensity; 2 - originally generated background]

1 - make a 640x640 canvas and generate an image
2 - select a mask whose borders span the full canvas resolution, e.g. a '+' shape or similar
3 - generate with 50-65% intensity over the generated image
4 - observe a blurrier result

The solution here is to ramp up the resolution before doing the masked over-generation, but there is no control for that. What I'm asking for is a slider, like the one for upscale mode, with a few positions for generation:

auto (as now) - native resolution (as selected) - x2 resolution of selection - x3 resolution

This slider could be placed as an optional toggle in settings, under the main intensity slider. It would help a ton!


Acly commented Jan 2, 2024

Have you tried this "solution"? It's easy to test: just scale the image 2x before step (3) and scale it back to the original size afterwards.
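The manual test described above could be sketched roughly like this. This is only an illustration of the round trip, not the plugin's actual code: the resampling is a naive pure-Python stand-in for a real image library, and `generate` is a placeholder for the masked img2img pass.

```python
# Sketch of the manual workaround: upscale the canvas 2x, run the masked
# generation on the larger image, then downscale back to the canvas size.

def upscale_2x(img):
    """Nearest-neighbour 2x upscale of a 2D list of pixel values."""
    out = []
    for row in img:
        wide = [v for v in row for _ in (0, 1)]  # duplicate each column
        out.append(wide)
        out.append(list(wide))                   # duplicate each row
    return out

def downscale_2x(img):
    """Box-average 2x downscale (inverse of the step above)."""
    h, w = len(img), len(img[0])
    return [[(img[y][x] + img[y][x+1] + img[y+1][x] + img[y+1][x+1]) / 4
             for x in range(0, w, 2)]
            for y in range(0, h, 2)]

def refine_at_2x(img, generate):
    """Upscale, run the (placeholder) generation step, downscale back."""
    return downscale_2x(generate(upscale_2x(img)))

canvas = [[float(x + y) for x in range(4)] for y in range(4)]
result = refine_at_2x(canvas, lambda im: im)  # identity "generation"
assert result == canvas                        # size and content round-trip
```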

auto (as now) - native resolution (as selected)

Not sure what you mean - they are the same in this case.


I can see you get blurred results, but I cannot reproduce this with your steps, and also don't understand why you think this is a resolution issue.

If the original image was generated at 640 resolution and doesn't come out blurry, then 640 is clearly good enough to get sharp results. Why should this change when you do masked img2img? Why do you think increased resolution will fix it?

This is essentially the same as #223, and I'm working on some resolution related stuff at the moment. But running generation at x2/x3 - with 4 times / 9 times higher cost! - to then throw away most of the work and detail by immediately downscaling seems extremely wasteful. If you want the quality & detail, doesn't it make more sense to increase the canvas size and actually keep the high resolution results? You can still do a destructive downscale when you export the final image for web, but keep the high resolution version as working file.

The other way round makes more sense to me, you have a high-resolution canvas, but it takes too long / too much VRAM to do generation at full resolution. In that case automatic upscaling might be nice.

Pirog17000 (Author) commented

Thank you for your questions!

If the original image was generated at 640 resolution and doesn't come out blurry, then 640 is clearly good enough to get sharp results. Why should this change when you do masked img2img? Why do you think increased resolution will fix it?

I believe the original image generates at 1024x1024 and is downscaled to 640x640, because it's SDXL and that's how it works. Possibly.
Clarity grows drastically as I increase the canvas resolution, which is why I'm asking for this optional toggleable feature.

running generation at x2/x3 - with 4 times / 9 times higher cost! - to then throw away most of the work and detail by immediately downscaling seems extremely wasteful.

Yes, it uses more memory and time, but for sharp details in a slow-paced generation process that's something I'm ready to accept. Native resolution doesn't provide as much detail anyway.

If you want the quality & detail, doesn't it make more sense to increase the canvas size and actually keep the high resolution results?

Increasing the canvas size still leaves the same paradox: at 1:1 scale, the amount of detail is capped by the nature of the checkpoint used and the level of detail it can produce at that scale and resolution. That is the limitation I want to overcome with this feature.

Acly commented Jan 2, 2024

I believe the original image generates in 1024x1024 and downscales to 640x640 because it's SDXL

That explains why it ended up blurred, and actually demonstrates why the "feature" you are requesting is a bad idea. It's important to realize that the canvas size is always a hard limit on the quality you can achieve when you continuously work on an image. What happens in your scenario is:

  1. Initial image is generated at 1024 and downscaled to 640 - it looks sharp/detailed, but the quality of the original generation was lost and cannot be recovered.
  2. You do a 60% refine on the image. It will do a (bilinear) upscale to 1024 (pretty much exactly as you request!) - this results in blurred input because it's only an upscale of 640px.
  3. SDXL runs on the blurred input at 1024px and probably assumes the blurriness is desired (many photos have intentional blur)
  4. The result is downscaled to fit into your 640 canvas. It doesn't match the previous generation which was done at 1024 without blurred input (because it was initial gen) and looks bad in comparison.

So in a way you already get the behavior you ask for, but it does not lead to the results you imagine - rather the opposite. The actual solution here is simple: use a 1024 (or more) canvas and you won't have this issue.
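The lossy round trip in step (2) can be demonstrated with a toy example. This is not the plugin's code; it uses a 1D "scanline" with naive box-average downscaling and linear-interpolation upscaling to show that once detail is discarded by downscaling, upscaling cannot bring it back, so the model's input is inherently blurred.

```python
# Toy demonstration: downscale-then-upscale destroys high-frequency detail.

def downscale(signal, factor):
    """Box-average downscale, as when fitting a 1024 render into 640px."""
    return [sum(signal[i:i + factor]) / factor
            for i in range(0, len(signal), factor)]

def upscale_linear(signal, factor):
    """Linear-interpolation upscale, as done before the img2img pass."""
    out = []
    for i in range(len(signal) - 1):
        a, b = signal[i], signal[i + 1]
        out.extend(a + (b - a) * k / factor for k in range(factor))
    out.append(signal[-1])
    return out

# An alternating 0/1 pattern is maximal high-frequency detail.
original = [float(i % 2) for i in range(16)]
round_trip = upscale_linear(downscale(original, 2), 2)

# The round trip flattens the pattern to a constant 0.5: the detail is gone.
print(max(abs(a - b) for a, b in zip(original, round_trip)))  # prints 0.5
```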

Yes, it uses more memory and time, but for sharp details in a slow-paced generation process that's something I'm ready to accept

I get that. What I don't understand is why you want to deliberately throw away the majority of that generation process and choose to view and (what's worse) continue to work on a downscaled version!

Increasing the canvas size still leaves the same paradox: at 1:1 scale, the amount of detail is capped by the nature of the checkpoint used and the level of detail it can produce at that scale and resolution.

I have no idea where this idea is coming from. This is not how it works. The checkpoint does not scale to whatever resolution you pass in. It has only the resolution it was trained on, and sort of tiles/repeats outside of it.

It sounds like you think there is some kind of limitation that may be overcome with this feature as proposed. But I don't see how this is possible, you can 100% already do what you ask for manually by scaling the image.
