
Scene-to-Image Prompt Layering System #1179

Merged: 60 commits into Sygil-Dev:dev, Oct 2, 2022

Conversation

@xaedes (Contributor) commented Sep 16, 2022

Summary of the change

  • new Scene-to-Image tab
  • new scn2img function
  • functions for loading and running monocular_depth_estimation with tensorflow

Description


Related to discussion #925

Would it be possible to have a layers system where we could have foreground, mid, and background objects which relate to one another and share the style? So we could generate a landscape, on another layer generate a castle, and on another layer generate a crowd of people.

To make this work I made a prompt-based layering system in a new "Scene-to-Image" tab.
You write a multi-line prompt that looks like markdown, where each section declares one layer.
It is hierarchical, so each layer can have its own child layers.

Examples: https://imgur.com/a/eUxd5qn

In the frontend you can find brief documentation of the syntax, examples, and a reference for the various arguments.

Here is a short summary:

Sections with a "prompt" and child layers are img2img; without child layers they are txt2img.
Without a "prompt" they are just images, useful for mask selection, image composition, etc.
Images can be initialized with "color", resized with "resize", and positioned with "pos".
Rotation and rotation center are set with "rotation" and "center".

Masks can be selected automatically by color or by estimated depth, based on https://huggingface.co/spaces/atsantiago/Monocular_Depth_Filter.
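
For illustration, a scene prompt might look roughly like this (layer names and values are made up; the exact syntax and full argument reference are in the frontend documentation):

```
# scene
size: 512,512

## background
prompt: wide fantasy landscape at sunset

## castle
prompt: a castle on a hill
pos: 128,64
rotation: 0
```

Here "background" and "castle" have a prompt but no children, so they are txt2img layers; "scene" has no prompt, so it only composes its children into one image.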

Additional dependencies that are required for this change

For mask selection by monocular depth estimation, TensorFlow is required and the model must be cloned to ./src/monocular_depth_estimation/.
Changes in environment.yaml:

  • einops>=0.3.0
  • tensorflow>=2.10.0

einops must be allowed to be newer for TensorFlow to work.

Checklist:

- [x] I have changed the base branch to dev
- [x] I have performed a self-review of my own code
- [x] I have commented my code in hard-to-understand areas
- [x] I have made corresponding changes to the documentation

…ing Sygil-Dev#947.

Also fixes a bug in image display when using masked image restoration with RealESRGAN.

When the image is upscaled using RealESRGAN, the image restoration cannot use the
original image because it has the wrong resolution. In this case the image restoration
will restore the non-regenerated parts of the image with a RealESRGAN-upscaled
version of the original input image.

Modifications from GFPGAN or color correction in (un)masked parts are also restored
to the original image by mask blending.
The scene is described by a prompt scene description language.
Prompts and image generation parameters can be specified in code.
The scene is based on image layers that are blended into a final image.
Layers can be recursive.
Layer content can be generated from txt2img prompts, or from other user content.
Layer blending can use common blending methods (like additive,
multiplicative, etc.), optionally together with img2img.
Other user content could be retrieved from URLs, or we could provide a few
code-accessible gradio image slots where users can upload and manipulate images.
This allows deep trees while avoiding a growing number of '#' characters: a '#> example title' heading bumps the depth of the sections after it by the number of matched '>' characters.
… size

It uses the default size from img2img_defaults.
Now allows the common pattern where you can toggle which of two sections is commented out.

Example, "Section A" is active and "Section B" is commented out:

```
//*
Section A
/*/
Section B
//*/
```

Example, "Section B" is active and "Section A" is commented out:

```
/*
Section A
/*/
Section B
//*/
```
The cache is stored in RAM, not in VRAM.
It is cleared automatically when generating with a different seed in the frontend.
This allows automatically masking objects, foreground, or background in generated images.

The depth image is estimated from images using monocular-depth-estimation:

https://huggingface.co/keras-io/monocular-depth-estimation
https://huggingface.co/spaces/atsantiago/Monocular_Depth_Filter
Headings in the format "#< title" reduce the depth of the following sections by the number of "<" characters.
This is useful to get back out of a deep bump caused by a "#> title" heading.
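
For illustration (the effective depths shown here are illustrative; exact semantics may differ):

```
# a     depth 1
## b    depth 2
#> c    bumps following sections by 1
## d    effective depth 3
## e    effective depth 3
#< f    reduces following sections by 1
## g    depth 2 again
```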

Bug fix in scn2img function argument definitions:
- fix missing intermediates output
- variation=0 will not modify the seed
- improved caching by using a hash function with explicit contents instead of relying on str(obj) as hash
- move function kwarg preparation out of the render functions into separate functions

The `None` attribute key is excluded from the hash because it only contains the free text (like newlines) of the corresponding scn2img-prompt section.
Seed generators can be initialized with "initial_seed" and will be used for the section and its children.
They can be used recursively, i.e. children can do the same and the seed generator will only apply to them
and their children.
The first seed generator is initialized with the seed argument supplied to the scn2img function.

This makes it possible to save initial seeds in the scn2img-prompt and reuse prompts with (nearly) deterministic
results.
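
For example, a section could pin the seeds of its subtree like this (layer name and value are made up; the "initial_seed" key is from the text above):

```
## castle
initial_seed: 1234
prompt: a castle on a hill
```
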
… mask value

Mask generation now also works directly on txt2img and img2img output (previously only in image sections).
New options in sections (see the sketch after the list):

- mask_invert
- mask_value
- mask_by_color           # reference color as an RGB tuple; sets mask to 0 where the image color is too different
- mask_by_color_space     # colorspace to use for the selection (defaults to LAB)
- mask_by_color_threshold # threshold to use for the color selection (defaults to 15)
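
A rough sketch of the idea behind the mask_by_color* options above (not the PR's actual code; assumes uint8 RGB input, OpenCV, and numpy):

```python
import cv2
import numpy as np

def mask_by_color(img_rgb, ref_rgb, threshold=15.0):
    """Set the mask to 0 where the image color differs too much from
    ref_rgb, comparing in LAB space (the documented defaults above)."""
    lab = cv2.cvtColor(img_rgb, cv2.COLOR_RGB2LAB).astype(np.float32)
    ref = np.uint8([[ref_rgb]])  # 1x1 "image" holding the reference color
    ref_lab = cv2.cvtColor(ref, cv2.COLOR_RGB2LAB).astype(np.float32)[0, 0]
    dist = np.linalg.norm(lab - ref_lab, axis=-1)  # per-pixel color distance
    return np.where(dist <= threshold, 255, 0).astype(np.uint8)
```
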
…morphological operations

new arguments:

- mask_by_color_at
- mask_open
- mask_close
- mask_grow
- mask_shrink
- mask_blur
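
A sketch of what these operations typically amount to with OpenCV (function name, argument names, units, and order here are assumptions, not the PR's code):

```python
import cv2
import numpy as np

def postprocess_mask(mask, open_px=0, close_px=0, grow_px=0, shrink_px=0, blur_px=0):
    def kernel(px):
        return np.ones((2 * px + 1, 2 * px + 1), np.uint8)
    if open_px:    # mask_open: remove small speckles
        mask = cv2.morphologyEx(mask, cv2.MORPH_OPEN, kernel(open_px))
    if close_px:   # mask_close: fill small holes
        mask = cv2.morphologyEx(mask, cv2.MORPH_CLOSE, kernel(close_px))
    if grow_px:    # mask_grow: dilate the selection outward
        mask = cv2.dilate(mask, kernel(grow_px))
    if shrink_px:  # mask_shrink: erode the selection inward
        mask = cv2.erode(mask, kernel(shrink_px))
    if blur_px:    # mask_blur: soften the mask edge
        mask = cv2.GaussianBlur(mask, (2 * blur_px + 1, 2 * blur_px + 1), 0)
    return mask
```
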
TensorFlow is used for depth-from-image estimation, which is used for automatic mask selection in scn2img.
TensorFlow needs einops>=0.3.0; otherwise "AttributeError: module 'keras.backend' has no attribute 'is_tensor'" is thrown when loading the Keras model.
Now uses the same code for parameter copying as txt2img and scn2img.
scn2img needs some functions, variables, etc. from webui.py.
To avoid a circular import, the scn2img function is wrapped
with get_scn2img: it takes the necessary variables as arguments
and returns a usable scn2img function.

Monocular depth estimation model loading is now delayed until it
is actually used by scn2img.
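
The wrapper pattern might look roughly like this (a sketch; everything except the get_scn2img/scn2img names is hypothetical):

```python
def get_scn2img(save_sample, img2img, txt2img, opt):
    # webui.py calls this once at startup and passes its own helpers in,
    # so the scn2img module never imports webui.py (no circular import).
    depth_model = None  # not loaded at import time

    def scn2img(prompt, seed, **kwargs):
        nonlocal depth_model
        if depth_model is None:
            depth_model = load_depth_model()  # hypothetical lazy loader,
                                              # deferred until first use
        # ...parse the scene prompt and render the layers, calling the
        # captured txt2img / img2img / save_sample as needed...
        return []

    return scn2img
```
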
Gives better depth results most of the time, so it is used by default.

Adds a new dependency, "timm" ((Unofficial) PyTorch Image Models), which is used by MiDaS.

add new scn2img arguments:
- mask_depth_normalize: bool // normalizes depth between 0 and 1 (on by default)
- mask_is_depth: bool        // directly use the depth as mask, rescaling depth at mask_depth_min to mask=0 and mask_depth_max to mask=255
- mask_depth_model: int      // 0=monocular depth estimation, 1=MiDaS (default)
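
Loading MiDaS through torch.hub would look roughly like this (consistent with the reviewer's note further down that the weights land in PyTorch's cache; the PR's actual loading code may differ):

```python
import torch

midas = torch.hub.load("intel-isl/MiDaS", "MiDaS")  # pulls weights into torch's cache
midas.eval()
transforms = torch.hub.load("intel-isl/MiDaS", "transforms")

def estimate_depth(img_rgb):
    """img_rgb: HxWx3 uint8 array; returns a relative (inverse) depth map."""
    batch = transforms.default_transform(img_rgb)  # resize + normalize to a 1x3xhxw batch
    with torch.no_grad():
        pred = midas(batch)  # relative inverse depth, model-dependent scale
    # real code would resize pred back to the input resolution here
    return pred.squeeze().cpu().numpy()
```
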
Image depth is estimated, which gives the distance of each image pixel in 3D space.
The transformed image is then rendered from another camera position and orientation.

New scn2img arguments to transform the output of image, txt2img, and img2img sections (a rough sketch of the reprojection follows the list):

Enable it with
- transform3d : bool

Only draw pixels where mask is in these bounds:
- transform3d_min_mask: int
- transform3d_max_mask: int

Select depth estimation model and scale
- transform3d_depth_model: int    // 0 = monocular depth estimation, 1 = MiDaS depth estimation (default)
- transform3d_depth_scale: float  // camera position must reflect this scale

Camera parameters from where the input image was taken
- transform3d_from_hfov: float     // horizontal field of view in degrees, default 45
- transform3d_from_pose: int_tuple // x,y,z,rotatex,rotatey,rotatez in degrees; default 0,0,0,0,0,0

Camera parameters to where we want to transform the image
- transform3d_to_hfov: float
- transform3d_to_pose: tuple

Invert mask of unknown pixels due to 3d transformation
- transform3d_mask_invert: bool // default is False

Enable inpainting with cv2.inpaint; it is enabled by default.
- transform3d_inpaint: bool
- transform3d_inpaint_radius: int
- transform3d_inpaint_method: int // 0=cv2.INPAINT_TELEA, 1=cv2.INPAINT_NS
- transform3d_inpaint_restore_mask: bool // restore mask of unknown pixels due to transformation after inpainting
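
The underlying idea, as a very rough sketch (not the PR's code: it ignores the mask bounds, uses no z-buffer for occlusion, and treats the rotation triple as a Rodrigues vector rather than proper Euler angles):

```python
import cv2
import numpy as np

def transform3d(img, depth, hfov_deg, pose_to, inpaint_radius=3):
    """img: HxWx3 uint8; depth: HxW float distance per pixel;
    pose_to: (x, y, z, rotatex, rotatey, rotatez), rotations in degrees."""
    h, w = depth.shape
    f = (w / 2) / np.tan(np.radians(hfov_deg) / 2)  # focal length in pixels
    cx, cy = w / 2, h / 2

    # Unproject every pixel into 3D camera space using its depth.
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    pts = np.stack([(u - cx) * depth / f,
                    (v - cy) * depth / f,
                    depth], axis=-1).reshape(-1, 3)

    # Move the points into the target camera's frame.
    tx, ty, tz, rx, ry, rz = pose_to
    rvec = np.radians(np.array([rx, ry, rz], np.float64)).reshape(3, 1)
    R, _ = cv2.Rodrigues(rvec)
    pts = pts @ R.T - np.array([tx, ty, tz])

    # Project back to pixels; keep only points in front of the camera and in frame.
    z = np.maximum(pts[:, 2], 1e-6)
    u2 = np.round(f * pts[:, 0] / z + cx).astype(int)
    v2 = np.round(f * pts[:, 1] / z + cy).astype(int)
    ok = (pts[:, 2] > 0) & (u2 >= 0) & (u2 < w) & (v2 >= 0) & (v2 < h)

    # Splat the colors; pixels nothing landed on stay marked as holes.
    out = np.zeros_like(img)
    hole = np.full((h, w), 255, np.uint8)
    out[v2[ok], u2[ok]] = img.reshape(-1, 3)[ok]
    hole[v2[ok], u2[ok]] = 0

    # Fill unknown pixels (what transform3d_inpaint enables).
    return cv2.inpaint(out, hole, inpaint_radius, cv2.INPAINT_TELEA)
```
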
new scn2img argument: `blend`

`blend` is one of alpha, mask, add, add_modulo, darker, difference, lighter, logical_and, logical_or, logical_xor, multiply, soft_light, hard_light, overlay, screen, subtract, subtract_modulo

alpha & mask are both the default alpha compositing

the other ones are blend operations from PIL.ImageChops
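
A sketch of how such a blend argument can be dispatched; the listed names match functions in PIL.ImageChops, though this is not necessarily the PR's code:

```python
from PIL import Image, ImageChops

def blend_images(bottom, top, mode="alpha"):
    """Blend two PIL images by mode name."""
    if mode in ("alpha", "mask"):
        # Both names mean plain alpha compositing.
        return Image.alpha_composite(bottom.convert("RGBA"),
                                     top.convert("RGBA"))
    op = getattr(ImageChops, mode)  # e.g. ImageChops.multiply, ImageChops.screen
    # Note: the logical_* operations additionally require mode "1" images.
    return op(bottom.convert("RGB"), top.convert("RGB"))
```
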
@xaedes (Contributor, Author) commented Sep 26, 2022

@codedealer I added output info and batch view progress.

@hlky I moved the code, pinned the dependency versions, and reverted changes to code not related to scn2img (save_sample). For the remaining changes, i.e. the added default values for img2img and txt2img, I think my #1179 (comment) explains the reasoning.

@Dekker3D
I also added MiDaS based depth estimation and 3D image transformation. Not perfect, but actually usable.

Images can now be blended in other modes (full list) like multiply or screen. I used it mainly for better mask selection.

Screenshot

@xaedes requested review from codedealer and hlky and removed the request for codedealer, Sep 26, 2022 01:31
add new scn2img argument:

- transform3d_depth_near: float // specifies how near the closest pixel is, i.e. the distance corresponding to depth=0

Without this, pixels with very low depth were previously strongly distorted after the transformation.
@codedealer (Collaborator) left a comment


Couple points:

  1. @hlky this currently pulls pickles for MiDaS from GitHub into PyTorch's cache. Should we pack them inside the repo instead?
  2. @xaedes during intermediate renders you call webui's save_sample with skip_metadata hardcoded to False. In your example with the waterfall scene, saving empty masks results in a warning thrown in the console:

Couldn't find metadata on image

It's not critical but can we do something about it?

@hlky (Contributor) commented Oct 2, 2022

Note the approved files. I haven't tested this yet, but I have resolved the conflicts in requirements.txt and environment.yaml; there shouldn't be any issues with the accompanying data files.

@codedealer @xaedes what is the status of this?

@xaedes (Contributor, Author) commented Oct 2, 2022

@hlky Tested it and it does work with the updated environment.yaml. requirements.txt also looks good, but on my machine I can't test if it works with Docker.

@codedealer I will do something about the image metadata.

@hlky merged commit 33b896d into Sygil-Dev:dev, Oct 2, 2022
xaedes added a commit to xaedes/stable-diffusion-webui that referenced this pull request Oct 2, 2022
The following metadata is written:
- "prompt" contains the representation of the SceneObject corresponding to the intermediate image
- "seed" contains the seed at the start of the function that generated this intermediate image
- "width" and "height" contain the size of the image.

To get the seed at the start of the render function without using it, a class SeedGenerator is added and
used instead of the Python generator functions.

Fixes warning thrown in console: "> Couldn't find metadata on image", originally reported by @codedealer in Sygil-Dev#1179 (review)
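
The idea behind SeedGenerator, sketched (method names here are assumptions): a plain Python generator can only hand out a seed by consuming it, while a small class also lets the render function read the upcoming seed for metadata without advancing it.

```python
class SeedGenerator:
    def __init__(self, seed):
        self.seed = seed

    def peek_seed(self):
        # Read the upcoming seed (e.g. for metadata) without consuming it,
        # which a plain Python generator cannot do.
        return self.seed

    def next_seed(self):
        # Hand out the current seed and advance; the real implementation
        # may derive the next seed differently than a simple increment.
        seed, self.seed = self.seed, self.seed + 1
        return seed
```
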
hlky pushed a commit that referenced this pull request Oct 2, 2022
# Description

Intermediate image saving in scn2img tries to save metadata which is not
set. This results in a warning thrown in the console: "Couldn't find metadata
on image", originally reported by @codedealer in
#1179 (review)

Metadata for intermediate images is added to fix the warning.

The following metadata is written:
- "prompt" contains the representation of the SceneObject corresponding
to the intermediate image
- "seed" contains the seed at the start of the function that generated
this intermediate image
- "width" and "height" contain the size of the image.

To get the seed at the start of the render function without using it, a
class SeedGenerator is added and used instead of the Python generator
functions.

Fixes the warning thrown in console: "> Couldn't find metadata on image",
originally reported by @codedealer in #1179 (review)

# Checklist:

- [x] I have changed the base branch to `dev`
- [x] I have performed a self-review of my own code
- [x] I have commented my code in hard-to-understand areas
- [x] I have made corresponding changes to the documentation