Scene-to-Image Prompt Layering System (#1179)
# Summary of the change

- new Scene-to-Image tab
- new scn2img function
- functions for loading and running monocular_depth_estimation with
tensorflow

# Description


Related to discussion #925

> Would it be possible to have a layers system where we could have
foreground, mid, and background objects which relate to one another and
share the style? So we could say generate a landscape, on another layer
generate a castle, and on another layer generate a crowd of people.

To make this work, I made a prompt-based layering system in a new
"Scene-to-Image" tab.
You write a multi-line prompt that looks like markdown, where each
section declares one layer.
It is hierarchical, so each layer can have its own child layers.
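
As an illustration of the markdown-like syntax, here is a minimal sketch of how such a prompt could be turned into a layer tree. It is not the parser from this PR: `Layer` and `parse_scene` are hypothetical names, and the rules it assumes (heading depth taken from the number of `#`, `key: value` lines as arguments, `//` lines as comments) are inferred from the example files in this commit.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Layer:
    depth: int                      # number of leading '#', 0 for the root section
    title: str = ""
    args: dict = field(default_factory=dict)
    children: list = field(default_factory=list)


def parse_scene(text: str) -> Layer:
    """Parse a markdown-like scene prompt into a tree of layers (hypothetical sketch)."""
    root = Layer(depth=0)
    stack = [root]                  # path from the root to the section being filled
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("//"):
            continue                # skip blank lines and comments
        heading = re.match(r"^(#+)\s*(.*)$", line)
        if heading:
            depth = len(heading.group(1))
            # climb back up until the top of the stack is a valid parent
            while stack[-1].depth >= depth:
                stack.pop()
            layer = Layer(depth=depth, title=heading.group(2))
            stack[-1].children.append(layer)
            stack.append(layer)
        elif ":" in line:
            # split on the first colon only, so values may contain ':' or '#'
            key, value = line.split(":", 1)
            stack[-1].args[key.strip()] = value.strip()
    return root


EXAMPLE = """\
prompt: cute corgi at beach, trending on artstation
ddim_steps: 50

# put foreground onto background
size: 512, 512

## create foreground
size: 512, 512
mask_depth: True

###
prompt: corgi

## create background
prompt: beach landscape
"""

scene = parse_scene(EXAMPLE)
for child in scene.children[0].children:
    print(child.title, sorted(child.args))
```

Arguments that follow a heading are attached to that heading's layer, and arguments before the first heading go to the root layer, which matches how the examples place the final blending prompt at the top of the file.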

Examples: https://imgur.com/a/eUxd5qn
![](https://i.imgur.com/L61w00Q.png)

In the frontend you can find brief documentation of the syntax,
examples, and a reference for the various arguments.

Here is a short summary:

Sections with "prompt" and child layers are img2img; without child
layers they are txt2img.
Sections without "prompt" are just images, useful for mask selection, image
composition, etc.
Images can be initialized with "color", resized with "resize", and
positioned with "pos".
Rotation and rotation center are set with "rotation" and "center".
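
The rule for how a section is rendered can be written as a small decision function. This is only a sketch of the rule as stated in the summary; `layer_kind` is a hypothetical helper, not a function from this PR.

```python
def layer_kind(has_prompt: bool, num_children: int) -> str:
    """Return how a section would be rendered, following the summary's rules."""
    if not has_prompt:
        return "image"        # plain image: composition, mask selection, etc.
    if num_children > 0:
        return "img2img"      # the composed child layers serve as the init image
    return "txt2img"          # nothing to start from, so generate from text only


print(layer_kind(has_prompt=True, num_children=2))   # img2img
print(layer_kind(has_prompt=True, num_children=0))   # txt2img
print(layer_kind(has_prompt=False, num_children=2))  # image
```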

Masks can be selected automatically by color or by estimated depth; the
depth estimation is based on https://huggingface.co/spaces/atsantiago/Monocular_Depth_Filter.

![](https://i.imgur.com/8rMHWmZ.png)
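
To make the two selection modes concrete, here is a rough sketch in plain Python. It is not the code from this PR and makes several assumptions: colors are compared by Euclidean distance in RGB (the examples also set `mask_by_color_space`, which is ignored here), probe positions are `(x, y)` pairs, and the depth map is already available as a 2D list of values rather than produced by the depth model. The function names are hypothetical.

```python
from math import dist


def mask_by_color(image, probes, threshold):
    """Select pixels whose color is close to the color at any probe position.

    image: list of rows, each row a list of (r, g, b) tuples
    probes: list of (x, y) pixel positions to sample reference colors from
    """
    probe_colors = [image[y][x] for x, y in probes]
    return [
        [any(dist(pixel, color) <= threshold for color in probe_colors)
         for pixel in row]
        for row in image
    ]


def mask_by_depth(depth, depth_min, depth_max, invert=False):
    """Select pixels whose depth lies in [depth_min, depth_max], optionally inverted."""
    return [
        [(depth_min <= d <= depth_max) != invert for d in row]
        for row in depth
    ]


image = [
    [(0, 0, 0), (10, 10, 10)],
    [(255, 255, 255), (0, 0, 0)],
]
color_mask = mask_by_color(image, probes=[(0, 0)], threshold=20)

depth = [[0.1, 0.9]]
near_mask = mask_by_depth(depth, depth_min=-0.05, depth_max=0.4)
far_mask = mask_by_depth(depth, depth_min=-0.05, depth_max=0.4, invert=True)
```

A real implementation would typically work on arrays and post-process the mask (the examples use `mask_open`, `mask_grow`, and `mask_invert` for that); the sketch only shows the selection step.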

# Additional dependencies that are required for this change

For mask selection by monocular depth estimation, tensorflow is required,
and the model must be cloned to ./src/monocular_depth_estimation/.
Changes in environment.yaml:
- einops>=0.3.0
- tensorflow>=2.10.0 

Einops must be allowed to be newer for tensorflow to work.
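
A quick way to check both requirements before enabling depth-based masks might look like the sketch below. `depth_estimation_available` is a hypothetical helper, not part of this PR; it only checks that `tensorflow` is importable and that the model directory exists, not that the model actually loads.

```python
import importlib.util
from pathlib import Path


def depth_estimation_available(model_dir="./src/monocular_depth_estimation/"):
    """Return True if tensorflow can be imported and the model directory exists."""
    has_tensorflow = importlib.util.find_spec("tensorflow") is not None
    has_model = Path(model_dir).is_dir()
    return has_tensorflow and has_model


if not depth_estimation_available():
    print("mask_depth needs tensorflow and the cloned depth model")
```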

# Checklist:

- [x] I have changed the base branch to `dev`
- [x] I have performed a self-review of my own code
- [x] I have commented my code in hard-to-understand areas
- [x] I have made corresponding changes to the documentation

Co-authored-by: hlky <106811348+hlky@users.noreply.github.com>
xaedes and hlky authored Oct 2, 2022
1 parent 5853f3e commit 33b896d
Showing 10 changed files with 2,100 additions and 12 deletions.
40 changes: 40 additions & 0 deletions data/scn2img_examples/cat_at_beach.scn2img.md
@@ -0,0 +1,40 @@
// blend it together and finish it with details
prompt: cute happy orange cat sitting at beach, beach in background, trending on artstation:1 cute happy cat:1
sampler_name:k_euler_a
ddim_steps: 35
denoising_strength: 0.55
variation: 3
initial_seed: 1

# put foreground onto background
size: 512, 512
color: 0,0,0

## create foreground
size:512,512
color:0,0,0,0
resize: 300, 300
pos: 256, 350

// select mask by probing some pixels from the image
mask_by_color_at: 15, 15, 15, 256, 85, 465, 100, 480
mask_by_color_threshold:80
mask_by_color_space: HLS

// some pixels inside the cat may be selected, remove them with mask_open
mask_open: 15

// there is still some background pixels left at the edge between cat and background
// grow the mask to get them as well
mask_grow: 15

// we want to remove whatever is masked:
mask_invert: True

####
prompt: cute happy orange cat, white background
ddim_steps: 25
variation: 1

## create background
prompt:beach landscape, beach with ocean in background, photographic, beautiful:1 red:-0.4
50 changes: 50 additions & 0 deletions data/scn2img_examples/corgi_3d_transformation.scn2img.md
@@ -0,0 +1,50 @@
initial_seed: 2

// select background and img2img over it
mask_by_color_at: 64, 64
mask_invert: True

prompt: corgi
ddim_steps: 50
seed: 242886303

mask_mode: 0
denoising_strength: 0.8
//cfg_scale: 15
mask_restore: True
image_editor_mode:Mask

# estimate depth and transform the corgi in 3d
transform3d: True
transform3d_depth_near: 0.5
transform3d_depth_scale: 10
transform3d_from_hfov: 45
transform3d_to_hfov: 45
transform3d_from_pose: 0,0,0, 0,0,0
transform3d_to_pose: 0.5,0,0, 0,-5,0
transform3d_min_mask: 0
transform3d_max_mask: 255
transform3d_inpaint_radius: 1
transform3d_inpaint_method: 0

## put foreground onto background
size: 512, 512


### create foreground
size: 512, 512

mask_depth: True
mask_depth_model: 1
mask_depth_min: -0.05
mask_depth_max: 0.5
mask_depth_invert:False

####
prompt: corgi
ddim_steps: 25
seed: 242886303

### background
size: 512,512
color: #9F978D
25 changes: 25 additions & 0 deletions data/scn2img_examples/corgi_at_beach.scn2img.md
@@ -0,0 +1,25 @@
// blend it together and finish it with some details
prompt: cute corgi at beach, trending on artstation
ddim_steps: 50
denoising_strength: 0.5
initial_seed: 2

# put foreground onto background
size: 512, 512

## create foreground
size: 512, 512

// estimate depth from image and select mask by depth
// https://huggingface.co/spaces/atsantiago/Monocular_Depth_Filter
mask_depth: True
mask_depth_min: -0.05
mask_depth_max: 0.4
mask_depth_invert:False

###
prompt: corgi
ddim_steps: 25

## create background
prompt:beach landscape, beach with ocean in background, photographic, beautiful:1 red:-0.4
34 changes: 34 additions & 0 deletions data/scn2img_examples/corgi_at_beach_2.scn2img.md
@@ -0,0 +1,34 @@
// give it some polish and details
size: 512, 512
prompt: cute corgi at beach, intricate details, photorealistic, trending on artstation
variation: 0
seed: 1360051694
initial_seed: 5

# blend it together
prompt: beautiful corgi:1.5 cute corgi at beach, trending on artstation:1 photorealistic:1.5
ddim_steps: 50
denoising_strength: 0.5
variation: 0

## put foreground in front of background
size: 512, 512

### select foreground
size: 512, 512

// estimate depth from image and select mask by depth
// https://huggingface.co/spaces/atsantiago/Monocular_Depth_Filter
mask_depth: True
mask_depth_min: -0.05
mask_depth_max: 0.4
mask_depth_invert:False

#### create foreground
prompt: corgi
ddim_steps: 25
seed: 242886303

### create background
prompt:beach landscape, beach with ocean in background, photographic, beautiful:1 red:-0.4
variation: 3
37 changes: 37 additions & 0 deletions data/scn2img_examples/landscape_3d.scn2img.md
@@ -0,0 +1,37 @@
size: 512,512
mask_blur: 6

prompt: fantasy landscape with castle and forest and waterfalls, trending on artstation
denoising_strength: 0.6
seed: 1
image_editor_mode: Mask
mask_mode: 0
mask_restore: True

# mask the left which contains artifacts
color: 255,255,255,0
blend:multiply
size: 100,512
pos: 50,256

# mask the top-left which contains lots of artifacts
color: 255,255,255,0
blend:multiply
size: 280,128
pos: 128,64

# go forward and turn head left to look at the left waterfalls
transform3d: True
transform3d_depth_scale: 10000
transform3d_from_hfov: 60
transform3d_to_hfov: 60
transform3d_from_pose: 0,0,0, 0,0,0
transform3d_to_pose: 4000,0,2000, 0,-50,0
transform3d_min_mask: 0
transform3d_max_mask: 255
transform3d_inpaint_radius: 5
transform3d_inpaint_method: 1

##
prompt: fantasy landscape with castle and forest and waterfalls, trending on artstation
seed: 1
7 changes: 4 additions & 3 deletions environment.yaml
@@ -40,7 +40,7 @@ dependencies:
  - albumentations==0.4.3
  - basicsr>=1.3.4.0
  - diffusers==0.3.0
- - einops==0.3.0
+ - einops==0.3.1
  - facexlib>=0.2.3
  - ftfy==6.1.1
  - fairscale==0.4.4
@@ -72,9 +72,10 @@ dependencies:
  - streamlit-tensorboard==0.0.2
  - test-tube>=0.7.5
  - tensorboard==2.10.1
- - timm==0.4.12
+ - timm==0.6.7
  - torch-fidelity==0.3.0
  - torchmetrics==0.6.0
  - transformers==4.19.2
+ - tensorflow==2.10.0
  - tqdm==4.64.0
