
[Feature request] Is it possible to implement LeRes model for a more detailed depth map? #11

Open
bugmelone opened this issue Nov 29, 2022 · 13 comments


@bugmelone

bugmelone commented Nov 29, 2022

Just found this repo: https://github.com/compphoto/BoostingMonocularDepth
I was very impressed with the level of detail of the LeReS model.
You can see it by comparing the MiDaS and LeReS results on a complex scene image:
[Images: MiDaS result vs. LeReS result]

@Extraltodeus
Owner

Wow, I didn't know about that repository. Gotta take a look. Thanks!

@Maxxxwell62

Much more accurate than MiDaS. I hope they look more closely at this one.

@bugmelone
Author

Also, you might want to take a look at this neighboring repo:
https://github.com/compphoto/BoostYourOwnDepth
It looks like we can combine the results from MiDaS and LeReS.
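
The actual merge in that repo is a trained pix2pix network, so the following is only a loose, hypothetical illustration of the underlying idea: keep the coarse structure of one estimate and graft on the fine detail of the other. The function name and parameters here are made up for the sketch.

```python
import cv2
import numpy as np

def naive_depth_merge(base_depth, detail_depth, blur_ksize=31):
    """Toy stand-in for the learned merge: coarse structure from one
    estimate (e.g. MiDaS) plus the high-frequency detail of the other
    (e.g. LeReS). The real BoostYourOwnDepth merge is a pix2pix network."""
    # Normalize both maps to [0, 1] so they are comparable.
    base = cv2.normalize(base_depth.astype(np.float32), None, 0.0, 1.0, cv2.NORM_MINMAX)
    detail = cv2.normalize(detail_depth.astype(np.float32), None, 0.0, 1.0, cv2.NORM_MINMAX)

    # High-frequency residual of the detailed estimate.
    detail_hf = detail - cv2.GaussianBlur(detail, (blur_ksize, blur_ksize), 0)

    # Low-frequency base plus grafted detail, clipped back to [0, 1].
    merged = cv2.GaussianBlur(base, (blur_ksize, blur_ksize), 0) + detail_hf
    return np.clip(merged, 0.0, 1.0)
```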

@Maxxxwell62

Maxxxwell62 commented Nov 29, 2022

Also, you might want to take a look at this neighboring repo: https://github.com/compphoto/BoostYourOwnDepth It looks like we can combine the results from MiDaS and LeReS.

Post the pull requests in these repos as well; maybe it will be faster for them to take a look:

https://github.com/TheLastBen/fast-stable-diffusion/
https://github.com/AUTOMATIC1111/stable-diffusion-webui/
https://github.com/compphoto/BoostingMonocularDepth/

@AugmentedRealityCat

AugmentedRealityCat commented Nov 30, 2022

I second this. The depth-extraction repo I was working with, which was programmed by @donlinglok and now works locally on Windows, actually uses that exact system if I'm not mistaken. It's based on a second round of depth extraction and a combination process after the first pass is completed. The results are much more detailed, and they work much better if you want to use your depth map to create 3D models from your scene.

Have a look at this repo:
https://github.com/donlinglok/3d-photo-inpainting

In particular, you may want to compare this part: donlinglok1/3d-photo-inpainting@87ffa05

It was a pleasure helping him fix the problems that were preventing this from running on Windows, and I was so proud when we actually got it right (even though @Donlinglok did all the work!). It would be amazing if it could be adapted for this extension as well.

EDIT: This was all based on a request to add this as an extension for AUTOMATIC1111, over here; among other things, it documents the debugging process of the Windows port of this famous boosting setup.
AUTOMATIC1111/stable-diffusion-webui#4268

One more EDIT: LeReS is now fully functional with this version of the repo: https://github.com/donlinglok/3d-photo-inpainting/tree/LeRes
You might need to manually download the LeReS model from this link: https://cloudstor.aarnet.edu.au/plus/s/lTIJF4vrvHCAI31/download
It goes here: /BoostingMonocularDepth/res101.pth
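
For convenience, a minimal sketch of fetching that checkpoint into place with Python's requests library, assuming you run it from the 3d-photo-inpainting root:

```python
import os
import requests

URL = "https://cloudstor.aarnet.edu.au/plus/s/lTIJF4vrvHCAI31/download"
DEST = os.path.join("BoostingMonocularDepth", "res101.pth")

# Stream the large checkpoint to disk instead of holding it all in memory.
with requests.get(URL, stream=True, timeout=60) as r:
    r.raise_for_status()
    with open(DEST, "wb") as f:
        for chunk in r.iter_content(chunk_size=1 << 20):
            f.write(chunk)
print("saved", DEST, os.path.getsize(DEST), "bytes")
```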

@donlinglok1

donlinglok1 commented Nov 30, 2022

I second this. The depth-extraction repo I was working with, which was programmed by @donlinglok and now works locally on Windows, actually uses that exact system if I'm not mistaken. It's based on a second round of depth extraction and a combination process after the first pass is completed.
...
Have a look at this repo https://github.com/donlinglok/3d-photo-inpainting
...

@AugmentedRealityCat Wow, the LeReS result looks good!
The vt-vl-lab/3d-photo-inpainting we tried on Windows did generate a depth image...
but vt-vl-lab/3d-photo-inpainting is a 2021 project built with MiDaS. I am not sure which one is better; maybe I will try LeReS later.

@AugmentedRealityCat

but since it is a 2021 project, I am not sure which one is better; maybe I will try LeReS later.

Please try it, and let me know if you need my help with anything.

If I understand correctly, that would mean the "Boosting Monocular Depth" we were using was not even the latest version, since LeReS was only recently added in the repo @bugmelone posted at the top of this thread.

This is very promising! I can't wait to try this.

@AugmentedRealityCat

AugmentedRealityCat commented Nov 30, 2022

It WORKS (EDIT: it really works now)!

I installed in a brand-new folder, but I did reuse my old venv.
The only things I had to fix manually were replacing the empty 0.00 KB model at 3d-photo-inpainting\BoostingMonocularDepth\midas\model.pt (just like last time) and removing the pictures and videos that were already in there.

Is there any way to make sure that the new LeReS algorithm has been applied?

One thing is for sure: I got all the files I was getting with the previous version: a .npy file (some Python numerical data, probably depth information) and a smaller-resolution depth map in PNG format (from 3d-photo-inpainting\depth), a larger full-resolution depth map in PNG format (from 3d-photo-inpainting\BoostingMonocularDepth\outputs), a .ply mesh file (from 3d-photo-inpainting\mesh), and four video files in MP4 format (from 3d-photo-inpainting\KenBurns\Output).
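
As a side note, loading that .npy with NumPy is a quick way to confirm the "probably depth information" guess; the file name below is hypothetical and would match whatever input image was processed:

```python
import numpy as np

# Hypothetical file name; the real one matches the processed input image.
depth = np.load(r"3d-photo-inpainting\depth\my_image.npy")

print(depth.shape, depth.dtype)                  # expect an (H, W) float array
print("range:", depth.min(), "-", depth.max())   # relative, not metric, depth
```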

It also produced some kind of log in a file called test_opt.txt in 3d-photo-inpainting\BoostingMonocularDepth\pix2pix\checkpoints\void. Here is what it contains, in case it is useful:

  ----------------- Options ---------------
                    Final: True                          	[default: False]
                       R0: False                         
                      R20: False                         
             aspect_ratio: 1.0                           
               batch_size: 1                             
          checkpoints_dir: ./pix2pix/checkpoints         
         colorize_results: False                         
                crop_size: 672                           
                 data_dir: inputs/                       	[default: None]
                 dataroot: None                          
             dataset_mode: depthmerge                    
                 depthNet: 0                             	[default: None]
                direction: AtoB                          
          display_winsize: 256                           
                    epoch: latest                        
                     eval: False                         
            generatevideo: None                          
                  gpu_ids: 0                             
                init_gain: 0.02                          
                init_type: normal                        
                 input_nc: 2                             
                  isTrain: False                         	[default: None]
                load_iter: 0                             	[default: 0]
                load_size: 672                           
         max_dataset_size: 10000                         
                  max_res: inf                           
                    model: pix2pix4depth                 
               n_layers_D: 3                             
                     name: void                          
                      ndf: 64                            
                     netD: basic                         
                     netG: unet_1024                     
 net_receptive_field_size: None                          
                      ngf: 64                            
               no_dropout: False                         
                  no_flip: False                         
                     norm: none                          
                 num_test: 50                            
              num_threads: 4                             
               output_dir: outputs                       	[default: None]
                output_nc: 1                             
        output_resolution: None                          
                    phase: test                          
              pix2pixsize: None                          
               preprocess: resize_and_crop               
                savecrops: None                          
             savewholeest: None                          
           serial_batches: False                         
                   suffix:                               
                  verbose: False                         
----------------- End -------------------
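
On the question above about verifying LeReS: the depthNet line in this log looks like the telltale. If I read BoostingMonocularDepth's run.py correctly (an assumption on my part: 0 = MiDaS, 1 = SGRNet, 2 = LeReS), then depthNet: 0 here would mean this particular run still used the MiDaS path, and a LeReS run should log depthNet: 2. A quick hedged check:

```python
# Assumption: run.py's depthNet codes are 0 = MiDaS, 1 = SGRNet, 2 = LeReS.
OPTS = r"3d-photo-inpainting\BoostingMonocularDepth\pix2pix\checkpoints\void\test_opt.txt"

nets = {0: "MiDaS", 1: "SGRNet", 2: "LeReS"}
with open(OPTS) as f:
    for line in f:
        if line.strip().startswith("depthNet:"):
            code = int(line.split("depthNet:")[1].split()[0])
            print("base depth network:", nets.get(code, f"unknown ({code})"))
```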

Thanks again for making this happen.

@AugmentedRealityCat

AugmentedRealityCat commented Nov 30, 2022

And here is the depth map I get from the very latest version. Everything works now!

[Image: generated depth map]

@Extraltodeus
Owner

Impressive, @AugmentedRealityCat! Could you make me a pull request with your modifications?

@AugmentedRealityCat

Could you make me a pull request with your modifications?

I wish I could, but I am not a programmer, so I have no idea how to do that. The programmer is @donlinglok.

@donlinglok1

donlinglok1 commented Dec 2, 2022

@Extraltodeus
The video that @AugmentedRealityCat generated is from 3d-photo-inpainting; that project already implemented https://github.com/compphoto/BoostingMonocularDepth. I just made a small change to switch the method from MiDaS to LeReS (which BoostingMonocularDepth supports).

I think it is easy to implement BoostingMonocularDepth: check the file boostmonodepth_utils.py, lines 23-49. It just collects the images, passes them to BoostingMonocularDepth via the command line, and inverts the grayscale of the output; a sketch of this is below.
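
Here is a sketch of what those lines do, paraphrased rather than quoted (the function name is mine, not the repo's). The CLI flags match the options log earlier in the thread, and swapping --depthNet 0 for --depthNet 2 is, as far as I can tell, the change that selects LeReS:

```python
import os
import cv2
import numpy as np

BOOST_BASE = "BoostingMonocularDepth"  # checkout inside 3d-photo-inpainting

def run_boosted_depth(img_path, use_leres=True):
    """Paraphrase of boostmonodepth_utils.py: stage the image, run
    BoostingMonocularDepth through its CLI, read back and invert the result."""
    name = os.path.splitext(os.path.basename(img_path))[0]
    cv2.imwrite(os.path.join(BOOST_BASE, "inputs", name + ".png"), cv2.imread(img_path))

    # depthNet picks the base network: 0 = MiDaS, 2 = LeReS (needs res101.pth).
    depth_net = 2 if use_leres else 0
    os.system(
        f"cd {BOOST_BASE} && python run.py --Final "
        f"--data_dir inputs/ --output_dir outputs --depthNet {depth_net}"
    )

    # Invert the grayscale output, as the 3d-photo-inpainting wrapper does.
    depth = cv2.imread(
        os.path.join(BOOST_BASE, "outputs", name + ".png"), cv2.IMREAD_UNCHANGED
    ).astype(np.float32)
    return depth.max() - depth
```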

More references here:
AUTOMATIC1111/stable-diffusion-webui#4268

@bugmelone
Author

bugmelone commented Dec 3, 2022

Looks like @thygate has added LeReS to his main extension repo as experimental support.
thygate/stable-diffusion-webui-depthmap-script#34 (comment)
thygate/stable-diffusion-webui-depthmap-script@ee06f97
