Questioning Inference Speed #4
Comments
Okay, I seem to have found something that might work to eagerly perform the model inference: if you replace …
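A minimal sketch of one way to force the timing to be eager, assuming `gpu_time` is measured with a wall-clock timer around the forward call: synchronize the GPU before reading the clock. The model and input below are illustrative stand-ins, not the repository's actual code.

```python
import time
import torch

# Illustrative stand-ins (assumptions): a small conv layer and a KITTI-sized input,
# not the repository's actual ENet/PENet model or dataloader batch.
model = torch.nn.Conv2d(3, 3, 3, padding=1).cuda().eval()
batch_data = torch.randn(1, 3, 352, 1216, device="cuda")

with torch.no_grad():
    start = time.time()
    pred = model(batch_data)        # CUDA kernels are only *launched* here (asynchronously)
    torch.cuda.synchronize()        # block until the launched kernels actually finish
    gpu_time = time.time() - start  # elapsed time now includes the real inference work

print(f"gpu_time: {gpu_time * 1000:.1f} ms")
```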
This is a problem I wasn't aware of before, as I followed the implementation of https://github.com/fangchangma/self-supervised-depth-completion. I think your intuition is right and I will look into it.
Okay, thank you. Let us know how it goes. 😀 In any case, the model performance is still state-of-the-art. My current research direction is actually in running fast depth completion on the edge, which is why I took an interest in your paper. My next experiments will be in trying to minify your network and reduce its parameters so it runs faster 🙂
The proper inference time is now reported on this project's page. Thanks for pointing out this problem!
Got it, thank you for updating!
Good day,
First of all, congratulations on your work and paper. The idea of separating depth-dominant and color-dominant branches is interesting. Also, thank you for releasing the source code to the public. I have been replicating your code over the past few days, and so far inference has been straightforward (I am getting RMSE scores of around 760).
However, correct me if I'm wrong, but I think there might be a mistake in the inference time computation. In `main.py`, lines 213/216 are where the predictions are generated from the ENet/PENet models, after which `gpu_time` is computed. I tried adding a `print(pred)` call (see the image below) and got very different inference times with and without the `print(pred)` call. I ran this on a machine with an RTX 2080Ti, an i7-9700k, CUDA 11.2, torch==1.3.1, and torchvision==0.4.2. Below are my runtimes:

- original code - a bit faster than your official runtime, presumably due to my newer CUDA version(?)
- modified code - much slower when `print(pred)` was added

My understanding is that calling `pred = model(batch_data)` does not yet run the model prediction; the model inference only actually runs when you call `result.evaluate()` in line 268 (i.e. lazy execution). This results in a nearly x10 increase in inference time (i.e. 151ms vs 17ms). Can you confirm that this also happens in your environment?
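CUDA kernel launches are asynchronous, so timing the forward call alone mostly measures launch overhead; the wait happens when the output is first read (e.g. by `print(pred)` or `result.evaluate()`). A rough sketch that reproduces this with a toy model (the model, shapes, and measurement setup below are illustrative assumptions, not the repository's code):

```python
import time
import torch

# Toy stand-in (assumption): a small conv stack, not the actual ENet/PENet model.
model = torch.nn.Sequential(
    *[torch.nn.Conv2d(64, 64, 3, padding=1) for _ in range(20)]
).cuda().eval()
x = torch.randn(8, 64, 256, 256, device="cuda")

with torch.no_grad():
    # Warm-up so CUDA context setup does not pollute the numbers.
    for _ in range(3):
        model(x)
    torch.cuda.synchronize()

    # 1) Timing only the forward call: mostly measures kernel launch overhead.
    t0 = time.time()
    pred = model(x)
    launch_only = time.time() - t0

    torch.cuda.synchronize()  # drain pending work before the second measurement

    # 2) Forcing the output (as print(pred) or result.evaluate() would) waits for the GPU.
    t0 = time.time()
    pred = model(x)
    _ = pred.cpu()            # the device-to-host copy implicitly synchronizes
    forced = time.time() - t0

print(f"launch only: {launch_only * 1e3:.1f} ms   forced: {forced * 1e3:.1f} ms")
```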