Hi,
Thank you for providing this code!
I have successfully run the script to prepare the data for ScanNet. However, when I attempt to run the training, I run into a segfault.
The console output before crash:
keyname=instance_normal_augment_2 task=train started
the number of images val 20
the number of images train 1201
the number of images 1201
Through some print statement abuse, I've managed to see that the code seems to break in
function forward(self, coords, faces, colors, instances), file models/instance.py, at line 199
Python, gcc, torch, cuda versions:
Python - 3.7.2
torch - 1.0.0
cuda - 9.0.176
I am attempting to run the code on a system with Tesla K40c, with 12GB of memory
I'd greatly appreciate help in trying to figure out what is going wrong.
Thanks!
Could you please check the value range of all_coords (all_coords.min(0) and all_coords.max(0))? all_coords should have a shape of Nx4; all_coords.min(0)[:3] should be greater than 0, all_coords.max(0)[:3] should be smaller than 4096, and all_coords.min(0)[3] = all_coords.max(0)[3] = 0.
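The range check described above could be sketched roughly as follows. This is only an illustration, not code from the repository: the helper name check_coords_range is made up, it works on plain nested lists rather than tensors, and the bounds (strictly greater than 0, smaller than 4096, last column equal to 0) are taken literally from the reply.

```python
def check_coords_range(all_coords, upper=4096):
    """Return a list of problems found in an Nx4 coordinate array.

    Hypothetical helper: the first three columns are treated as spatial
    coordinates and the fourth as the batch index, as described in the reply.
    """
    rows = [list(row) for row in all_coords]
    if not rows or any(len(row) != 4 for row in rows):
        return ["expected a non-empty Nx4 array"]

    # Column-wise minimum and maximum, like all_coords.min(0) / all_coords.max(0).
    mins = [min(row[i] for row in rows) for i in range(4)]
    maxs = [max(row[i] for row in rows) for i in range(4)]

    problems = []
    for axis in range(3):
        # Bounds follow the reply's wording: greater than 0, smaller than 4096.
        if mins[axis] <= 0:
            problems.append("column %d: min %s is not greater than 0" % (axis, mins[axis]))
        if maxs[axis] >= upper:
            problems.append("column %d: max %s is not smaller than %d" % (axis, maxs[axis], upper))
    if mins[3] != 0 or maxs[3] != 0:
        problems.append("column 3: expected min = max = 0, got %s and %s" % (mins[3], maxs[3]))
    return problems


if __name__ == "__main__":
    print(check_coords_range([[1, 2, 3, 0], [10, 20, 30, 0]]))
```

If all_coords is a torch tensor, something like check_coords_range(all_coords.tolist()) placed just before the failing call in forward should work; an empty list means the coordinates are within the stated range.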
It turns out the fault was partly on my side: I think the issue was caused by a Python version mismatch. The SparseConvNet GitHub page mentions Python 3.6.8, so I switched to that version. Additionally, I noticed a mismatch between the nvcc version and the CUDA version used by torch on my computer.
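For anyone hitting the same problem, here is a rough sketch of how the versions could be compared. It is an assumption-laden illustration rather than an official check: the function names are made up, it assumes nvcc is on the PATH, and it only compares the major.minor part of torch.version.cuda against the "release X.Y" string printed by nvcc --version.

```python
import re
import shutil
import subprocess
import sys


def parse_nvcc_release(text):
    """Extract 'X.Y' from a line like 'Cuda compilation tools, release 9.0, V9.0.176'."""
    match = re.search(r"release\s+(\d+\.\d+)", text)
    return match.group(1) if match else None


def collect_versions():
    """Gather the Python, torch, torch CUDA, and nvcc versions (None if unavailable)."""
    versions = {"python": "%d.%d.%d" % sys.version_info[:3]}
    try:
        import torch
        versions["torch"] = torch.__version__
        versions["torch_cuda"] = torch.version.cuda  # None for CPU-only builds
    except ImportError:
        versions["torch"] = None
        versions["torch_cuda"] = None

    nvcc = shutil.which("nvcc")
    if nvcc is None:
        versions["nvcc"] = None
    else:
        result = subprocess.run(
            [nvcc, "--version"],
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,
            universal_newlines=True,
        )
        versions["nvcc"] = parse_nvcc_release(result.stdout)
    return versions


def cuda_mismatch(versions):
    """True if torch's CUDA version and nvcc differ in major.minor; False if equal or unknown."""
    torch_cuda, nvcc = versions.get("torch_cuda"), versions.get("nvcc")
    if torch_cuda is None or nvcc is None:
        return False
    return torch_cuda.split(".")[:2] != nvcc.split(".")[:2]


if __name__ == "__main__":
    found = collect_versions()
    print(found)
    print("CUDA mismatch:", cuda_mismatch(found))
```

Running this in the environment used for training prints the detected versions, which can then be compared against the ones listed on the SparseConvNet page.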
After these two changes, the network seems to be training without issues.
I think it would be nice if README.md mentioned the required CUDA and Python versions; without the SparseConvNet page I would have been lost.
Anyway, thanks again for the help and I will close the issue.