How to use custom camera model and camera parameters using onnx inference #103
@Owen-Liuyuxuan Does ONNX support a custom camera?
@alpereninci cc: @YvanYin In ros2_vision_inference, I made a demonstration of how to feed the camera matrix into the ONNX model. I trimmed down the projection and coordinate transform code from it.

## Change the model export script
```python
class Metric3DExportModel(torch.nn.Module):
    def __init__(self, meta_arch, is_export_rgb=True):
        super().__init__()
        self.meta_arch = meta_arch
        self.register_buffer('rgb_mean', torch.tensor([123.675, 116.28, 103.53]).view(1, 3, 1, 1).cuda())
        self.register_buffer('rgb_std', torch.tensor([58.395, 57.12, 57.375]).view(1, 3, 1, 1).cuda())
        self.input_size = (616, 1064)

    def normalize_image(self, image):
        image = (image - self.rgb_mean) / self.rgb_std
        return image

    def forward(self, image, P):
        image = self.normalize_image(image)
        with torch.no_grad():
            pred_depth, confidence, output_dict = self.meta_arch.inference({'input': image})
        # 1000.0 is the focal length of the canonical camera.
        # Note the parenthesization: divide by 2 AND by 1000.0;
        # writing "/ 2*1000.0" would multiply by 500 instead.
        canonical_to_real_scale = (P[:, 0, 0, None, None] + P[:, 1, 1, None, None]) / 2 / 1000.0
        pred_depth = pred_depth * canonical_to_real_scale  # now the depth is metric
        return pred_depth
```
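The scale computation above can be checked in plain NumPy (a sketch, not from the repo; the fx/fy values below are hypothetical, and 1000.0 is the canonical focal length mentioned in the script):

```python
import numpy as np

P = np.zeros((1, 3, 4), dtype=np.float32)
P[0, 0, 0] = 720.0  # fx (hypothetical)
P[0, 1, 1] = 730.0  # fy (hypothetical)

# Average the two focal lengths, then divide by the canonical focal length.
scale = (P[:, 0, 0] + P[:, 1, 1]) / 2.0 / 1000.0
print(scale)  # [0.725]
```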
## In testing
```python
dummy_P = np.zeros([1, 3, 4], dtype=np.float32)
outputs = ort_session.run(None, {"image": dummy_image, "P": dummy_P})
```

Any reshape operations before reaching the (616, 1064) input should be accompanied by corresponding changes in the camera matrix. But if you are considering changing the input size to the network, I have not succeeded in doing so myself; I am afraid there could be errors inside the ViT network. @YvanYin Any ideas on changing the input size to the network?
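For a real camera, the zero dummy above would be replaced by a filled-in projection matrix. A minimal sketch (the intrinsics here are hypothetical placeholders, not from the thread):

```python
import numpy as np

# Hypothetical intrinsics from your own calibration; replace with real values.
fx, fy, cx, cy = 721.5, 721.5, 609.6, 172.9

P = np.zeros((1, 3, 4), dtype=np.float32)
P[0, 0, 0] = fx
P[0, 1, 1] = fy
P[0, 0, 2] = cx
P[0, 1, 2] = cy
P[0, 2, 2] = 1.0
# Then pass it to the session, e.g.:
# outputs = ort_session.run(None, {"image": image, "P": P})
```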
Thanks for your reply @Owen-Liuyuxuan. Actually, I am considering changing the input size to the network.
By slightly changing the input shape to the network, the ONNX model works (at least with no weird errors). However, I am not sure about the generalization ability and the metric accuracy. I will give it a test.
I have tried it on personal data. It works, but the canonical camera focal length is not necessarily 500. I believe you could try to fine-tune the parameters for your usage.
BTW, for TensorRT usage, we need to clear the cache every time before constant parameter changes take effect, so I suggest doing the tuning on GPU first.
I have a custom camera with intrinsic calibration, and I have a problem with ONNX inference. Should I pass the "cam_model" parameter to the model, or is post-processing enough?
I think the post-processing in test_onnx.py is not complete.
I would like to run inference with the "metric3d_vit_small" model.
I looked at "do_test.py" and at the vit.raft5.small.py config file.
During ONNX inference, should I use canonical_space['focal_length'] = 1000 (from the config) and normalize scale = 1 (also from the config)?
How should cx and cy be used? Are they important parameters?
Also, what should I do if I change the input resolution? The given input resolution is (H, W) = (616, 1064). What would happen, and what should I do, if I downsample the image to (308, 532)?
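Following the earlier advice that reshapes must be accompanied by changes in the camera matrix, downsampling by a factor of 2 would scale fx, fy, cx, and cy by the same factor. A sketch of the arithmetic (the original intrinsics below are hypothetical):

```python
import numpy as np

# Hypothetical intrinsics for the (616, 1064) input; replace with real calibration.
fx, fy, cx, cy = 1000.0, 1000.0, 532.0, 308.0

s = 0.5  # downsample (616, 1064) -> (308, 532)
fx2, fy2, cx2, cy2 = fx * s, fy * s, cx * s, cy * s

# The canonical-to-real depth scale shrinks by the same factor.
scale = (fx2 + fy2) / 2.0 / 1000.0
print(scale)  # 0.5
```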