First of all, thank you for sharing your great work with the community :)
According to your paper, the camera location embeddings (tau_{k}) are subtracted from the map-view positional encodings (c^{n}) to form the map-view queries (c^{n} - tau_{k}).
However, in your code the camera location embeddings (tau_{k}) are also subtracted from the camera positional embeddings (delta_{k,i}), which differs from Equation 3. Please see the last two lines of the following code.
# -------------------------
# translation embedding, tau_{k}
# -------------------------
c = E_inv[..., -1:] # b n 4 1
c_flat = rearrange(c, 'b n ... -> (b n) ...')[..., None] # (b n) 4 1 1
c_embed = self.cam_embed(c_flat) # (b n) d 1 1
# -------------------------
# R_{k}^{-1} X K_{k}^{-1} X x_{i}^{(I)}
# -------------------------
pixel_flat = rearrange(pixel, '... h w -> ... (h w)') # 1 1 3 (h w)
cam = I_inv @ pixel_flat # b n 3 (h w)
cam = F.pad(cam, (0, 0, 0, 1, 0, 0, 0, 0), value=1) # b n 4 (h w)
d = E_inv @ cam # b n 4 (h w)
d_flat = rearrange(d, 'b n d (h w) -> (b n) d h w', h=h, w=w) # (b n) 4 h w
d_embed = self.img_embed(d_flat) # (b n) d h w
# -------------------------
# Normalization for attention
# -------------------------
# TODO : why subtract c_embed?
img_embed = d_embed - c_embed # (b n) d h w
img_embed = img_embed / (img_embed.norm(dim=1, keepdim=True) + 1e-7) # (b n) d h w
Am I missing something?
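To make the question concrete, here is a minimal NumPy sketch of the two variants as I understand them. It is not taken from the repository: the `cosine` helper and the 2-D toy vectors are hypothetical stand-ins for delta_{k,i}, c^{n}, and tau_{k}, and I am assuming that the attention score is a cosine similarity between the camera-side and map-side embeddings.

```python
import numpy as np

def cosine(a, b, eps=1e-7):
    """Cosine similarity along the last axis (eps mirrors the 1e-7 in the snippet above)."""
    a = a / (np.linalg.norm(a, axis=-1, keepdims=True) + eps)
    b = b / (np.linalg.norm(b, axis=-1, keepdims=True) + eps)
    return (a * b).sum(axis=-1)

# Toy stand-ins (hypothetical values, chosen only so the two variants differ visibly).
delta = np.array([1.0, 0.0])  # camera positional embedding, delta_{k,i}
c_n = np.array([0.0, 1.0])    # map-view positional encoding, c^{n}
tau = np.array([1.0, 1.0])    # camera location embedding, tau_{k}

# Variant A: my reading of Equation 3, tau_{k} is subtracted on the map-view side only.
sim_paper = cosine(delta, c_n - tau)

# Variant B: what the snippet appears to do, tau_{k} is also subtracted from delta_{k,i}.
sim_code = cosine(delta - tau, c_n - tau)

print(sim_paper, sim_code)
```

With these toy values the two variants give clearly different scores, so the extra subtraction does not look like a no-op. The two only coincide in special cases such as tau_{k} = 0.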