
On the bilinear in your implementation. #1

Closed
tianzhi0549 opened this issue Mar 26, 2020 · 29 comments

Comments

@tianzhi0549

masks = interpolate(masks_per_image, size = (o_h,o_w), mode="bilinear", align_corners=False)

The default bilinear interpolation in PyTorch is not aligned, which might significantly degrade the performance, in particular for small objects.

Please try the aligned bilinear.

import torch.nn.functional as F

def aligned_bilinear(tensor, factor):
    # upsample an N x C x H x W tensor by an integer factor with aligned bilinear interpolation
    assert tensor.dim() == 4
    assert factor >= 1
    assert int(factor) == factor

    if factor == 1:
        return tensor

    h, w = tensor.size()[2:]
    tensor = F.pad(tensor, pad=(0, 1, 0, 1), mode="replicate")
    oh = factor * h + 1
    ow = factor * w + 1
    tensor = F.interpolate(
        tensor, size=(oh, ow),
        mode='bilinear',
        align_corners=True
    )
    tensor = F.pad(
        tensor, pad=(factor // 2, 0, factor // 2, 0),
        mode="replicate"
    )

    return tensor[:, :, :oh - 1, :ow - 1]
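
For reference, a minimal usage sketch of how the interpolate call quoted at the top could be swapped for the aligned version. It assumes masks_per_image is an N x C x H x W tensor and that (o_h, o_w) is an exact integer multiple of its spatial size; the variable names follow the snippet quoted above, not this repo's code.

# unaligned call quoted above:
# masks = interpolate(masks_per_image, size=(o_h, o_w), mode="bilinear", align_corners=False)

# aligned replacement, assuming o_h == factor * h and o_w == factor * w
h, w = masks_per_image.size()[2:]
factor = o_h // h
masks = aligned_bilinear(masks_per_image, factor)
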
@Epiphqny
Owner

@tianzhi0549 Thanks for pointing it out, I will try it and update the new result later.

@tianzhi0549
Author

@Epiphqny I also note that it seems you are using absolute coordinates as the input to the mask heads, which is not correct. It is important to use relative coordinates here because we hope the generated filters are position-independent.

@Epiphqny
Owner

@tianzhi0549 The coordinates in this implementation range from -1 to 1. What do you mean by relative coordinates? Should they be 0-1 instead?

@tianzhi0549
Author

@Epiphqny aim-uofa/AdelaiDet#10. You can refer to the explanation here.

@Epiphqny
Owner

@tianzhi0549 OK, I will try that.

@Epiphqny
Owner

@tianzhi0549 It sounds like the relative coordinates are in some way like the center-ness... but implemented in a different way. Just my opinion.

@tianzhi0549
Author

@Epiphqny They may be similar in some aspects, but they are designed for totally different purposes ...

@Epiphqny
Owner

@tianzhi0549 Yes, both are interesting ideas!

@Epiphqny
Owner

Epiphqny commented Apr 3, 2020

@tianzhi0549 Hi, I replaced the original upsampling with the aligned version and used the upsampled mask to calculate the loss; the AP is now 37.1. But this is still the absolute-coordinate version. I will post new results after the training of the relative-coordinate version has finished.

@tianzhi0549
Author

@Epiphqny Great! For the memory usage issue, you could limit the maximum number of samples used to compute masks during training. Using relative coordinates might also significantly boost the performance.
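
Not code from this repo, just a rough sketch of what capping the number of mask samples could look like; max_masks_per_batch, pos_inds, and the random subsampling are all hypothetical choices.

import torch

# hypothetical: keep at most `max_masks_per_batch` positive instances when
# computing the mask loss, to bound GPU memory during training
max_masks_per_batch = 64  # assumption, tune to the available memory
if pos_inds.numel() > max_masks_per_batch:
    keep = torch.randperm(pos_inds.numel(), device=pos_inds.device)[:max_masks_per_batch]
    pos_inds = pos_inds[keep]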

@Epiphqny
Owner

Epiphqny commented Apr 7, 2020

@tianzhi0549 Perhaps there is some problem in my implementation of relative coordinates; it only achieves 36.9 mAP, which is worse than the absolute-coordinate version.

@tianzhi0549
Author

@Epiphqny If possible, you could push your code to a new branch of the repo; I can help check it.

@Epiphqny
Owner

Epiphqny commented Apr 7, 2020

Hi @tianzhi0549, I have added the code to the relative_coordinate branch. Thank you very much for the help!

@tianzhi0549
Author

@Epiphqny Are you sure this line is correct?

x_range = torch.linspace(-1, 1, w, device=self.masks.device)

@Yuxin-CV

Yuxin-CV commented Apr 8, 2020

@Epiphqny Hi~ Thanks for sharing your code!
It seems that the setting of IMS_PER_BATCH and BASE_LR in your config is incorrect.

IMS_PER_BATCH: 4
BASE_LR: 0.01 # Note that RetinaNet uses a different default learning rate

IMS_PER_BATCH and BASE_LR should be changed together according to the Linear Scaling Rule: you need to set the learning rate proportional to the batch size if you use a different number of GPUs or images per GPU, e.g., IMS_PER_BATCH = 4 & BASE_LR = 0.0025.
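
In other words (a quick numeric illustration of the rule, not code from either repo):

# Linear Scaling Rule: keep BASE_LR / IMS_PER_BATCH constant.
# The RetinaNet-style reference config uses IMS_PER_BATCH = 16, BASE_LR = 0.01.
reference_batch, reference_lr = 16, 0.01
ims_per_batch = 4
base_lr = reference_lr * ims_per_batch / reference_batch  # -> 0.0025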

I also found a similar problem in your Yolact_fcos repo:
https://github.com/Epiphqny/Yolact_fcos/blob/b131542a930499523343d3fd660088e7e372c317/configs/Yolact/Base-FCOS.yaml#L16-L18

Though changing IMS_PER_BATCH and BASE_LR according to the Linear Scaling Rule cannot guarantee reproducing the results in the paper, I think it can help you obtain a very close result. @tianzhi0549 @Epiphqny

@Epiphqny
Owner

Epiphqny commented Apr 8, 2020

@Yuxin-CV Thank you very much for pointing that out, I will try the Linear Scaling Rule later.

@Epiphqny
Owner

Epiphqny commented Apr 8, 2020

@tianzhi0549 Sorry, I cannot find the problem in this line, could you point it out directly?

@tianzhi0549
Author

@Epiphqny I would suggest that you compute all the coordinate transformations on the scale of the input image. After you get the final relative coordinates, you can normalize them by a constant scale. Please make sure that even after normalization, the locations generating the filters are always at (0, 0).
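
A rough, self-contained sketch of what this could look like in code (not the official implementation; mask_stride, the grid sizes, the center location, and the constant norm_const are all example assumptions):

import torch

# image-scale coordinates of an H_mask x W_mask mask-feature grid (stride = mask_stride),
# shifted so that the image-scale location (cx, cy) that generated the filters is the origin,
# then normalized by a constant scale norm_const
mask_stride, H_mask, W_mask = 8, 100, 152
cx, cy, norm_const = 320.0, 240.0, 128.0  # example values

ys = torch.arange(H_mask, dtype=torch.float32) * mask_stride
xs = torch.arange(W_mask, dtype=torch.float32) * mask_stride
y_grid, x_grid = torch.meshgrid(ys, xs)  # default "ij" indexing
rel_coords = torch.stack([(x_grid - cx) / norm_const,
                          (y_grid - cy) / norm_const], dim=0)
# even after normalization, the location that generated the filters stays at (0, 0)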

@Epiphqny
Owner

Epiphqny commented Apr 8, 2020

@tianzhi0549 I have subtracted the center coordinate in

coords_feat = grid - offset_xy

and the values at the center locations are zero.

@Yuxin-CV

Yuxin-CV commented Apr 8, 2020

> @Yuxin-CV Thank you very much for pointing that out, I will try the Linear Scaling Rule later.

Personally, I think you could try the R-50 1x lr_schedule with input_size = 800 and batch_size = 16 first, before using a stronger backbone and a longer lr_schedule. You can get the results in less than 1 day if you have access to 4 or 8 GPUs.
Looking forward to your result! @Epiphqny

@Yuxin-CV

Yuxin-CV commented Apr 8, 2020

BTW, I wonder how you @tianzhi0549 implement the forward_mask() part in the official code?
Do you simply use a for loop just like @Epiphqny's implementation:

# for each image
for i in range(N):
    inds = (im_idxes == i).nonzero().flatten()
    ins_num = inds.shape[0]
    if ins_num > 0:
        controllers = controllers_pred[inds]
        mask_feat = masks_feat[None, i]
        weights1 = controllers[:, :80].reshape(-1, 8, 10).reshape(-1, 10).unsqueeze(-1).unsqueeze(-1)
        bias1 = controllers[:, 80:88].flatten()
        weights2 = controllers[:, 88:152].reshape(-1, 8, 8).reshape(-1, 8).unsqueeze(-1).unsqueeze(-1)
        bias2 = controllers[:, 152:160].flatten()
        weights3 = controllers[:, 160:168].unsqueeze(-1).unsqueeze(-1)
        bias3 = controllers[:, 168:169].flatten()
        conv1 = F.conv2d(mask_feat, weights1, bias1).relu()
        conv2 = F.conv2d(conv1, weights2, bias2, groups=ins_num).relu()
        # masks_per_image = F.conv2d(conv2, weights3, bias3, groups=ins_num)[0].sigmoid()
        masks_per_image = F.conv2d(conv2, weights3, bias3, groups=ins_num)
        masks_per_image = aligned_bilinear(masks_per_image, self.strides[0])[0].sigmoid()
        for j in range(ins_num):
            ind = inds[j]
            mask_gt = masks_t[i][matched_idxes[ind]].float()
            mask_pred = masks_per_image[j]
            mask_loss += self.dice_loss(mask_pred, mask_gt)

or some other highly optimized implementation, e.g., a CUDA kernel?

@Yuxin-CV

Yuxin-CV commented Apr 8, 2020

Hi @Epiphqny~
I also found that the mask_loss normalization factor N_pos in your code is not reduced across GPUs.

if batch_ins > 0:
    mask_loss = mask_loss / batch_ins

I think it is better to use num_pos_avg as the normalization factor, which is the average number of positive samples across different GPUs.
pos_inds = torch.nonzero(labels != num_classes).squeeze(1)
num_pos_local = pos_inds.numel()
num_gpus = get_world_size()
total_num_pos = reduce_sum(pos_inds.new_tensor([num_pos_local])).item()
num_pos_avg = max(total_num_pos / num_gpus, 1.0)

mask_loss = mask_loss / num_pos_avg

@Yuxin-CV

Yuxin-CV commented Apr 9, 2020

> @tianzhi0549 I have subtracted the center coordinate in coords_feat = grid - offset_xy, and the values at the center locations are zero.

@Epiphqny I think the rel. coord. should be location-specific, just like:

# pseudocode, for a location (x, y) on the input image:
x_range = torch.arange(W_mask)
y_range = torch.arange(H_mask)
y_grid, x_grid = torch.meshgrid(y_range, x_range)
y_rel_coord = normalize_to(y_grid - y / mask_stride, -1, 1)  # normalize_to is pseudocode
x_rel_coord = normalize_to(x_grid - x / mask_stride, -1, 1)
rel_coord = torch.cat([x_rel_coord, y_rel_coord])

@tianzhi0549 Am I right? Could you provide the official code snippet of rel. coord.? Thanks!

@Epiphqny
Owner

Epiphqny commented Apr 9, 2020

@Yuxin-CV Please modify the code and train the model, then report the result here. I will update if there is an improvement. I don't have a spare GPU to train the model now.

@Yuxin-CV

Yuxin-CV commented Apr 9, 2020

> @Yuxin-CV Please modify the code and train the model, then report the result here. I will update if there is an improvement. I don't have a spare GPU to train the model now.

OK

@tianzhi0549
Author

@Epiphqny For your information: aim-uofa/AdelaiDet#23 (comment). Thank you :-).

@Epiphqny
Owner

@tianzhi0549 OK, thanks for providing the code.

@guangdongliang

guangdongliang commented Apr 9, 2021

@tianzhi0549 I got the same result in your docker using "aligned_bilinear" and "F.interpolate"!

@chufengt

@tianzhi0549
One question about aligned_bilinear:
I noticed that other interpolation operations in detectron2 and adet require align_corners=False (e.g. image and mask resizing).
Should we change those other align_corners settings to True when using CondInst?
Thanks.
