
Could you share the code that replicates the results of the MSNEA model? #3

Closed
sty945 opened this issue Apr 24, 2023 · 4 comments

sty945 commented Apr 24, 2023

Hi friends,

I can't reproduce the results of the MSNEA model. Could you share the code that replicates the results of the MSNEA model?

hackerchenzhuo (Collaborator) commented Apr 24, 2023

> Hi friends,
>
> I can't reproduce the results of the MSNEA model. Could you share the code that replicates the results of the MSNEA model?

Sure, we used the source code of MSNEA and modified part of it to fit our settings, which involve using only the attribute types themselves rather than the content of the attribute values.

We only made simple modifications, but they may affect the model's original performance. If you have a better approach, we welcome your feedback. In addition, we would be glad to release the full code for all baseline comparisons later on; it is currently being organized.

The main changes are shown below:

import torch
import torch.nn as nn
import torch.nn.functional as F


class AttrEncoder(nn.Module):
    def __init__(self, kgs, args):
        super().__init__()
        self.args = args
        # Use the attribute multi-hot data to initialize attr_embed
        self.attr_embed = nn.Embedding.from_pretrained(torch.FloatTensor(kgs["att_features"]))
        self.fc1 = nn.Linear(kgs["att_features"].shape[1], self.args.dim)
        self.fc2 = nn.Linear(self.args.dim, self.args.dim)
        nn.init.xavier_normal_(self.fc1.weight.data)
        nn.init.xavier_normal_(self.fc2.weight.data)

    def forward(self, e_idx, e_i):
        # Project each entity's multi-hot attribute vector into the embedding space
        e_a = self.fc1(self.attr_embed(e_idx))
        # Disabled MSNEA steps: attribute-value encoding ...
        # e_v = torch.sigmoid(e_v.unsqueeze(-1)).repeat(1, 1, self.args.dim)
        # e = self.fc2(torch.cat([e_a, e_v], dim=2))
        # ... and Vision-adaptive Attribute Learning over the image feature e_i
        # alpha = F.softmax(torch.sum(e_a * e_i.unsqueeze(1), dim=-1), dim=1)
        # e = torch.sum(alpha.unsqueeze(2) * e_a, dim=1)
        return e_a
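
For reference, kgs["att_features"] is expected to be an entity-by-attribute multi-hot matrix (one row per entity, one column per attribute type). Here is a minimal sketch of how such a matrix could be built; the names ent2attrs, num_entities, and num_attrs are hypothetical placeholders, not identifiers from this repository:

import numpy as np

def build_att_features(ent2attrs, num_entities, num_attrs):
    # ent2attrs: dict mapping entity id -> iterable of attribute-type ids
    # (hypothetical placeholder names, not from this repository)
    att_features = np.zeros((num_entities, num_attrs), dtype=np.float32)
    for ent_id, attr_ids in ent2attrs.items():
        # Mark attribute presence only; attribute values are ignored
        att_features[ent_id, list(attr_ids)] = 1.0
    return att_features

# kgs["att_features"] = build_att_features(ent2attrs, num_entities, num_attrs)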

sty945 (Author) commented Apr 25, 2023

I am very pleased to receive your reply, and I look forward to the code for all baseline comparisons. MSNEA claims their image features are extracted by ResNet-50, but I could not find a pre-trained ResNet-50 model in their repository. How did you solve this problem?

hackerchenzhuo (Collaborator) commented Apr 25, 2023

@sty945 Yeah, I don't have the feature data for this part either, not even the raw images. Therefore, we did not try to reproduce the results of the original paper; instead, we experimented directly on VGG/ResNet features extracted from my dataset. To some extent, this makes for a fair comparison.
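
In case it is useful: a common recipe for extracting such features is to run each image through an ImageNet-pretrained ResNet-50 and keep the 2048-d globally pooled activations before the classifier. Below is a minimal sketch of that standard recipe using torchvision; it is an illustration, not necessarily the exact pipeline used for our dataset:

import torch
import torch.nn as nn
from PIL import Image
from torchvision import models, transforms

# ImageNet-pretrained ResNet-50 with the classifier replaced by an
# identity, so the output is the 2048-d pooled feature vector.
resnet = models.resnet50(weights="IMAGENET1K_V1")
resnet.fc = nn.Identity()
resnet.eval()

preprocess = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])

@torch.no_grad()
def image_feature(path):
    # Preprocess one image and return its 2048-d ResNet-50 feature
    img = preprocess(Image.open(path).convert("RGB")).unsqueeze(0)
    return resnet(img).squeeze(0)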

Moreover, we are currently training models on CLIP features, and we will publish this part as soon as possible, within three months.
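
For anyone curious what CLIP feature extraction could look like, here is a minimal sketch using the Hugging Face transformers CLIP implementation; the checkpoint name openai/clip-vit-base-patch32 is only an example and may differ from what we actually use:

import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

# Example checkpoint; the checkpoint we actually use may differ.
model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")
model.eval()

@torch.no_grad()
def clip_image_feature(path):
    # Preprocess one image and return CLIP's projected image embedding
    inputs = processor(images=Image.open(path).convert("RGB"), return_tensors="pt")
    return model.get_image_features(**inputs).squeeze(0)  # (512,) for this checkpoint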
