
[NN] Refactor the Code Structure of GT #5100

Merged: 15 commits into dmlc:master on Apr 14, 2023

Conversation

rudongyu
Collaborator

@rudongyu rudongyu commented Jan 4, 2023

Description

  1. Refactor the code structure of the graph transformer utilities.
  2. Fix the attention NaN pollution caused by rows of attn_bias that are entirely -inf (Padded Tensor should not be full '-inf' In Spatial Encoder for Graphormer #5476); see the sketch after this list.
    • attn_bias produced by SpatialEncoder or PathEncoder is now padded with zeros
    • add a note for attn_mask in BiasedMHA to avoid masking all positions in a row
  3. Fix the scaling factor bug in BiasedMHA: change / self.scaling to * self.scaling.
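
For context, a minimal PyTorch sketch of both numerical issues (illustrative only, not the PR's code; the tensors and head_dim are made up, and scaling = head_dim ** -0.5 is assumed, as is conventional for scaled dot-product attention):

import torch

# Issue 2: a row of attn_bias that is entirely -inf makes softmax return NaN,
# because exp(-inf) sums to zero and 0/0 is undefined.
all_masked = torch.full((4,), float("-inf"))
print(torch.softmax(all_masked, dim=-1))  # tensor([nan, nan, nan, nan])

# Padding the bias with zeros instead keeps the softmax well defined:
padded = torch.tensor([0.0, 0.0, float("-inf"), float("-inf")])
print(torch.softmax(padded, dim=-1))  # tensor([0.5000, 0.5000, 0.0000, 0.0000])

# Issue 3: with scaling = head_dim ** -0.5, attention should multiply by the
# factor. Dividing by it (the old bug) scales the logits up by sqrt(head_dim).
head_dim = 64
scaling = head_dim ** -0.5
q = torch.randn(2, head_dim)
q_scaled_correctly = q * scaling  # equivalent to q / sqrt(head_dim)
q_scaled_by_bug = q / scaling     # equivalent to q * sqrt(head_dim)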

Checklist

Please feel free to remove inapplicable items for your PR.

  • The PR title starts with [$CATEGORY] (such as [NN], [Model], [Doc], [Feature])
  • I've leveraged the tools to beautify the Python and C++ code.
  • The PR is complete and small. Read the Google engineering practice (a CL is equivalent to a PR) to understand more about small PRs. In DGL, we consider PRs with fewer than 200 lines of core code change to be small (examples, tests, and documentation may be exempted).
  • All changes have test coverage
  • Code is well-documented
  • To the best of my knowledge, examples are either not affected by this change, or have been fixed to be compatible with this change
  • Related issue is referenced in this PR
  • If the PR is for a new model/paper, I've updated the example index here.

@dgl-bot
Collaborator

dgl-bot commented Jan 4, 2023

To trigger regression tests:

  • @dgl-bot run [instance-type] [which tests] [compare-with-branch];
    For example: @dgl-bot run g4dn.4xlarge all dmlc/master or @dgl-bot run c5.9xlarge kernel,api dmlc/master

@dgl-bot
Collaborator

dgl-bot commented Jan 4, 2023

Commit ID: c1acd25

Build ID: 1

Status: ❌ CI test failed in Stage [Lint Check].

Report path: link

Full logs path: link

@dgl-bot
Collaborator

dgl-bot commented Jan 4, 2023

Commit ID: 4901d46

Build ID: 2

Status: ❌ CI test failed in Stage [Lint Check].

Report path: link

Full logs path: link

@rudongyu rudongyu changed the title from [Refactor] Refactor the Code Structure of GT to [NN] Refactor the Code Structure of GT on Jan 4, 2023
@dgl-bot
Collaborator

dgl-bot commented Jan 30, 2023

Commit ID: 2e675d1905e0f025d04c10f823e13d8b1b03b6c5

Build ID: 3

Status: ✅ CI test succeeded.

Report path: link

Full logs path: link

@dgl-bot
Collaborator

dgl-bot commented Mar 14, 2023

Commit ID: f8243b0

Build ID: 4

Status: ⚪️ CI test cancelled due to overrun.

Report path: link

Full logs path: link

@dgl-bot
Collaborator

dgl-bot commented Mar 14, 2023

Commit ID: 5c3584a

Build ID: 5

Status: ✅ CI test succeeded.

Report path: link

Full logs path: link

@rudongyu rudongyu marked this pull request as ready for review March 15, 2023 00:55
@dgl-bot
Collaborator

dgl-bot commented Mar 15, 2023

Commit ID: cba8ab7cdfa4b5749aa64af489fc12cd26ed0c58

Build ID: 6

Status: ✅ CI test succeeded.

Report path: link

Full logs path: link

@rudongyu rudongyu requested review from mufeili and frozenbugs March 29, 2023 06:04
@mufeili
Member

mufeili commented Mar 29, 2023

This PR is quite large. Is it possible to break it into multiple smaller PRs?

@dgl-bot
Collaborator

dgl-bot commented Mar 29, 2023

Commit ID: 9561b9684165f1a1900f85f901ca1a9ce8d3ebf7

Build ID: 7

Status: ❌ CI test failed in Stage [Distributed Torch CPU Unit test].

Report path: link

Full logs path: link

@rudongyu
Collaborator Author

> This PR is quite large. Is it possible to break it into multiple smaller PRs?

Most modifications just move existing utilities around for a better code structure. The main functional changes are the fixes listed in the PR description above.

@dgl-bot
Collaborator

dgl-bot commented Mar 30, 2023

Commit ID: ef59998

Build ID: 8

Status: ✅ CI test succeeded.

Report path: link

Full logs path: link

>>> import dgl
>>> from dgl import LaplacianPE
>>> from dgl.nn import LapPosEncoder
Member

Why did you want to break the consistency of using Laplacian between the PE transform (LaplacianPE) and the PosEncoder (LapPosEncoder)?

Collaborator

+1

Collaborator Author

My consideration here is to keep consistency with SpatialEncoder, DegreeEncoder, etc. However, LaplacianPosEncoder seems too long. Any suggestions?

Member

Alternatively, you can rename LaplacianPE to LapPE. To maintain backward compatibility, consider having an alias for now and adding a deprecation warning.
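
A sketch of how the suggested alias plus deprecation warning could look (the LapPE name and the simplified constructor signature are hypothetical, following the suggestion above; not what the PR actually did):

import warnings


class LapPE:
    """The transform under its shorter name (body elided, signature simplified)."""

    def __init__(self, k, feat_name="PE"):
        self.k = k
        self.feat_name = feat_name


class LaplacianPE(LapPE):
    """Deprecated alias of LapPE, kept for backward compatibility."""

    def __init__(self, *args, **kwargs):
        warnings.warn(
            "LaplacianPE is deprecated; use LapPE instead.",
            DeprecationWarning,
            stacklevel=2,
        )
        super().__init__(*args, **kwargs)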

            self.pe_encoder = nn.TransformerEncoder(
                encoder_layer, num_layers=num_layer
            )
        elif self.model_type == "DeepSet":
Member

Is this different from SetTransformerEncoder here?
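
For context: a DeepSet encoder is attention-free, applying an elementwise MLP, sum-pooling over the set, then a second MLP, whereas SetTransformerEncoder is built from attention blocks. A minimal sketch of the DeepSet idea (illustrative, not DGL's implementation):

import torch
import torch.nn as nn


class DeepSetEncoder(nn.Module):
    # Permutation-invariant set encoder: rho(sum_i phi(x_i)).
    def __init__(self, in_dim, hidden_dim, out_dim):
        super().__init__()
        self.phi = nn.Sequential(nn.Linear(in_dim, hidden_dim), nn.ReLU())
        self.rho = nn.Linear(hidden_dim, out_dim)

    def forward(self, x):  # x: (batch, set_size, in_dim)
        return self.rho(self.phi(x).sum(dim=1))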

Member

@mufeili mufeili left a comment

I've done a pass. For a sanity check, did you try the refactored code on an end-to-end example?

Collaborator

@frozenbugs frozenbugs left a comment

I've done a pass. There are still two files I did not review; please update them according to the comments on the existing files.

import torch as th
import torch.nn as nn

from .... import to_homogeneous
Collaborator

Can we use absolute imports instead of relative ones?

Collaborator Author

Absolute imports are indeed recommended over relative ones. However, relative imports are currently the convention across this codebase, so refactoring all of them may need further discussion.
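
For illustration, the two styles side by side (the absolute form assumes to_homogeneous is re-exported at the top-level dgl package, which the four-dot relative import implies):

# Relative, as in this PR: four dots walk up from
# python/dgl/nn/pytorch/gt/ to the top-level dgl package.
from .... import to_homogeneous

# Absolute equivalent:
from dgl import to_homogeneous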

Additional review threads on python/dgl/nn/pytorch/gt/degree_enc.py and python/dgl/nn/pytorch/gt/biased_mha.py were resolved.
        where :math:`N` is the number of nodes in the input graph,
        :math:`d` is :attr:`lpe_dim`.
        """
        pos_enc = th.cat(
Collaborator

What does enc in pos_enc stand for, encoding or encoder? Please avoid unconventional abbreviations.

Collaborator Author

Modified to pos_encoding.

            encoder_layer = nn.TransformerEncoderLayer(
                d_model=lpe_dim, nhead=n_head, batch_first=True
            )
            self.pe_encoder = nn.TransformerEncoder(
Collaborator

What does pe in pe_encoder stand for?

Collaborator Author

pe stands for positional encoding, which is a common abbreviation in this context.

Additional review threads on python/dgl/nn/pytorch/gt/lap_enc.py were resolved.
@dgl-bot
Collaborator

dgl-bot commented Mar 31, 2023

Commit ID: 7031161a555cb4cfd284266d0295ff23ce4cee3c

Build ID: 9

Status: ❌ CI test failed in Stage [Lint Check].

Report path: link

Full logs path: link

@dgl-bot
Collaborator

dgl-bot commented Apr 12, 2023

Commit ID: c9d9901728947de0df1d9e0b9f6b562471ba9bd8

Build ID: 10

Status: ❌ CI test failed in Stage [Torch CPU (Win64) Unit test].

Report path: link

Full logs path: link

@dgl-bot
Collaborator

dgl-bot commented Apr 12, 2023

Commit ID: 752d302a04900314606368e33e0efab7a64e9c20

Build ID: 11

Status: ⚪️ CI test cancelled due to overrun.

Report path: link

Full logs path: link

@dgl-bot
Collaborator

dgl-bot commented Apr 12, 2023

Commit ID: ef0b374c22a0ff431692f5f44562a86a8945c401

Build ID: 12

Status: ✅ CI test succeeded.

Report path: link

Full logs path: link

Member

@mufeili mufeili left a comment

I'm good.

Collaborator

@frozenbugs frozenbugs left a comment

Please apply the lint suggestion.

@dgl-bot
Collaborator

dgl-bot commented Apr 13, 2023

Commit ID: 95793078a6625aba0c64fb757d424578b946956e

Build ID: 13

Status: ⚪️ CI test cancelled due to overrun.

Report path: link

Full logs path: link

@dgl-bot
Collaborator

dgl-bot commented Apr 13, 2023

Commit ID: 7197357

Build ID: 14

Status: ❌ CI test failed in Stage [Distributed Torch CPU Unit test].

Report path: link

Full logs path: link

@rudongyu
Collaborator Author

@dgl-bot

@dgl-bot
Collaborator

dgl-bot commented Apr 13, 2023

Commit ID: 7197357

Build ID: 15

Status: ❌ CI test failed in Stage [Distributed Torch CPU Unit test].

Report path: link

Full logs path: link

@rudongyu
Collaborator Author

@dgl-bot

@dgl-bot
Collaborator

dgl-bot commented Apr 13, 2023

Commit ID: 7197357

Build ID: 16

Status: ✅ CI test succeeded.

Report path: link

Full logs path: link

@rudongyu rudongyu merged commit bb1f885 into dmlc:master Apr 14, 2023
DominikaJedynak pushed a commit to DominikaJedynak/dgl that referenced this pull request Mar 12, 2024