Replace add_missing_layers with add_missing_container_layers #169

Merged
merged 6 commits into from Sep 26, 2022
Conversation

mert-kurttutan (Contributor) commented Sep 24, 2022

Hi,

In this PR, I tried to get rid of add_missing_layer and make the summary primarily based on the forward call.

There was a problem, however, with simply removing add_missing_layer. It turns out that if only the forward call is used (i.e. no add_missing_layer), container modules are ignored and not included in the summary list, which produces a wrong parameter count. For instance, see the result below (obtained without add_missing_layer):

===============================================================================================
Layer (type:depth-idx)                        Output Shape              Param #
===============================================================================================
GenotypeNetwork                               --                        3,200
├─Sequential: 1-1                             [2, 48, 32, 32]           --
│    └─Conv2d: 2-1                            [2, 48, 32, 32]           1,296
│    └─BatchNorm2d: 2-2                       [2, 48, 32, 32]           96
├─ModuleList: 1                               --                        --
│    └─Cell: 2-3                              [2, 32, 32, 32]           --
│    │    └─ReLUConvBN: 3-1                   [2, 32, 32, 32]           1,600
│    │    └─ReLUConvBN: 3-2                   [2, 32, 32, 32]           1,600
===============================================================================================
Total params: 4,592
Trainable params: 4,592
Non-trainable params: 0
Total mult-adds (M): 8.95
===============================================================================================
Input size (MB): 0.02
Forward/backward pass size (MB): 3.67
Params size (MB): 0.02
Estimated Total Size (MB): 3.71
===============================================================================================

In terms of the layer hierarchy there is no problem, since that is handled by the layers_to_str function in formatting.py. But the parameter count is wrong: the leftover parameter count attributed to GenotypeNetwork itself is 3,200, whereas it should be -- (zero). This happens because ModuleList: 1 is not contained in summary_list, so Cell: 2-3 is treated as a child of Sequential: 1-1. To resolve this, I added the function add_missing_container_layers, which adds the container modules (e.g. ModuleDict or ModuleList) used in the main module (a rough sketch of the idea follows the corrected output below). Once this function is used, the result for the same case is correct:


===============================================================================================
Layer (type:depth-idx)                        Output Shape              Param #
===============================================================================================
GenotypeNetwork                               --                        --
├─Sequential: 1-1                             [2, 48, 32, 32]           --
│    └─Conv2d: 2-1                            [2, 48, 32, 32]           1,296
│    └─BatchNorm2d: 2-2                       [2, 48, 32, 32]           96
├─ModuleList: 1-2                             --                        --
│    └─Cell: 2-3                              [2, 32, 32, 32]           --
│    │    └─ReLUConvBN: 3-1                   [2, 32, 32, 32]           1,600
│    │    └─ReLUConvBN: 3-2                   [2, 32, 32, 32]           1,600
│    │    └─ModuleList: 3-3                   --                        --
===============================================================================================
Total params: 4,592
Trainable params: 4,592
Non-trainable params: 0
Total mult-adds (M): 8.95
===============================================================================================
Input size (MB): 0.02
Forward/backward pass size (MB): 3.67
Params size (MB): 0.02
Estimated Total Size (MB): 3.71
===============================================================================================
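
For illustration, here is a rough sketch of the idea behind add_missing_container_layers. This is a simplified, hypothetical version written against plain nn.Module, not the exact code in the PR; LayerInfo construction and depth bookkeeping are elided:

import torch.nn as nn

CONTAINERS = (nn.ModuleList, nn.ModuleDict)

def add_missing_container_layers(root, traced):
    """Return `traced` (a list of (qualified_name, module) pairs recorded by
    the forward hooks) with any container ancestors spliced in before their
    children, so leftover parameters get attributed to the right parent."""
    name_to_module = dict(root.named_modules())
    seen = {name for name, _ in traced}
    result = []
    for name, module in traced:
        # Walk the qualified name from the top and insert any container
        # ancestor (e.g. "block0" for "block0.in_lin1") the hooks skipped.
        parts = name.split(".")
        for depth in range(1, len(parts)):
            ancestor = ".".join(parts[:depth])
            if ancestor not in seen and isinstance(
                name_to_module.get(ancestor), CONTAINERS
            ):
                result.append((ancestor, name_to_module[ancestor]))
                seen.add(ancestor)
        result.append((name, module))
    return result

With the GenotypeNetwork example above, the ModuleList that owns Cell: 2-3 is never hooked during forward; a pass like this would splice it back into the summary list in front of its children, which is what fixes the leftover-parameter attribution.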


In addition to basing the summary on the forward pass, which as far as I can tell is beneficial, this resolves discrepancies between the order in which modules are defined in __init__ and the order in which they are called in forward. For instance, after this commit, the following test case gives the intended result; it did not before, as shown below. Note that the order in which the modules of self.block0 are defined is given by range_1, which is reversed relative to the order they are used in forward.


import torch
import torch.nn as nn
import numpy as np

from torchinfo import summary



class RecursiveTest(nn.Module):
    def __init__(self):
        super().__init__()
        self.out_lin0 = nn.Linear(128, 16)

        self.block0 = nn.ModuleDict()
        # range_1 = range(1, 4)
        range_1 = reversed(range(1, 4))
        for i in range_1:
            self.block0.add_module(f"in_lin{i}", nn.Linear(16, 16))

        self.block1 = nn.ModuleDict()
        for i in range(4, 7):
            self.block1.add_module(f"in_lin{i}", nn.Linear(16, 16))

        self.out_lin7 = nn.Linear(16, 4)

    def forward(self, x):
        x = torch.relu(self.out_lin0(x))

        for i in range(1, 4):
            x = torch.relu(self.block0[f"in_lin{i}"](x))

        # x = self.block1[f"in_lin{6}"](x)
        # x = self.block0[f"in_lin{2}"](x)

        for i in range(4, 7):
            x = torch.relu(self.block1[f"in_lin{i}"](x))

        x = torch.relu(self.out_lin7(x))

        return x


batch_size = 2
data_shape = (128,)
random_data = torch.rand((batch_size, *data_shape))
my_nn = RecursiveTest()
recursive_summary = summary(
    my_nn, 
    input_data=[random_data], 
    row_settings=('depth', 'var_names'),
    device='cpu',
)

The result before this commit (with add_missing_layer included) is shown below; the parameter count is wrong, as RecursiveTest should not be left with 816 parameters:


==========================================================================================
Layer (type (var_name):depth-idx)        Output Shape              Param #
==========================================================================================
RecursiveTest (RecursiveTest)            [2, 4]                    816
├─Linear (out_lin0): 1-1                 [2, 16]                   2,064
├─ModuleDict (block0): 1-2               --                        --
│    └─Linear (in_lin1): 2-1             [2, 16]                   272
│    └─Linear (in_lin2): 2-2             [2, 16]                   272
│    └─Linear (in_lin3): 2-3             [2, 16]                   272
├─ModuleDict (block1): 1                 --                        --
│    └─Linear (in_lin4): 2-4             [2, 16]                   272
│    └─Linear (in_lin5): 2-5             [2, 16]                   272
│    └─Linear (in_lin6): 2-6             [2, 16]                   272
├─Linear (out_lin7): 1-3                 [2, 4]                    68
==========================================================================================
Total params: 3,764
Trainable params: 3,764
Non-trainable params: 0
Total mult-adds (M): 0.01
==========================================================================================
Input size (MB): 0.00
Forward/backward pass size (MB): 0.00
Params size (MB): 0.02
Estimated Total Size (MB): 0.02
==========================================================================================

The resulting summary after this commit is:

==========================================================================================
Layer (type (var_name):depth-idx)        Output Shape              Param #
==========================================================================================
RecursiveTest (RecursiveTest)            [2, 4]                    --
├─Linear (out_lin0): 1-1                 [2, 16]                   2,064
├─ModuleDict (block0): 1-2               --                        --
│    └─Linear (in_lin1): 2-1             [2, 16]                   272
│    └─Linear (in_lin2): 2-2             [2, 16]                   272
│    └─Linear (in_lin3): 2-3             [2, 16]                   272
├─ModuleDict (block1): 1-3               --                        --
│    └─Linear (in_lin4): 2-4             [2, 16]                   272
│    └─Linear (in_lin5): 2-5             [2, 16]                   272
│    └─Linear (in_lin6): 2-6             [2, 16]                   272
├─Linear (out_lin7): 1-4                 [2, 4]                    68
==========================================================================================
Total params: 3,764
Trainable params: 3,764
Non-trainable params: 0
Total mult-adds (M): 0.01
==========================================================================================
Input size (MB): 0.00
Forward/backward pass size (MB): 0.00
Params size (MB): 0.02
Estimated Total Size (MB): 0.02
==========================================================================================

Note: as you may have noticed, I reused the tracing algorithm from the layers_to_str function in formatting.py to obtain the container modules in add_missing_container_layers.

Looking forward to your feedback.

mert-kurttutan and others added 2 commits September 25, 2022 00:27
Instead, use add_missing_container_layers
codecov bot commented Sep 24, 2022

Codecov Report

Merging #169 (abd9735) into main (70f3ad1) will decrease coverage by 0.30%.
The diff coverage is 100.00%.

@@            Coverage Diff             @@
##             main     #169      +/-   ##
==========================================
- Coverage   97.39%   97.08%   -0.31%     
==========================================
  Files           6        6              
  Lines         575      584       +9     
==========================================
+ Hits          560      567       +7     
- Misses         15       17       +2     
Impacted Files Coverage Δ
torchinfo/torchinfo.py 97.35% <100.00%> (+0.10%) ⬆️
torchinfo/formatting.py 97.56% <0.00%> (-2.44%) ⬇️


TylerYep (Owner)

Thanks for the PR! I'll need some time to look at the code more closely but the general direction is correct - a more specific add_missing_container_layers function helps readability and future debugging a lot.

I left a comment on some of the output changes. Most of them are better but one seems worse (missing MaxPool / PReLU layers, just need some clarification there - are they not used in forward at all? Only used in train mode?)

One pre-commit hook is failing, feel free to use # pylint: disable=unused-variable on that line to ignore it for now.

TylerYep linked an issue Sep 25, 2022 that may be closed by this pull request
TylerYep changed the title from "Get rid of add_missing_layer" to "Replace add_missing_layers with add_missing_container_layers" on Sep 25, 2022
TylerYep (Owner)

Looks correct to me, thanks for the contribution! I also added some additional tests to ensure it solves the problems it sets out to achieve.

TylerYep merged commit c3188cd into TylerYep:main Sep 26, 2022
mert-kurttutan (Contributor, Author) commented Sep 26, 2022

I just realized another potential improvement: since we are now adding container modules to summary_list, the problem that layers_to_str in formatting.py works around seems to be solved. The relevant piece of code in formatting.py:

    def layers_to_str(self, summary_list: list[LayerInfo]) -> str:
        """
        Print each layer of the model using a fancy branching diagram.
        This is necessary to handle Container modules that don't have explicit parents.
        """
        new_str = ""

So, I think it is no longer necessary to keep an updated hierarchy of parent modules, since all parents (including container modules) are already in the summary list.
What do you think?
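
To make the suggestion concrete, here is a rough, standalone sketch of what the simplified rendering could look like. This is hypothetical: the attribute names class_name and depth are assumptions about LayerInfo, and the real layers_to_str formats far more columns than this:

from collections import namedtuple

# Stand-in for torchinfo's LayerInfo; attribute names are assumptions.
FakeLayer = namedtuple("FakeLayer", ["class_name", "depth"])

def layers_to_str(summary_list):
    """Render the tree from each layer's own depth; since container modules
    are now in summary_list, no missing-parent reconstruction is needed."""
    lines = []
    for layer in summary_list:
        # Indentation is derived purely from depth (the root model is depth 0).
        indent = "│    " * max(layer.depth - 1, 0)
        prefix = "├─" if layer.depth >= 1 else ""
        lines.append(f"{indent}{prefix}{layer.class_name}")
    return "\n".join(lines)

print(layers_to_str([FakeLayer("RecursiveTest", 0), FakeLayer("ModuleDict", 1), FakeLayer("Linear", 2)]))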

TylerYep (Owner)

Yep, simplifying that code and making layers_to_str extra simple sounds like a win to me!

Successfully merging this pull request may close these issues:

Output incorrect when using nn.ModuleList