A Python script for at-a-glance net summary #3090

Merged: 1 commit into BVLC:master on Dec 9, 2015

Conversation

@longjon (Contributor) commented on Sep 19, 2015

So you've got complicated nets, perhaps generated by pycaffe's net spec, resulting in lengthy and intricate prototxt files. You code carefully, but sometimes you make mistakes... mistakes that result in training the wrong net, mistakes that just weren't obvious from the pages of Python, the rapidly scrolling log files, the endlessly verbose prototxt, the loss that's still kinda falling, the results that are just mediocre...

The aim of this tool is to provide a concise but comprehensive listing of the computation as Caffe sees it, so that you can tell at a glance, among other things:

  • where your losses are, and how they're weighted
  • if you have disconnected bottoms, or disconnected tops
  • which layers you've turned learning/finetuning on or off for
  • the basic connectivity structure and parameters of your net
  • which layers are in-placed

Here's some example output, for "CaffeNet"'s deploy.prototxt:

$ $CAFFE_ROOT/tools/extra/summarize.py deploy.prototxt
conv1 Convolution   data  -> conv1 11/4 96
relu1 ReLU          conv1 -> conv1
pool1 Pooling       conv1 -> pool1 3/2
norm1 LRN           pool1 -> norm1
conv2 Convolution   norm1 -> conv2 5+2 256/2
relu2 ReLU          conv2 -> conv2
pool2 Pooling       conv2 -> pool2 3/2
norm2 LRN           pool2 -> norm2
conv3 Convolution   norm2 -> conv3 3+1 384
relu3 ReLU          conv3 -> conv3
conv4 Convolution   conv3 -> conv4 3+1 384/2
relu4 ReLU          conv4 -> conv4
conv5 Convolution   conv4 -> conv5 3+1 256/2
relu5 ReLU          conv5 -> conv5
pool5 Pooling       conv5 -> pool5 3/2
fc6   InnerProduct  pool5 -> fc6
relu6 ReLU          fc6   -> fc6
drop6 Dropout       fc6   -> fc6
fc7   InnerProduct  fc6   -> fc7
relu7 ReLU          fc7   -> fc7
drop7 Dropout       fc7   -> fc7
fc8   InnerProduct  fc7   -> fc8
prob  Softmax       fc8   -> prob
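
For reference, a listing like this boils down to walking the layers of the parsed NetParameter; here's a minimal sketch (not the actual summarize.py, just an illustration using the standard pycaffe protobuf bindings):

from caffe.proto import caffe_pb2
from google.protobuf import text_format

def list_layers(prototxt_path):
    # Parse the prototxt into a NetParameter message.
    net = caffe_pb2.NetParameter()
    with open(prototxt_path) as f:
        text_format.Merge(f.read(), net)
    # Print one line per layer: name, type, bottoms -> tops.
    for layer in net.layer:
        print('{:<10} {:<15} {:<12} -> {}'.format(
            layer.name, layer.type,
            ', '.join(layer.bottom), ', '.join(layer.top)))

list_layers('deploy.prototxt')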

Now for train_val.prototxt, showing LR and decay multipliers:

$ $CAFFE_ROOT/tools/extra/summarize.py train_val.prototxt
data     Data                                      -> data, label
data     Data                                      -> data, label
conv1    Convolution     (, x2.0 Dx0.0) data       -> conv1       11/4 96
relu1    ReLU                           conv1      -> conv1
pool1    Pooling                        conv1      -> pool1       3/2
norm1    LRN                            pool1      -> norm1
conv2    Convolution     (, x2.0 Dx0.0) norm1      -> conv2       5+2 256/2
relu2    ReLU                           conv2      -> conv2
pool2    Pooling                        conv2      -> pool2       3/2
norm2    LRN                            pool2      -> norm2
conv3    Convolution     (, x2.0 Dx0.0) norm2      -> conv3       3+1 384
relu3    ReLU                           conv3      -> conv3
conv4    Convolution     (, x2.0 Dx0.0) conv3      -> conv4       3+1 384/2
relu4    ReLU                           conv4      -> conv4
conv5    Convolution     (, x2.0 Dx0.0) conv4      -> conv5       3+1 256/2
relu5    ReLU                           conv5      -> conv5
pool5    Pooling                        conv5      -> pool5       3/2
fc6      InnerProduct    (, x2.0 Dx0.0) pool5      -> fc6
relu6    ReLU                           fc6        -> fc6
drop6    Dropout                        fc6        -> fc6
fc7      InnerProduct    (, x2.0 Dx0.0) fc6        -> fc7
relu7    ReLU                           fc7        -> fc7
drop7    Dropout                        fc7        -> fc7
fc8      InnerProduct    (, x2.0 Dx0.0) fc7        -> fc8
accuracy Accuracy                       fc8, label -> accuracy
loss     SoftmaxWithLoss                fc8, label -> loss
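
The "(, x2.0 Dx0.0)" annotations are the per-parameter LR and decay multipliers read from each layer's param entries (weights first, then biases). A hedged guess at how such a string could be formatted: lr_mult and decay_mult both default to 1.0, so only non-default values are shown.

def format_mults(layer):
    # One part per ParamSpec in layer.param; an empty part means defaults.
    parts = []
    for spec in layer.param:
        s = ''
        if spec.lr_mult != 1.0:
            s += 'x{}'.format(spec.lr_mult)
        if spec.decay_mult != 1.0:
            s += ' Dx{}'.format(spec.decay_mult)
        parts.append(s.strip())
    return '({})'.format(', '.join(parts)) if parts else ''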

And the example for which I wrote this, showing finetuning of certain layers, Python layers, and loss weight:

data       net.CustomDataLayer                                  -> data
conv1_1    Convolution               (x0.0, x0.0) data          -> conv1_1             3 64
relu1_1    ReLU                                   conv1_1       -> conv1_1
conv1_2    Convolution               (x0.0, x0.0) conv1_1       -> conv1_2             3+1 64
relu1_2    ReLU                                   conv1_2       -> conv1_2
pool1      Pooling                                conv1_2       -> pool1               2/2
conv2_1    Convolution               (x0.0, x0.0) pool1         -> conv2_1             3+1 128
relu2_1    ReLU                                   conv2_1       -> conv2_1
conv2_2    Convolution               (x0.0, x0.0) conv2_1       -> conv2_2             3+1 128
relu2_2    ReLU                                   conv2_2       -> conv2_2
pool2      Pooling                                conv2_2       -> pool2               2/2
conv3_1    Convolution               (x0.0, x0.0) pool2         -> conv3_1             3+1 256
relu3_1    ReLU                                   conv3_1       -> conv3_1
conv3_2    Convolution               (x0.0, x0.0) conv3_1       -> conv3_2             3+1 256
relu3_2    ReLU                                   conv3_2       -> conv3_2
conv3_3    Convolution               (x0.0, x0.0) conv3_2       -> conv3_3             3+1 256
relu3_3    ReLU                                   conv3_3       -> conv3_3
pool3      Pooling                                conv3_3       -> pool3               2/2
conv4_1    Convolution               (x0.0, x0.0) pool3         -> conv4_1             3+1 512
relu4_1    ReLU                                   conv4_1       -> conv4_1
conv4_2    Convolution               (x0.0, x0.0) conv4_1       -> conv4_2             3+1 512
relu4_2    ReLU                                   conv4_2       -> conv4_2
conv4_3    Convolution               (x0.0, x0.0) conv4_2       -> conv4_3             3+1 512
relu4_3    ReLU                                   conv4_3       -> conv4_3
pool4      Pooling                                conv4_3       -> pool4               2/2
conv5_1    Convolution                            pool4         -> conv5_1             3+1 512
relu5_1    ReLU                                   conv5_1       -> conv5_1
conv5_2    Convolution                            conv5_1       -> conv5_2             3+1 512
relu5_2    ReLU                                   conv5_2       -> conv5_2
conv5_3    Convolution                            conv5_2       -> conv5_3             3+1 512
relu5_3    ReLU                                   conv5_3       -> conv5_3
pool5      Pooling                                conv5_3       -> pool5               2/2
fc6        Convolution                            pool5         -> fc6                 7+3 4096
relu6      ReLU                                   fc6           -> fc6
drop6      Dropout                                fc6           -> drop6
fc7        Convolution                            drop6         -> fc7                 1 4096
relu7      ReLU                                   fc7           -> fc7
drop7      Dropout                                fc7           -> drop7
up7        Deconvolution             (x0.0)       drop7         -> up7                 4/2 4096/4096
crop7      Crop                                   up7, conv5_3  -> crop7
score      Convolution                            crop7         -> score               1 21
labels     net.CustomLabelsLayer                  conv5_3       -> labels
score_loss SoftmaxWithLoss                        score, labels -> 1000.0 * score_loss
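
Two things in this listing come straight from fields in the prototxt: Python layers are shown by their python_param module and class (e.g. net.CustomDataLayer) rather than the bare type "Python", and an explicit loss_weight is prefixed to the top name (1000.0 * score_loss). A rough sketch of both, with hypothetical helper names:

def display_type(layer):
    # Show module.ClassName for Python layers; the bare type otherwise.
    if layer.type == 'Python':
        return '{}.{}'.format(layer.python_param.module, layer.python_param.layer)
    return layer.type

def display_tops(layer):
    # Prefix a top with its loss_weight when one was set explicitly.
    tops = []
    for i, top in enumerate(layer.top):
        if i < len(layer.loss_weight) and layer.loss_weight[i] != 1.0:
            tops.append('{} * {}'.format(layer.loss_weight[i], top))
        else:
            tops.append(top)
    return ', '.join(tops)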

The actual output uses ANSI color codes, making connectivity more obvious, and pointing out which bottoms and tops are disconnected, e.g., for CaffeNet train_val:
[Screenshot: colorized summary output for CaffeNet train_val, 2015-09-19]

The colors were picked to stand out for a deuteranomalous person using a white-on-black terminal (i.e., me); they might not work for you and aren't very aesthetic (patches welcome!). (Of course, seeing the actual graph would be even better, but that's also more challenging to display concisely and clearly on a terminal.)
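
The disconnection check itself is simple in principle: a bottom that no layer (or net-level input) produces, or a top that no layer consumes, gets highlighted. A small sketch of that check plus ANSI escapes, with illustrative colors only (not necessarily the scheme used here):

RED, YELLOW, RESET = '\033[91m', '\033[93m', '\033[0m'

def find_disconnected(net):
    # Blobs produced by some layer, or declared as net-level inputs.
    produced = set(net.input)
    for layer in net.layer:
        produced.update(layer.top)
    # Blobs consumed by some layer.
    consumed = set()
    for layer in net.layer:
        consumed.update(layer.bottom)
    # Dangling bottoms are read but never written; dangling tops the reverse.
    return consumed - produced, produced - consumed

def colorize(blob, dangling_bottoms, dangling_tops):
    if blob in dangling_bottoms:
        return RED + blob + RESET
    if blob in dangling_tops:
        return YELLOW + blob + RESET
    return blob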

This is a first take. Many things that would be nice to display aren't displayed yet (e.g., note the lack of phase information above). I'll probably dogfood this along a bit (but no promises!); feedback and patches are welcome.

@Yangqing (Member) commented on Dec 9, 2015

(chatted offline, all loved it, merging :))

Yangqing added a commit that referenced this pull request on Dec 9, 2015: "A Python script for at-a-glance net summary"
@Yangqing merged commit 03a00e8 into BVLC:master on Dec 9, 2015
@watts4speed commented
Love the script! By the way, what would be really cool is if you broke out the TRAIN and TEST networks.
