
dynamic input #2355

Closed
nazarblch opened this issue Apr 23, 2015 · 2 comments

@nazarblch

I ran a test with the MNIST example using batch_size: 1 to make sure that reshaping works correctly. But in this mode (batch_size: 1) the classifier reaches very low accuracy. Can you explain this result, and is it possible to fix the problem by changing the solver configuration parameters?
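
For reference, the only change from the stock example is the training batch size in the TRAIN data layer of lenet_train_test.prototxt (a minimal sketch of the edit; the stock value of 64 comes from the upstream example):

data_param {
source: "/home/nazar/caffe/examples/mnist/mnist_train_lmdb"
batch_size: 1 # reduced from the stock 64
backend: LMDB
}

The full net and solver settings are dumped in the log below.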

I0423 16:16:02.053493 32732 caffe.cpp:99] Use GPU with device ID 0
I0423 16:16:02.189038 32732 caffe.cpp:107] Starting Optimization
I0423 16:16:02.189163 32732 solver.cpp:32] Initializing solver from parameters:
test_iter: 100
test_interval: 10000
base_lr: 0.01
display: 5000
max_iter: 100000
lr_policy: "inv"
gamma: 1e-06
power: 0.98
momentum: 0.9
weight_decay: 0.0005
snapshot: 100000
snapshot_prefix: "/home/nazar/caffe/examples/mnist/lenet"
solver_mode: GPU
net: "/home/nazar/caffe/examples/mnist/lenet_train_test.prototxt"
I0423 16:16:02.189330 32732 solver.cpp:70] Creating training net from net file: /home/nazar/caffe/examples/mnist/lenet_train_test.prototxt
I0423 16:16:02.189879 32732 net.cpp:253] The NetState phase (0) differed from the phase (1) specified by a rule in layer mnist
I0423 16:16:02.189905 32732 net.cpp:253] The NetState phase (0) differed from the phase (1) specified by a rule in layer accuracy
I0423 16:16:02.189988 32732 net.cpp:42] Initializing net from parameters:
name: "LeNet"
state {
phase: TRAIN
}
layer {
name: "mnist"
type: "Data"
top: "data"
top: "label"
include {
phase: TRAIN
}
transform_param {
scale: 0.00390625
}
data_param {
source: "/home/nazar/caffe/examples/mnist/mnist_train_lmdb"
batch_size: 1
backend: LMDB
}
}
layer {
name: "conv1"
type: "Convolution"
bottom: "data"
top: "conv1"
param {
lr_mult: 1
}
param {
lr_mult: 2
}
convolution_param {
num_output: 20
kernel_size: 5
stride: 1
weight_filler {
type: "xavier"
}
bias_filler {
type: "constant"
}
}
}
layer {
name: "pool1"
type: "Pooling"
bottom: "conv1"
top: "pool1"
pooling_param {
pool: MAX
kernel_size: 2
stride: 2
}
}
layer {
name: "conv2"
type: "Convolution"
bottom: "pool1"
top: "conv2"
param {
lr_mult: 1
}
param {
lr_mult: 2
}
convolution_param {
num_output: 50
kernel_size: 5
stride: 1
weight_filler {
type: "xavier"
}
bias_filler {
type: "constant"
}
}
}
layer {
name: "pool2"
type: "Pooling"
bottom: "conv2"
top: "pool2"
pooling_param {
pool: MAX
kernel_size: 2
stride: 2
}
}
layer {
name: "ip1"
type: "InnerProduct"
bottom: "pool2"
top: "ip1"
param {
lr_mult: 1
}
param {
lr_mult: 2
}
inner_product_param {
num_output: 500
weight_filler {
type: "xavier"
}
bias_filler {
type: "constant"
}
}
}
layer {
name: "relu1"
type: "ReLU"
bottom: "ip1"
top: "ip1"
}
layer {
name: "ip2"
type: "InnerProduct"
bottom: "ip1"
top: "ip2"
param {
lr_mult: 1
}
param {
lr_mult: 2
}
inner_product_param {
num_output: 10
weight_filler {
type: "xavier"
}
bias_filler {
type: "constant"
}
}
}
layer {
name: "loss"
type: "SoftmaxWithLoss"
bottom: "ip2"
bottom: "label"
top: "loss"
}
I0423 16:16:02.190436 32732 net.cpp:59] Memory required for data: 0
I0423 16:16:02.190553 32732 layer_factory.hpp:74] Creating layer mnist
I0423 16:16:02.190611 32732 net.cpp:76] Creating Layer mnist
I0423 16:16:02.190650 32732 net.cpp:334] mnist -> data
I0423 16:16:02.190753 32732 net.cpp:334] mnist -> label
I0423 16:16:02.190780 32732 net.cpp:105] Setting up mnist
I0423 16:16:02.190870 32732 db.cpp:34] Opened lmdb /home/nazar/caffe/examples/mnist/mnist_train_lmdb
I0423 16:16:02.190929 32732 data_layer.cpp:67] output data size: 1,1,28,28
I0423 16:16:02.190950 32732 base_data_layer.cpp:43] Initializing prefetch
I0423 16:16:02.191025 32732 base_data_layer.cpp:45] Prefetch initialized.
I0423 16:16:02.191058 32732 net.cpp:112] Top shape: 1 1 28 28 (784)
I0423 16:16:02.191066 32732 net.cpp:112] Top shape: 1 1 1 1 (1)
I0423 16:16:02.191079 32732 net.cpp:122] Memory required for data: 3140
I0423 16:16:02.191089 32732 layer_factory.hpp:74] Creating layer conv1
I0423 16:16:02.191133 32732 net.cpp:76] Creating Layer conv1
I0423 16:16:02.191148 32732 net.cpp:372] conv1 <- data
I0423 16:16:02.191231 32732 net.cpp:334] conv1 -> conv1
I0423 16:16:02.191253 32732 net.cpp:105] Setting up conv1
I0423 16:16:02.191648 32732 net.cpp:112] Top shape: 1 20 24 24 (11520)
I0423 16:16:02.191665 32732 net.cpp:122] Memory required for data: 49220
I0423 16:16:02.191720 32732 layer_factory.hpp:74] Creating layer pool1
I0423 16:16:02.191756 32732 net.cpp:76] Creating Layer pool1
I0423 16:16:02.191766 32732 net.cpp:372] pool1 <- conv1
I0423 16:16:02.191786 32732 net.cpp:334] pool1 -> pool1
I0423 16:16:02.191802 32732 net.cpp:105] Setting up pool1
I0423 16:16:02.191820 32732 net.cpp:112] Top shape: 1 20 12 12 (2880)
I0423 16:16:02.191828 32732 net.cpp:122] Memory required for data: 60740
I0423 16:16:02.191833 32732 layer_factory.hpp:74] Creating layer conv2
I0423 16:16:02.191848 32732 net.cpp:76] Creating Layer conv2
I0423 16:16:02.191856 32732 net.cpp:372] conv2 <- pool1
I0423 16:16:02.191874 32732 net.cpp:334] conv2 -> conv2
I0423 16:16:02.191892 32732 net.cpp:105] Setting up conv2
I0423 16:16:02.192935 32732 net.cpp:112] Top shape: 1 50 8 8 (3200)
I0423 16:16:02.192946 32732 net.cpp:122] Memory required for data: 73540
I0423 16:16:02.192965 32732 layer_factory.hpp:74] Creating layer pool2
I0423 16:16:02.192984 32732 net.cpp:76] Creating Layer pool2
I0423 16:16:02.192993 32732 net.cpp:372] pool2 <- conv2
I0423 16:16:02.193009 32732 net.cpp:334] pool2 -> pool2
I0423 16:16:02.193027 32732 net.cpp:105] Setting up pool2
I0423 16:16:02.193040 32732 net.cpp:112] Top shape: 1 50 4 4 (800)
I0423 16:16:02.193048 32732 net.cpp:122] Memory required for data: 76740
I0423 16:16:02.193053 32732 layer_factory.hpp:74] Creating layer ip1
I0423 16:16:02.193069 32732 net.cpp:76] Creating Layer ip1
I0423 16:16:02.193078 32732 net.cpp:372] ip1 <- pool2
I0423 16:16:02.193094 32732 net.cpp:334] ip1 -> ip1
I0423 16:16:02.193109 32732 net.cpp:105] Setting up ip1
I0423 16:16:02.208894 32732 net.cpp:112] Top shape: 1 500 1 1 (500)
I0423 16:16:02.208912 32732 net.cpp:122] Memory required for data: 78740
I0423 16:16:02.208936 32732 layer_factory.hpp:74] Creating layer relu1
I0423 16:16:02.208967 32732 net.cpp:76] Creating Layer relu1
I0423 16:16:02.208977 32732 net.cpp:372] relu1 <- ip1
I0423 16:16:02.209012 32732 net.cpp:323] relu1 -> ip1 (in-place)
I0423 16:16:02.209025 32732 net.cpp:105] Setting up relu1
I0423 16:16:02.209033 32732 net.cpp:112] Top shape: 1 500 1 1 (500)
I0423 16:16:02.209041 32732 net.cpp:122] Memory required for data: 80740
I0423 16:16:02.209048 32732 layer_factory.hpp:74] Creating layer ip2
I0423 16:16:02.209064 32732 net.cpp:76] Creating Layer ip2
I0423 16:16:02.209071 32732 net.cpp:372] ip2 <- ip1
I0423 16:16:02.209087 32732 net.cpp:334] ip2 -> ip2
I0423 16:16:02.209103 32732 net.cpp:105] Setting up ip2
I0423 16:16:02.209425 32732 net.cpp:112] Top shape: 1 10 1 1 (10)
I0423 16:16:02.209434 32732 net.cpp:122] Memory required for data: 80780
I0423 16:16:02.209455 32732 layer_factory.hpp:74] Creating layer loss
I0423 16:16:02.209488 32732 net.cpp:76] Creating Layer loss
I0423 16:16:02.209506 32732 net.cpp:372] loss <- ip2
I0423 16:16:02.209527 32732 net.cpp:372] loss <- label
I0423 16:16:02.209561 32732 net.cpp:334] loss -> loss
I0423 16:16:02.209596 32732 net.cpp:105] Setting up loss
I0423 16:16:02.209614 32732 layer_factory.hpp:74] Creating layer loss
I0423 16:16:02.209645 32732 net.cpp:112] Top shape: 1 1 1 1 (1)
I0423 16:16:02.209652 32732 net.cpp:118] with loss weight 1
I0423 16:16:02.209671 32732 net.cpp:122] Memory required for data: 80784
I0423 16:16:02.209678 32732 net.cpp:163] loss needs backward computation.
I0423 16:16:02.209686 32732 net.cpp:163] ip2 needs backward computation.
I0423 16:16:02.209693 32732 net.cpp:163] relu1 needs backward computation.
I0423 16:16:02.209698 32732 net.cpp:163] ip1 needs backward computation.
I0423 16:16:02.209704 32732 net.cpp:163] pool2 needs backward computation.
I0423 16:16:02.209710 32732 net.cpp:163] conv2 needs backward computation.
I0423 16:16:02.209717 32732 net.cpp:163] pool1 needs backward computation.
I0423 16:16:02.209722 32732 net.cpp:163] conv1 needs backward computation.
I0423 16:16:02.209731 32732 net.cpp:165] mnist does not need backward computation.
I0423 16:16:02.209740 32732 net.cpp:201] This network produces output loss
I0423 16:16:02.209767 32732 net.cpp:446] Collecting Learning Rate and Weight Decay.
I0423 16:16:02.209781 32732 net.cpp:213] Network initialization done.
I0423 16:16:02.209787 32732 net.cpp:214] Memory required for data: 80784
I0423 16:16:02.210162 32732 solver.cpp:154] Creating test net (#0) specified by net file: /home/nazar/caffe/examples/mnist/lenet_train_test.prototxt
I0423 16:16:02.210249 32732 net.cpp:253] The NetState phase (1) differed from the phase (0) specified by a rule in layer mnist
I0423 16:16:02.210396 32732 net.cpp:42] Initializing net from parameters:
name: "LeNet"
state {
phase: TEST
}
layer {
name: "mnist"
type: "Data"
top: "data"
top: "label"
include {
phase: TEST
}
transform_param {
scale: 0.00390625
}
data_param {
source: "/home/nazar/caffe/examples/mnist/mnist_test_lmdb"
batch_size: 100
backend: LMDB
}
}
layer {
name: "conv1"
type: "Convolution"
bottom: "data"
top: "conv1"
param {
lr_mult: 1
}
param {
lr_mult: 2
}
convolution_param {
num_output: 20
kernel_size: 5
stride: 1
weight_filler {
type: "xavier"
}
bias_filler {
type: "constant"
}
}
}
layer {
name: "pool1"
type: "Pooling"
bottom: "conv1"
top: "pool1"
pooling_param {
pool: MAX
kernel_size: 2
stride: 2
}
}
layer {
name: "conv2"
type: "Convolution"
bottom: "pool1"
top: "conv2"
param {
lr_mult: 1
}
param {
lr_mult: 2
}
convolution_param {
num_output: 50
kernel_size: 5
stride: 1
weight_filler {
type: "xavier"
}
bias_filler {
type: "constant"
}
}
}
layer {
name: "pool2"
type: "Pooling"
bottom: "conv2"
top: "pool2"
pooling_param {
pool: MAX
kernel_size: 2
stride: 2
}
}
layer {
name: "ip1"
type: "InnerProduct"
bottom: "pool2"
top: "ip1"
param {
lr_mult: 1
}
param {
lr_mult: 2
}
inner_product_param {
num_output: 500
weight_filler {
type: "xavier"
}
bias_filler {
type: "constant"
}
}
}
layer {
name: "relu1"
type: "ReLU"
bottom: "ip1"
top: "ip1"
}
layer {
name: "ip2"
type: "InnerProduct"
bottom: "ip1"
top: "ip2"
param {
lr_mult: 1
}
param {
lr_mult: 2
}
inner_product_param {
num_output: 10
weight_filler {
type: "xavier"
}
bias_filler {
type: "constant"
}
}
}
layer {
name: "accuracy"
type: "Accuracy"
bottom: "ip2"
bottom: "label"
top: "accuracy"
include {
phase: TEST
}
}
layer {
name: "loss"
type: "SoftmaxWithLoss"
bottom: "ip2"
bottom: "label"
top: "loss"
}
I0423 16:16:02.210849 32732 net.cpp:59] Memory required for data: 0
I0423 16:16:02.210902 32732 layer_factory.hpp:74] Creating layer mnist
I0423 16:16:02.210933 32732 net.cpp:76] Creating Layer mnist
I0423 16:16:02.210947 32732 net.cpp:334] mnist -> data
I0423 16:16:02.210980 32732 net.cpp:334] mnist -> label
I0423 16:16:02.211007 32732 net.cpp:105] Setting up mnist
I0423 16:16:02.211069 32732 db.cpp:34] Opened lmdb /home/nazar/caffe/examples/mnist/mnist_test_lmdb
I0423 16:16:02.211102 32732 data_layer.cpp:67] output data size: 100,1,28,28
I0423 16:16:02.211189 32732 base_data_layer.cpp:43] Initializing prefetch
I0423 16:16:02.211231 32732 base_data_layer.cpp:45] Prefetch initialized.
I0423 16:16:02.211251 32732 net.cpp:112] Top shape: 100 1 28 28 (78400)
I0423 16:16:02.211258 32732 net.cpp:112] Top shape: 100 1 1 1 (100)
I0423 16:16:02.211264 32732 net.cpp:122] Memory required for data: 314000
I0423 16:16:02.211272 32732 layer_factory.hpp:74] Creating layer label_mnist_1_split
I0423 16:16:02.211298 32732 net.cpp:76] Creating Layer label_mnist_1_split
I0423 16:16:02.211309 32732 net.cpp:372] label_mnist_1_split <- label
I0423 16:16:02.211323 32732 net.cpp:334] label_mnist_1_split -> label_mnist_1_split_0
I0423 16:16:02.211343 32732 net.cpp:334] label_mnist_1_split -> label_mnist_1_split_1
I0423 16:16:02.211357 32732 net.cpp:105] Setting up label_mnist_1_split
I0423 16:16:02.211370 32732 net.cpp:112] Top shape: 100 1 1 1 (100)
I0423 16:16:02.211382 32732 net.cpp:112] Top shape: 100 1 1 1 (100)
I0423 16:16:02.211387 32732 net.cpp:122] Memory required for data: 314800
I0423 16:16:02.211393 32732 layer_factory.hpp:74] Creating layer conv1
I0423 16:16:02.211415 32732 net.cpp:76] Creating Layer conv1
I0423 16:16:02.211424 32732 net.cpp:372] conv1 <- data
I0423 16:16:02.211442 32732 net.cpp:334] conv1 -> conv1
I0423 16:16:02.211460 32732 net.cpp:105] Setting up conv1
I0423 16:16:02.211515 32732 net.cpp:112] Top shape: 100 20 24 24 (1152000)
I0423 16:16:02.211525 32732 net.cpp:122] Memory required for data: 4922800
I0423 16:16:02.211546 32732 layer_factory.hpp:74] Creating layer pool1
I0423 16:16:02.211563 32732 net.cpp:76] Creating Layer pool1
I0423 16:16:02.211572 32732 net.cpp:372] pool1 <- conv1
I0423 16:16:02.211588 32732 net.cpp:334] pool1 -> pool1
I0423 16:16:02.211603 32732 net.cpp:105] Setting up pool1
I0423 16:16:02.211616 32732 net.cpp:112] Top shape: 100 20 12 12 (288000)
I0423 16:16:02.211622 32732 net.cpp:122] Memory required for data: 6074800
I0423 16:16:02.211629 32732 layer_factory.hpp:74] Creating layer conv2
I0423 16:16:02.211647 32732 net.cpp:76] Creating Layer conv2
I0423 16:16:02.211658 32732 net.cpp:372] conv2 <- pool1
I0423 16:16:02.211674 32732 net.cpp:334] conv2 -> conv2
I0423 16:16:02.211690 32732 net.cpp:105] Setting up conv2
I0423 16:16:02.212716 32732 net.cpp:112] Top shape: 100 50 8 8 (320000)
I0423 16:16:02.212729 32732 net.cpp:122] Memory required for data: 7354800
I0423 16:16:02.212748 32732 layer_factory.hpp:74] Creating layer pool2
I0423 16:16:02.212764 32732 net.cpp:76] Creating Layer pool2
I0423 16:16:02.212774 32732 net.cpp:372] pool2 <- conv2
I0423 16:16:02.212790 32732 net.cpp:334] pool2 -> pool2
I0423 16:16:02.212803 32732 net.cpp:105] Setting up pool2
I0423 16:16:02.212815 32732 net.cpp:112] Top shape: 100 50 4 4 (80000)
I0423 16:16:02.212822 32732 net.cpp:122] Memory required for data: 7674800
I0423 16:16:02.212828 32732 layer_factory.hpp:74] Creating layer ip1
I0423 16:16:02.212844 32732 net.cpp:76] Creating Layer ip1
I0423 16:16:02.212853 32732 net.cpp:372] ip1 <- pool2
I0423 16:16:02.212868 32732 net.cpp:334] ip1 -> ip1
I0423 16:16:02.212887 32732 net.cpp:105] Setting up ip1
I0423 16:16:02.228613 32732 net.cpp:112] Top shape: 100 500 1 1 (50000)
I0423 16:16:02.228634 32732 net.cpp:122] Memory required for data: 7874800
I0423 16:16:02.228663 32732 layer_factory.hpp:74] Creating layer relu1
I0423 16:16:02.228695 32732 net.cpp:76] Creating Layer relu1
I0423 16:16:02.228714 32732 net.cpp:372] relu1 <- ip1
I0423 16:16:02.228754 32732 net.cpp:323] relu1 -> ip1 (in-place)
I0423 16:16:02.228778 32732 net.cpp:105] Setting up relu1
I0423 16:16:02.228786 32732 net.cpp:112] Top shape: 100 500 1 1 (50000)
I0423 16:16:02.228791 32732 net.cpp:122] Memory required for data: 8074800
I0423 16:16:02.228797 32732 layer_factory.hpp:74] Creating layer ip2
I0423 16:16:02.228819 32732 net.cpp:76] Creating Layer ip2
I0423 16:16:02.228837 32732 net.cpp:372] ip2 <- ip1
I0423 16:16:02.228863 32732 net.cpp:334] ip2 -> ip2
I0423 16:16:02.228880 32732 net.cpp:105] Setting up ip2
I0423 16:16:02.229104 32732 net.cpp:112] Top shape: 100 10 1 1 (1000)
I0423 16:16:02.229112 32732 net.cpp:122] Memory required for data: 8078800
I0423 16:16:02.229132 32732 layer_factory.hpp:74] Creating layer ip2_ip2_0_split
I0423 16:16:02.229146 32732 net.cpp:76] Creating Layer ip2_ip2_0_split
I0423 16:16:02.229154 32732 net.cpp:372] ip2_ip2_0_split <- ip2
I0423 16:16:02.229187 32732 net.cpp:334] ip2_ip2_0_split -> ip2_ip2_0_split_0
I0423 16:16:02.229202 32732 net.cpp:334] ip2_ip2_0_split -> ip2_ip2_0_split_1
I0423 16:16:02.229214 32732 net.cpp:105] Setting up ip2_ip2_0_split
I0423 16:16:02.229224 32732 net.cpp:112] Top shape: 100 10 1 1 (1000)
I0423 16:16:02.229231 32732 net.cpp:112] Top shape: 100 10 1 1 (1000)
I0423 16:16:02.229236 32732 net.cpp:122] Memory required for data: 8086800
I0423 16:16:02.229241 32732 layer_factory.hpp:74] Creating layer accuracy
I0423 16:16:02.229257 32732 net.cpp:76] Creating Layer accuracy
I0423 16:16:02.229272 32732 net.cpp:372] accuracy <- ip2_ip2_0_split_0
I0423 16:16:02.229285 32732 net.cpp:372] accuracy <- label_mnist_1_split_0
I0423 16:16:02.229298 32732 net.cpp:334] accuracy -> accuracy
I0423 16:16:02.229312 32732 net.cpp:105] Setting up accuracy
I0423 16:16:02.229323 32732 net.cpp:112] Top shape: 1 1 1 1 (1)
I0423 16:16:02.229329 32732 net.cpp:122] Memory required for data: 8086804
I0423 16:16:02.229336 32732 layer_factory.hpp:74] Creating layer loss
I0423 16:16:02.229346 32732 net.cpp:76] Creating Layer loss
I0423 16:16:02.229354 32732 net.cpp:372] loss <- ip2_ip2_0_split_1
I0423 16:16:02.229365 32732 net.cpp:372] loss <- label_mnist_1_split_1
I0423 16:16:02.229377 32732 net.cpp:334] loss -> loss
I0423 16:16:02.229389 32732 net.cpp:105] Setting up loss
I0423 16:16:02.229399 32732 layer_factory.hpp:74] Creating layer loss
I0423 16:16:02.229423 32732 net.cpp:112] Top shape: 1 1 1 1 (1)
I0423 16:16:02.229430 32732 net.cpp:118] with loss weight 1
I0423 16:16:02.229441 32732 net.cpp:122] Memory required for data: 8086808
I0423 16:16:02.229447 32732 net.cpp:163] loss needs backward computation.
I0423 16:16:02.229455 32732 net.cpp:165] accuracy does not need backward computation.
I0423 16:16:02.229462 32732 net.cpp:163] ip2_ip2_0_split needs backward computation.
I0423 16:16:02.229468 32732 net.cpp:163] ip2 needs backward computation.
I0423 16:16:02.229475 32732 net.cpp:163] relu1 needs backward computation.
I0423 16:16:02.229480 32732 net.cpp:163] ip1 needs backward computation.
I0423 16:16:02.229485 32732 net.cpp:163] pool2 needs backward computation.
I0423 16:16:02.229490 32732 net.cpp:163] conv2 needs backward computation.
I0423 16:16:02.229496 32732 net.cpp:163] pool1 needs backward computation.
I0423 16:16:02.229502 32732 net.cpp:163] conv1 needs backward computation.
I0423 16:16:02.229507 32732 net.cpp:165] label_mnist_1_split does not need backward computation.
I0423 16:16:02.229514 32732 net.cpp:165] mnist does not need backward computation.
I0423 16:16:02.229519 32732 net.cpp:201] This network produces output accuracy
I0423 16:16:02.229527 32732 net.cpp:201] This network produces output loss
I0423 16:16:02.229547 32732 net.cpp:446] Collecting Learning Rate and Weight Decay.
I0423 16:16:02.229558 32732 net.cpp:213] Network initialization done.
I0423 16:16:02.229563 32732 net.cpp:214] Memory required for data: 8086808
I0423 16:16:02.229625 32732 solver.cpp:42] Solver scaffolding done.
I0423 16:16:02.229656 32732 solver.cpp:222] Solving LeNet
I0423 16:16:02.229662 32732 solver.cpp:223] Learning Rate Policy: inv
I0423 16:16:02.229670 32732 solver.cpp:266] Iteration 0, Testing net (#0)
I0423 16:16:02.229679 32732 net.cpp:636] Copying source layer mnist
I0423 16:16:02.229686 32732 net.cpp:636] Copying source layer conv1
I0423 16:16:02.229697 32732 net.cpp:636] Copying source layer pool1
I0423 16:16:02.229703 32732 net.cpp:636] Copying source layer conv2
I0423 16:16:02.229709 32732 net.cpp:636] Copying source layer pool2
I0423 16:16:02.229715 32732 net.cpp:636] Copying source layer ip1
I0423 16:16:02.229723 32732 net.cpp:636] Copying source layer relu1
I0423 16:16:02.229732 32732 net.cpp:636] Copying source layer ip2
I0423 16:16:02.229737 32732 net.cpp:636] Copying source layer loss
I0423 16:16:03.277840 372 data_layer.cpp:153] Restarting data prefetching from start.
I0423 16:16:03.293973 32732 solver.cpp:315] Test net output #0: accuracy = 0.1136
I0423 16:16:03.294020 32732 solver.cpp:315] Test net output #1: loss = 2.3022 (* 1 = 2.3022 loss)
I0423 16:16:03.295140 32732 solver.cpp:189] Iteration 0, loss = 2.30559
I0423 16:16:03.295174 32732 solver.cpp:204] Train net output #0: loss = 2.30559 (* 1 = 2.30559 loss)
I0423 16:16:03.295197 32732 solver.cpp:470] Iteration 0, lr = 0.01
I0423 16:16:07.800014 32732 solver.cpp:189] Iteration 5000, loss = 87.3365
I0423 16:16:07.800060 32732 solver.cpp:204] Train net output #0: loss = 87.3365 (* 1 = 87.3365 loss)
I0423 16:16:07.800068 32732 solver.cpp:470] Iteration 5000, lr = 0.00995124
I0423 16:16:12.106976 32732 solver.cpp:266] Iteration 10000, Testing net (#0)
I0423 16:16:12.107017 32732 net.cpp:636] Copying source layer mnist
I0423 16:16:12.107024 32732 net.cpp:636] Copying source layer conv1
I0423 16:16:12.107031 32732 net.cpp:636] Copying source layer pool1
I0423 16:16:12.107036 32732 net.cpp:636] Copying source layer conv2
I0423 16:16:12.107043 32732 net.cpp:636] Copying source layer pool2
I0423 16:16:12.107046 32732 net.cpp:636] Copying source layer ip1
I0423 16:16:12.107061 32732 net.cpp:636] Copying source layer relu1
I0423 16:16:12.107066 32732 net.cpp:636] Copying source layer ip2
I0423 16:16:12.107072 32732 net.cpp:636] Copying source layer loss
I0423 16:16:12.986168 10882 data_layer.cpp:153] Restarting data prefetching from start.
I0423 16:16:13.002249 32732 solver.cpp:315] Test net output #0: accuracy = 0.1009
I0423 16:16:13.002285 32732 solver.cpp:315] Test net output #1: loss = 87.3365 (* 1 = 87.3365 loss)
I0423 16:16:13.002842 32732 solver.cpp:189] Iteration 10000, loss = 87.3365
I0423 16:16:13.002871 32732 solver.cpp:204] Train net output #0: loss = 87.3365 (* 1 = 87.3365 loss)
I0423 16:16:13.002882 32732 solver.cpp:470] Iteration 10000, lr = 0.00990296
I0423 16:16:17.338037 32732 solver.cpp:189] Iteration 15000, loss = 87.3365
I0423 16:16:17.338069 32732 solver.cpp:204] Train net output #0: loss = 87.3365 (* 1 = 87.3365 loss)
I0423 16:16:17.338079 32732 solver.cpp:470] Iteration 15000, lr = 0.00985515
I0423 16:16:21.653486 32732 solver.cpp:266] Iteration 20000, Testing net (#0)
I0423 16:16:21.653514 32732 net.cpp:636] Copying source layer mnist
I0423 16:16:21.653522 32732 net.cpp:636] Copying source layer conv1
I0423 16:16:21.653528 32732 net.cpp:636] Copying source layer pool1
I0423 16:16:21.653533 32732 net.cpp:636] Copying source layer conv2
I0423 16:16:21.653549 32732 net.cpp:636] Copying source layer pool2
I0423 16:16:21.653554 32732 net.cpp:636] Copying source layer ip1
I0423 16:16:21.653560 32732 net.cpp:636] Copying source layer relu1
I0423 16:16:21.653564 32732 net.cpp:636] Copying source layer ip2
I0423 16:16:21.653570 32732 net.cpp:636] Copying source layer loss
I0423 16:16:22.548477 21043 data_layer.cpp:153] Restarting data prefetching from start.
I0423 16:16:22.564523 32732 solver.cpp:315] Test net output #0: accuracy = 0.1009
I0423 16:16:22.564553 32732 solver.cpp:315] Test net output #1: loss = 87.3365 (* 1 = 87.3365 loss)
I0423 16:16:22.565117 32732 solver.cpp:189] Iteration 20000, loss = 87.3365
I0423 16:16:22.565146 32732 solver.cpp:204] Train net output #0: loss = 87.3365 (* 1 = 87.3365 loss)
I0423 16:16:22.565157 32732 solver.cpp:470] Iteration 20000, lr = 0.00980781
I0423 16:16:26.912025 32732 solver.cpp:189] Iteration 25000, loss = 87.3365
I0423 16:16:26.912066 32732 solver.cpp:204] Train net output #0: loss = 87.3365 (* 1 = 87.3365 loss)
I0423 16:16:26.912077 32732 solver.cpp:470] Iteration 25000, lr = 0.00976092
I0423 16:16:31.390297 32732 solver.cpp:266] Iteration 30000, Testing net (#0)
I0423 16:16:31.390334 32732 net.cpp:636] Copying source layer mnist
I0423 16:16:31.390341 32732 net.cpp:636] Copying source layer conv1
I0423 16:16:31.390348 32732 net.cpp:636] Copying source layer pool1
I0423 16:16:31.390353 32732 net.cpp:636] Copying source layer conv2
I0423 16:16:31.390358 32732 net.cpp:636] Copying source layer pool2
I0423 16:16:31.390363 32732 net.cpp:636] Copying source layer ip1
I0423 16:16:31.390368 32732 net.cpp:636] Copying source layer relu1
I0423 16:16:31.390383 32732 net.cpp:636] Copying source layer ip2
I0423 16:16:31.390389 32732 net.cpp:636] Copying source layer loss
I0423 16:16:32.264202 31148 data_layer.cpp:153] Restarting data prefetching from start.
I0423 16:16:32.280170 32732 solver.cpp:315] Test net output #0: accuracy = 0.1009
I0423 16:16:32.280194 32732 solver.cpp:315] Test net output #1: loss = 87.3365 (* 1 = 87.3365 loss)
I0423 16:16:32.280740 32732 solver.cpp:189] Iteration 30000, loss = 87.3365
I0423 16:16:32.280756 32732 solver.cpp:204] Train net output #0: loss = 87.3365 (* 1 = 87.3365 loss)
I0423 16:16:32.280766 32732 solver.cpp:470] Iteration 30000, lr = 0.00971448

@nazarblch (Author)

Solved by decreasing base_lr to 0.0005.
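
The corresponding solver edit, as a minimal sketch against the lenet_solver.prototxt dumped above (all other parameters unchanged):

base_lr: 0.0005 # was 0.01; with batch_size: 1 each step follows a single noisy gradient, so a much smaller step keeps updates stable

With a batch of one, the gradient estimate has roughly 64x the variance of the stock batch of 64, which is why the stock rate diverges in the log above (the training loss sticks at 87.3365).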

@shelhamer (Member)

Accumulating gradients (#1977) can be important for controlling the effect of reducing the batch size. Increasing the momentum can be effective too. Please ask modeling questions on the caffe-users group.
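
A minimal sketch of the accumulation route, assuming a Caffe build that includes #1977: set iter_size in the solver so the gradients of several single-example passes are summed before each weight update, restoring the stock effective batch size:

iter_size: 64 # accumulate 64 forward/backward passes per update: effective batch = batch_size * iter_size
base_lr: 0.01 # at the stock effective batch size, the stock rate is reasonable again

Each solver iteration then runs iter_size forward/backward passes on batch_size: 1 data before applying a single parameter update.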
