run vgg_16_cifar wrong. #63

zhuyong0000 · 2016-09-10T09:02:17Z

Running ./train.sh produces the following error:
I0910 08:11:05.670004 1881 GradientMachine.cpp:134] Initing parameters..
I0910 08:11:06.422868 1881 GradientMachine.cpp:141] Init parameters done.
/usr/local/bin/paddle: line 46: 1881 Killed ${DEBUGGER} $MYDIR/../opt/paddle/bin/paddle_trainer ${@:2}
No data to plot. Exiting!

The environment is paddledev/paddle:cpu-demo-latest, and the Paddle version is:
PaddlePaddle 0.8.0b, compiled with
with_avx: OFF
with_gpu: OFF
with_double: OFF
with_python: ON
with_rdma: OFF
with_glog: ON
with_gflags: ON
with_metric_learning:
with_timer: OFF
with_predict_sdk:

sss534534 · 2016-09-10T10:16:33Z

This may be an out-of-memory kill. Check /var/log/messages.
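One way to check the system log for this is sketched below. It assumes the kernel OOM killer wrote a line containing "Out of memory: Kill process" or "Out of memory: Killed process"; the exact wording varies between kernel versions, and the sample line is illustrative, not taken from this report. The function name `find_oom_kills` is hypothetical.

```python
import re

# Assumed wording of the kernel OOM-killer message; older kernels print
# "Kill process", newer ones "Killed process". Adjust if your log differs.
OOM_PATTERN = re.compile(
    r"Out of memory: Kill(?:ed)? process (?P<pid>\d+) \((?P<name>[^)]+)\)"
)


def find_oom_kills(log_text):
    """Return (pid, process_name) for every OOM-killer line in log_text."""
    return [
        (int(m.group("pid")), m.group("name"))
        for m in OOM_PATTERN.finditer(log_text)
    ]


# Illustrative log line (not from the original report).
sample = (
    "Sep 10 08:11:07 host kernel: Out of memory: Kill process 1881 "
    "(paddle_trainer) score 900 or sacrifice child\n"
)
for pid, name in find_oom_kills(sample):
    print(pid, name)
```

To scan a real log, read the file contents (for example /var/log/messages, which usually needs root access) and pass the text to `find_oom_kills`. A match on the trainer's PID (1881 in the log above) would support the out-of-memory explanation.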

reyoung · 2016-09-12T02:05:06Z

The log says "Killed", so the process most likely ran out of memory. Please try the following:

set a smaller batch_size
set a smaller pool size for the dataprovider

We will write documentation on how to train the job with less memory.
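To see why the pool size matters, here is a rough lower-bound estimate of the memory needed just to hold buffered CIFAR samples. It assumes each image is stored as a dense vector of 4-byte floats (the build above reports with_double: OFF) and ignores Python object overhead, model parameters, and activations, so real usage will be higher. The function name and the pool sizes are hypothetical, chosen only for illustration.

```python
def estimate_buffer_bytes(num_samples, values_per_sample, bytes_per_value=4):
    """Lower bound, in bytes, for num_samples dense samples held in memory."""
    return num_samples * values_per_sample * bytes_per_value


# One CIFAR image: 3 color channels of 32x32 pixels.
CIFAR_VALUES = 3 * 32 * 32

# Hypothetical pool sizes, from "whole training set" down to a small buffer.
for pool_size in (50000, 5000, 500):
    mib = estimate_buffer_bytes(pool_size, CIFAR_VALUES) / 2**20
    print(f"pool_size={pool_size}: at least {mib:.1f} MiB")
```

The same arithmetic applies to batch_size: the input data held per batch scales linearly with it, and so do the intermediate activations the network keeps for the backward pass.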

zhangscth · 2016-10-04T15:00:19Z

I ran into the same problem.

reyoung · 2016-10-08T08:13:10Z

@zhangscth https://github.com/baidu/Paddle/pull/128/files#diff-d718076d937b4f0e340765bf95c122c3

If the process was killed, look into how to reduce memory usage. The documentation for this is still being written.

qingqing01 · 2016-10-24T13:54:59Z

@zhangscth @zhuyong0000 Documentation on how to reduce memory usage: http://www.paddlepaddle.org/doc_cn/faq/index.html#id1

In this demo, you can try reducing the DataProvider cache or reducing the batch size.


reyoung changed the title ~~运行vgg_16_cifar 错误~~ run vgg_16_cifar wrong. Sep 12, 2016

qingqing01 mentioned this issue Sep 12, 2016

demo/sentiment$ ./train.sh error #64

Closed

reyoung added the question label Sep 13, 2016

reyoung closed this as completed Nov 29, 2016

qingqing01 pushed a commit to qingqing01/Paddle that referenced this issue Apr 30, 2020

Merge pull request PaddlePaddle#63 from LielinJiang/fix-test-model

61e218d

Fix unittest for predict and evaluate

DemoMoon mentioned this issue Mar 24, 2021

oneDNN 如何能提升DeepSpeech的语音处理性能 #31838

Closed

thisjiang pushed a commit to thisjiang/Paddle that referenced this issue Oct 28, 2021

Merge pull request PaddlePaddle#63 from Superjomn/refactor/as

3f0371a

refactor Object.As to as to remind it lack type check

wangxicoding pushed a commit to wangxicoding/Paddle that referenced this issue Dec 9, 2021

Update express_ner example (PaddlePaddle#63)

c3c3462

* Update express_ner example * update run_bigru_crf * fix msra_ner example

zhoutianzi666 pushed a commit to zhoutianzi666/Paddle that referenced this issue May 23, 2022

Merge pull request PaddlePaddle#63 from jiweibo/lib_download

fa6ff3b

add download lib doc

AnnaTrainingG pushed a commit to AnnaTrainingG/Paddle that referenced this issue Sep 19, 2022

Change docs to docs/en_US and docs/zh_CN (PaddlePaddle#63)

1b84c4f

* Update README * Change docs dir to en_US and zh_CN * Add paper reference * Fix link

zmxdream pushed a commit to zmxdream/Paddle that referenced this issue Oct 10, 2023

Merge pull request PaddlePaddle#63 from brightnesss/paddlebox

238e72e

expand slot's feasign for cvr model in fused_cvm_op

lizexu123 pushed a commit to lizexu123/Paddle that referenced this issue Feb 23, 2024

add one-shot nas (PaddlePaddle#63)

e269396

Fridge003 pushed a commit to Fridge003/Paddle that referenced this issue Mar 13, 2024

Merge pull request PaddlePaddle#63 from Fridge003/cinn_lower

98e5195

Implement iterator vars fetching in ReduceOp

Galaxy1458 added a commit that referenced this issue Apr 24, 2024

Revert "【Hackathon 6th Fundable Projects 3 No.199】fluid operator l1_n…

120c96c

…orm (#63…" This reverts commit 5f6e9d4.

SigureMo added a commit that referenced this issue Apr 26, 2024

Revert "[pybind] update py::exception<>::operator() to `py::set_err…

2560030

…or` (#63…" This reverts commit 71fd732.

zmxdream pushed a commit to zmxdream/Paddle that referenced this issue May 16, 2024

Merge pull request PaddlePaddle#63 from paddlebox-xpu/xpu_check_nan_inf

e0f1c18

xpu support check_nan_inf
