Error: Error reading from file '.' #480
Comments
And another question: when I check the directory /media/wangxiuwan/tmp, there is nothing in it. Clearly I have misunderstood something. Can you explain it to me?
Hi, I would indeed guess that the error is connected to temporary space, so changing to a different folder would be my suggestion. Also, maybe update to current master of marian-dev. That should be version 1.7.8. There might be better error reporting.
@emjotde Thank you very much. I have updated my Marian version to marian-dev 1.7.8, and training started normally and has run for 6 hours. If the error "Error reading from file '.'" is thrown again, I will contact you again.
Great. I am closing this issue then. Feel free to re-open if you still have problems.
But do we know where this comes from? Are we failing to detect errors while writing to the temp file? Then that's a bug.
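As an illustration of the kind of check in question, here is a minimal sketch (hypothetical `writeLine` helper and temp-file path, not Marian's actual writer) that verifies the stream state after each write and again after the final flush, since buffered output can fail only at flush time:

```cpp
#include <cerrno>
#include <cstring>
#include <fstream>
#include <iostream>
#include <string>

// Hypothetical helper: write one line and surface any failure immediately.
bool writeLine(std::ofstream& out, const std::string& line) {
  out << line << '\n';
  if (out.fail()) {  // failbit/badbit set, e.g. after ENOSPC on a full disk
    std::cerr << "write failed: " << std::strerror(errno) << '\n';
    return false;
  }
  return true;
}

int main() {
  std::ofstream out("/tmp/shuffle.tmp");  // hypothetical temp-file name
  if (!writeLine(out, "a shuffled sentence pair")) return 1;
  out.flush();                // buffered data may only hit the disk here,
  return out.fail() ? 1 : 0; // so the state must be checked again
}
```

If errors like this are never checked, a full disk silently truncates the temp file, and the reader later sees a chopped last line.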
We changed quite a lot in error reporting, error bits, and stream handling between the version that was used and current master. I would suspect a mix of user error, like too little temp space, and the bad error-reporting behavior of the older Marian version in that case. I would consider this closed unless we get information that there is a bug. The version used was from December last year:
@emjotde Hi, after 6 hours, the error was thrown again.
(base) work@dbcloud-Super-Server:/tmp$ free -h
…
train.log: … [CALL STACK] …
Can we first establish where the “error reading from” comes from? I suspect it is reading a chopped file where the last line does not end in a newline character, and (hopefully) the code keeps reading until it finds a newline. We may need to temporarily change the code to not delete the tmp file, so that we can inspect it.
Then we should establish why the file is truncated. Is the disk full during write, but we don’t catch that, or is it some strange OS-level caching that messes things up (Windows used to cache data written to network and sometimes failed to flush the cache, causing corrupt files; maybe Linux or your version of drivers does some nasty things like this as well).
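To make the truncation hypothesis concrete, here is a minimal sketch using standard `std::getline` semantics (Marian's own `io::getline` in file_stream.h may behave differently): a final line without a trailing newline is still returned once, and only the stream state afterwards tells a clean EOF apart from a genuine read error:

```cpp
#include <fstream>
#include <iostream>
#include <string>

int main() {
  std::ifstream in("/tmp/corpus.tmp");  // hypothetical temp file
  std::string line;
  while (std::getline(in, line)) {
    // process line; this runs even if the final line lacks '\n'
  }
  if (in.bad())       // low-level I/O error: should be reported loudly
    std::cerr << "real I/O error while reading\n";
  else if (in.eof())  // normal end of file
    std::cerr << "clean EOF\n";
  return in.bad() ? 1 : 0;
}
```

A reader that treats the eofbit on a chopped file the same as a bad() error would produce exactly an "Error reading from file" message.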
@frankseide I think what you said is very reasonable. Let me try. Can you tell me where the code that deletes temp files is and what change I need to make? I have searched globally in Marian's code, but I am not sure where to make changes.
@frankseide Maybe you can tell me what the rule for the terminator is: how do I check whether the last line ends in a newline character? I can check my corpus; is there a sentence pair with no terminator?
The file should end in an LF character, ASCII code 10, hex 0x0a.
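For a quick check, `tail -c 1 FILE | od -An -tx1` prints the last byte in hex; the small sketch below does the same in C++:

```cpp
#include <fstream>
#include <iostream>

// Sketch: inspect the final byte of a file and verify it is LF (0x0a).
int main(int argc, char** argv) {
  if (argc != 2) { std::cerr << "usage: lastbyte FILE\n"; return 2; }
  std::ifstream in(argv[1], std::ios::binary);
  in.seekg(-1, std::ios::end);  // jump to the final byte
  int c = in.get();
  if (c == EOF) { std::cerr << "empty or unreadable file\n"; return 2; }
  std::cout << "last byte: 0x" << std::hex << c << '\n';
  return c == 0x0a ? 0 : 1;     // 0x0a is '\n'
}
```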
@frankseide Thank you very much. You said that we may need to temporarily change the code to not delete the tmp file, so that we can inspect it. I would like to ask where to make that change. Is it one of the training parameters:
Hi again. Interesting.
A comment on the first question: the …
Let's ignore …
@emjotde @snukky Thank you both for your replies.
mkdir build
(base) work@dbcloud-Super-Server:~$ df -h /tmp
…
OK, this is a lot of space. So it should not be a space problem.
We made this the default yesterday, but you might not have that version yet. This will compile with function names, and the stack trace should be more informative.
In any case, running out of space should have triggered ENOSPC on write, assuming proper error checking.
"assuming proper error checking", well, that's a strong assumption :) |
One bug is that we do not set the file path in the input stream when handing in a temporary file. That at least explains why the error message says '.' (the default Pathie path) instead of a proper temporary filename from …
I'll add an option later today to keep temporary files and fix the name issue. Will let you know when it's ready to try.
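A sketch of the name fix being described (hypothetical `NamedInputStream` wrapper, not the actual `marian::io::InputFileStream`): carry the path along with the stream so error messages can name the real file instead of '.':

```cpp
#include <fstream>
#include <stdexcept>
#include <string>

// Hypothetical wrapper that remembers its path so failures can report
// the actual temp-file name instead of the default '.'.
class NamedInputStream {
public:
  explicit NamedInputStream(const std::string& path)
      : path_(path), in_(path) {
    if (!in_)
      throw std::runtime_error("Error opening file '" + path_ + "'");
  }

  bool getline(std::string& line) {
    if (std::getline(in_, line)) return true;
    if (in_.bad())  // real I/O error: include the remembered path
      throw std::runtime_error("Error reading from file '" + path_ + "'");
    return false;   // clean EOF
  }

private:
  std::string path_;
  std::ifstream in_;
};
```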
Branch …
@emjotde Hi, up to now the training is still going on and no error has been reported. It seems that the error is really related to temp files. When the training is completed, I will try the cmake method (adding -DCMAKE_BUILD_TYPE=Release) and the … branch.
Looks quite normal to me. That's a lot of iterations; I would not expect the score to improve. Do you have reason to believe that the results are bad?
@emjotde About the training log: we trained transformer models with both Marian and tensor2tensor on the same corpus. However, the max BLEU of Marian is 32 while the max BLEU of tensor2tensor is 41, so I believe the Marian training results are bad. The following is the training curve of tensor2tensor.
Closing this now due to inactivity (ours). Feel free to reopen. Usually we have no problems matching T2T performance, so no idea where that would come from. On the other hand, there were a few bugs in the marian-dev code around that time; maybe it has solved itself.
Hi, I had no problems with Marian training before (Chinese-English), but recently I changed to a larger training corpus (train.bpe.zh: 18G, train.bpe.en: 20G, 38G total), which keeps throwing this error irregularly during training. Why? And what should I do to train on this corpus normally? Thank you very much.
During training, free -h:
              total        used        free      shared  buff/cache   available
Mem:           125G         53G        2.6G         48M         69G         71G
Swap:          3.8G        2.9G        925M
nvidia-smi:
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 396.54 Driver Version: 396.54 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
|===============================+======================+======================|
| 0 GeForce GTX 108... Off | 00000000:02:00.0 Off | N/A |
| 59% 80C P2 268W / 250W | 7205MiB / 11178MiB | 78% Default |
+-------------------------------+----------------------+----------------------+
| 1 GeForce GTX 108... Off | 00000000:03:00.0 Off | N/A |
| 64% 83C P2 227W / 250W | 7205MiB / 11178MiB | 70% Default |
+-------------------------------+----------------------+----------------------+
| 2 GeForce GTX 108... Off | 00000000:82:00.0 Off | N/A |
| 60% 81C P2 217W / 250W | 7205MiB / 11178MiB | 97% Default |
+-------------------------------+----------------------+----------------------+
| 3 GeForce GTX 108... Off | 00000000:83:00.0 Off | N/A |
| 66% 83C P2 294W / 250W | 7205MiB / 11178MiB | 79% Default |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes: GPU Memory |
| GPU PID Type Process name Usage |
|=============================================================================|
| 0 10283 C /media/wangxiuwan/marian/build/marian 7195MiB |
| 1 10283 C /media/wangxiuwan/marian/build/marian 7195MiB |
| 2 10283 C /media/wangxiuwan/marian/build/marian 7195MiB |
| 3 10283 C /media/wangxiuwan/marian/build/marian 7195MiB |
+-----------------------------------------------------------------------------+
train.log:
[2019-08-14 10:55:34] [marian] Marian v1.7.6 02f4af4 2018-12-12 18:51:10 -0800
[2019-08-14 10:55:34] [marian] Running on dbcloud-Super-Server as process 31002 with command line:
[2019-08-14 10:55:34] [marian] /media/wangxiuwan/marian/build/marian --model /media/wangxiuwan/marian/examples/transformer/back_dataset/model_zhen/model.npz --type transformer --pretrained-model /media/wangxiuwan/marian/examples/transformer/back_dataset/model_zhen/model.npz --train-sets /media/tmxmall/marian_nmt/general.gen.back.0807/middle/train.bpe.zh /media/tmxmall/marian_nmt/general.gen.back.0807/middle/train.bpe.en --max-length 100 --vocabs /media/wangxiuwan/marian/examples/transformer/back_dataset/model_vocab_big/vocab.zh.yml /media/wangxiuwan/marian/examples/transformer/back_dataset/model_vocab_big/vocab.en.yml --mini-batch-fit -w 6000 --maxi-batch 1000 --early-stopping 40 --cost-type=ce-mean-words --valid-freq 5000 --save-freq 5000 --disp-freq 1000 --valid-metrics ce-mean-words perplexity translation --valid-sets /media/tmxmall/marian_nmt/general.gen.back.0807/middle/valid.bpe.zh /media/tmxmall/marian_nmt/general.gen.back.0807/middle/valid.bpe.en --valid-script-path 'bash /media/wangxiuwan/marian/examples/transformer/back_dataset/scripts/validate_zhen.sh' --valid-translation-output /media/wangxiuwan/marian/examples/transformer/back_dataset/tmxmall_valid_data/valid.en.output --quiet-translation --valid-mini-batch 16 --beam-size 6 --normalize 0.6 --overwrite --keep-best --log /media/wangxiuwan/marian/examples/transformer/back_dataset/model_zhen/train.log --valid-log /media/wangxiuwan/marian/examples/transformer/back_dataset/model_zhen/valid.log --enc-depth 6 --dec-depth 6 --transformer-heads 8 --transformer-postprocess-emb d --transformer-postprocess dan --transformer-dropout 0.1 --label-smoothing 0.1 --learn-rate 0.0003 --lr-warmup 16000 --lr-decay-inv-sqrt 16000 --lr-report --optimizer-params 0.9 0.98 1e-09 --clip-norm 5 --tied-embeddings --devices 0 1 2 3 --sync-sgd --seed 1111 --exponential-smoothing
[2019-08-14 10:55:34] [config] after-batches: 0
[2019-08-14 10:55:34] [config] after-epochs: 0
[2019-08-14 10:55:34] [config] allow-unk: false
[2019-08-14 10:55:34] [config] beam-size: 6
[2019-08-14 10:55:34] [config] best-deep: false
[2019-08-14 10:55:34] [config] clip-gemm: 0
[2019-08-14 10:55:34] [config] clip-norm: 5
[2019-08-14 10:55:34] [config] cost-type: ce-mean-words
[2019-08-14 10:55:34] [config] cpu-threads: 0
[2019-08-14 10:55:34] [config] data-weighting-type: sentence
[2019-08-14 10:55:34] [config] dec-cell: gru
[2019-08-14 10:55:34] [config] dec-cell-base-depth: 2
[2019-08-14 10:55:34] [config] dec-cell-high-depth: 1
[2019-08-14 10:55:34] [config] dec-depth: 6
[2019-08-14 10:55:34] [config] devices:
[2019-08-14 10:55:34] [config] - 0
[2019-08-14 10:55:34] [config] - 1
[2019-08-14 10:55:34] [config] - 2
[2019-08-14 10:55:34] [config] - 3
[2019-08-14 10:55:34] [config] dim-emb: 512
[2019-08-14 10:55:34] [config] dim-rnn: 1024
[2019-08-14 10:55:34] [config] dim-vocabs:
[2019-08-14 10:55:34] [config] - 36000
[2019-08-14 10:55:34] [config] - 34366
[2019-08-14 10:55:34] [config] disp-first: 0
[2019-08-14 10:55:34] [config] disp-freq: 1000
[2019-08-14 10:55:34] [config] disp-label-counts: false
[2019-08-14 10:55:34] [config] dropout-rnn: 0
[2019-08-14 10:55:34] [config] dropout-src: 0
[2019-08-14 10:55:34] [config] dropout-trg: 0
[2019-08-14 10:55:34] [config] early-stopping: 40
[2019-08-14 10:55:34] [config] embedding-fix-src: false
[2019-08-14 10:55:34] [config] embedding-fix-trg: false
[2019-08-14 10:55:34] [config] embedding-normalization: false
[2019-08-14 10:55:34] [config] enc-cell: gru
[2019-08-14 10:55:34] [config] enc-cell-depth: 1
[2019-08-14 10:55:34] [config] enc-depth: 6
[2019-08-14 10:55:34] [config] enc-type: bidirectional
[2019-08-14 10:55:34] [config] exponential-smoothing: 0.0001
[2019-08-14 10:55:34] [config] grad-dropping-momentum: 0
[2019-08-14 10:55:34] [config] grad-dropping-rate: 0
[2019-08-14 10:55:34] [config] grad-dropping-warmup: 100
[2019-08-14 10:55:34] [config] guided-alignment: none
[2019-08-14 10:55:34] [config] guided-alignment-cost: mse
[2019-08-14 10:55:34] [config] guided-alignment-weight: 0.1
[2019-08-14 10:55:34] [config] ignore-model-config: false
[2019-08-14 10:55:34] [config] interpolate-env-vars: false
[2019-08-14 10:55:34] [config] keep-best: true
[2019-08-14 10:55:34] [config] label-smoothing: 0.1
[2019-08-14 10:55:34] [config] layer-normalization: false
[2019-08-14 10:55:34] [config] learn-rate: 0.0003
[2019-08-14 10:55:34] [config] log: /media/wangxiuwan/marian/examples/transformer/back_dataset/model_zhen/train.log
[2019-08-14 10:55:34] [config] log-level: info
[2019-08-14 10:55:34] [config] lr-decay: 0
[2019-08-14 10:55:34] [config] lr-decay-freq: 50000
[2019-08-14 10:55:34] [config] lr-decay-inv-sqrt: 16000
[2019-08-14 10:55:34] [config] lr-decay-repeat-warmup: false
[2019-08-14 10:55:34] [config] lr-decay-reset-optimizer: false
[2019-08-14 10:55:34] [config] lr-decay-start:
[2019-08-14 10:55:34] [config] - 10
[2019-08-14 10:55:34] [config] - 1
[2019-08-14 10:55:34] [config] lr-decay-strategy: epoch+stalled
[2019-08-14 10:55:34] [config] lr-report: true
[2019-08-14 10:55:34] [config] lr-warmup: 16000
[2019-08-14 10:55:34] [config] lr-warmup-at-reload: false
[2019-08-14 10:55:34] [config] lr-warmup-cycle: false
[2019-08-14 10:55:34] [config] lr-warmup-start-rate: 0
[2019-08-14 10:55:34] [config] max-length: 100
[2019-08-14 10:55:34] [config] max-length-crop: false
[2019-08-14 10:55:34] [config] max-length-factor: 3
[2019-08-14 10:55:34] [config] maxi-batch: 1000
[2019-08-14 10:55:34] [config] maxi-batch-sort: trg
[2019-08-14 10:55:34] [config] mini-batch: 64
[2019-08-14 10:55:34] [config] mini-batch-fit: true
[2019-08-14 10:55:34] [config] mini-batch-fit-step: 10
[2019-08-14 10:55:34] [config] mini-batch-words: 0
[2019-08-14 10:55:34] [config] model: /media/wangxiuwan/marian/examples/transformer/back_dataset/model_zhen/model.npz
[2019-08-14 10:55:34] [config] multi-node: false
[2019-08-14 10:55:34] [config] multi-node-overlap: true
[2019-08-14 10:55:34] [config] n-best: false
[2019-08-14 10:55:34] [config] no-nccl: false
[2019-08-14 10:55:34] [config] no-reload: false
[2019-08-14 10:55:34] [config] no-restore-corpus: false
[2019-08-14 10:55:34] [config] no-shuffle: false
[2019-08-14 10:55:34] [config] normalize: 0.6
[2019-08-14 10:55:34] [config] optimizer: adam
[2019-08-14 10:55:34] [config] optimizer-delay: 1
[2019-08-14 10:55:34] [config] optimizer-params:
[2019-08-14 10:55:34] [config] - 0.9
[2019-08-14 10:55:34] [config] - 0.98
[2019-08-14 10:55:34] [config] - 1e-09
[2019-08-14 10:55:34] [config] overwrite: true
[2019-08-14 10:55:34] [config] pretrained-model: /media/wangxiuwan/marian/examples/transformer/back_dataset/model_zhen/model.npz
[2019-08-14 10:55:34] [config] quiet: false
[2019-08-14 10:55:34] [config] quiet-translation: true
[2019-08-14 10:55:34] [config] relative-paths: false
[2019-08-14 10:55:34] [config] right-left: false
[2019-08-14 10:55:34] [config] save-freq: 5000
[2019-08-14 10:55:34] [config] seed: 1111
[2019-08-14 10:55:34] [config] shuffle-in-ram: false
[2019-08-14 10:55:34] [config] skip: false
[2019-08-14 10:55:34] [config] sqlite: ""
[2019-08-14 10:55:34] [config] sqlite-drop: false
[2019-08-14 10:55:34] [config] sync-sgd: true
[2019-08-14 10:55:34] [config] tempdir: /tmp
[2019-08-14 10:55:34] [config] tied-embeddings: true
[2019-08-14 10:55:34] [config] tied-embeddings-all: false
[2019-08-14 10:55:34] [config] tied-embeddings-src: false
[2019-08-14 10:55:34] [config] train-sets:
[2019-08-14 10:55:34] [config] - /media/tmxmall/marian_nmt/general.gen.back.0807/middle/train.bpe.zh
[2019-08-14 10:55:34] [config] - /media/tmxmall/marian_nmt/general.gen.back.0807/middle/train.bpe.en
[2019-08-14 10:55:34] [config] transformer-aan-activation: swish
[2019-08-14 10:55:34] [config] transformer-aan-depth: 2
[2019-08-14 10:55:34] [config] transformer-aan-nogate: false
[2019-08-14 10:55:34] [config] transformer-decoder-autoreg: self-attention
[2019-08-14 10:55:34] [config] transformer-dim-aan: 2048
[2019-08-14 10:55:34] [config] transformer-dim-ffn: 2048
[2019-08-14 10:55:34] [config] transformer-dropout: 0.1
[2019-08-14 10:55:34] [config] transformer-dropout-attention: 0
[2019-08-14 10:55:34] [config] transformer-dropout-ffn: 0
[2019-08-14 10:55:34] [config] transformer-ffn-activation: swish
[2019-08-14 10:55:34] [config] transformer-ffn-depth: 2
[2019-08-14 10:55:34] [config] transformer-guided-alignment-layer: last
[2019-08-14 10:55:34] [config] transformer-heads: 8
[2019-08-14 10:55:34] [config] transformer-no-projection: false
[2019-08-14 10:55:34] [config] transformer-postprocess: dan
[2019-08-14 10:55:34] [config] transformer-postprocess-emb: d
[2019-08-14 10:55:34] [config] transformer-preprocess: ""
[2019-08-14 10:55:34] [config] transformer-tied-layers:
[2019-08-14 10:55:34] [config] []
[2019-08-14 10:55:34] [config] type: transformer
[2019-08-14 10:55:34] [config] ulr: false
[2019-08-14 10:55:34] [config] ulr-dim-emb: 0
[2019-08-14 10:55:34] [config] ulr-dropout: 0
[2019-08-14 10:55:34] [config] ulr-keys-vectors: ""
[2019-08-14 10:55:34] [config] ulr-query-vectors: ""
[2019-08-14 10:55:34] [config] ulr-softmax-temperature: 1
[2019-08-14 10:55:34] [config] ulr-trainable-transformation: false
[2019-08-14 10:55:34] [config] valid-freq: 5000
[2019-08-14 10:55:34] [config] valid-log: /media/wangxiuwan/marian/examples/transformer/back_dataset/model_zhen/valid.log
[2019-08-14 10:55:34] [config] valid-max-length: 1000
[2019-08-14 10:55:34] [config] valid-metrics:
[2019-08-14 10:55:34] [config] - ce-mean-words
[2019-08-14 10:55:34] [config] - perplexity
[2019-08-14 10:55:34] [config] - translation
[2019-08-14 10:55:34] [config] valid-mini-batch: 16
[2019-08-14 10:55:34] [config] valid-script-path: bash /media/wangxiuwan/marian/examples/transformer/back_dataset/scripts/validate_zhen.sh
[2019-08-14 10:55:34] [config] valid-sets:
[2019-08-14 10:55:34] [config] - /media/tmxmall/marian_nmt/general.gen.back.0807/middle/valid.bpe.zh
[2019-08-14 10:55:34] [config] - /media/tmxmall/marian_nmt/general.gen.back.0807/middle/valid.bpe.en
[2019-08-14 10:55:34] [config] valid-translation-output: /media/wangxiuwan/marian/examples/transformer/back_dataset/tmxmall_valid_data/valid.en.output
[2019-08-14 10:55:34] [config] version: v1.7.6 02f4af4 2018-12-12 18:51:10 -0800
[2019-08-14 10:55:34] [config] vocabs:
[2019-08-14 10:55:34] [config] - /media/wangxiuwan/marian/examples/transformer/back_dataset/model_vocab_big/vocab.zh.yml
[2019-08-14 10:55:34] [config] - /media/wangxiuwan/marian/examples/transformer/back_dataset/model_vocab_big/vocab.en.yml
[2019-08-14 10:55:34] [config] word-penalty: 0
[2019-08-14 10:55:34] [config] workspace: 6000
[2019-08-14 10:55:34] [config] Loaded model has been created with Marian v1.7.6 02f4af4 2018-12-12 18:51:10 -0800
[2019-08-14 10:55:34] Using synchronous training
[2019-08-14 10:55:34] [data] Loading vocabulary from JSON/Yaml file /media/wangxiuwan/marian/examples/transformer/back_dataset/model_vocab_big/vocab.zh.yml
[2019-08-14 10:55:34] [data] Setting vocabulary size for input 0 to 36000
[2019-08-14 10:55:34] [data] Loading vocabulary from JSON/Yaml file /media/wangxiuwan/marian/examples/transformer/back_dataset/model_vocab_big/vocab.en.yml
[2019-08-14 10:55:34] [data] Setting vocabulary size for input 1 to 34366
[2019-08-14 10:55:34] [batching] Collecting statistics for batch fitting with step size 10
[2019-08-14 10:55:34] Compiled without MPI support. Falling back to FakeMPIWrapper
[2019-08-14 10:55:36] [memory] Extending reserved space to 6016 MB (device gpu0)
[2019-08-14 10:55:36] [memory] Extending reserved space to 6016 MB (device gpu1)
[2019-08-14 10:55:37] [memory] Extending reserved space to 6016 MB (device gpu2)
[2019-08-14 10:55:37] [memory] Extending reserved space to 6016 MB (device gpu3)
[2019-08-14 10:55:37] [comm] Using NCCL 2.3.7 for GPU communication
[2019-08-14 10:55:37] [memory] Reserving 305 MB, device gpu0
[2019-08-14 10:55:38] [memory] Reserving 305 MB, device gpu0
[2019-08-14 10:55:46] [batching] Done
[2019-08-14 10:55:47] [memory] Extending reserved space to 6016 MB (device gpu0)
[2019-08-14 10:55:47] [memory] Extending reserved space to 6016 MB (device gpu1)
[2019-08-14 10:55:47] [memory] Extending reserved space to 6016 MB (device gpu2)
[2019-08-14 10:55:47] [memory] Extending reserved space to 6016 MB (device gpu3)
[2019-08-14 10:55:47] [comm] Using NCCL 2.3.7 for GPU communication
[2019-08-14 10:55:47] Loading model from /media/wangxiuwan/marian/examples/transformer/back_dataset/model_zhen/model.npz.orig.npz
[2019-08-14 10:55:47] Loading model from /media/wangxiuwan/marian/examples/transformer/back_dataset/model_zhen/model.npz.orig.npz
[2019-08-14 10:55:48] Loading model from /media/wangxiuwan/marian/examples/transformer/back_dataset/model_zhen/model.npz.orig.npz
[2019-08-14 10:55:48] Loading model from /media/wangxiuwan/marian/examples/transformer/back_dataset/model_zhen/model.npz.orig.npz
[2019-08-14 10:55:49] Loading Adam parameters from /media/wangxiuwan/marian/examples/transformer/back_dataset/model_zhen/model.npz.optimizer.npz
[2019-08-14 10:55:50] [memory] Reserving 152 MB, device gpu0
[2019-08-14 10:55:50] [memory] Reserving 152 MB, device gpu1
[2019-08-14 10:55:50] [memory] Reserving 152 MB, device gpu2
[2019-08-14 10:55:50] [memory] Reserving 152 MB, device gpu3
[2019-08-14 10:55:50] [data] Restoring the corpus state to epoch 1, batch 65000
[2019-08-14 10:55:50] [data] Shuffling files
[2019-08-14 11:00:34] [data] Done reading 183177554 sentences
[2019-08-14 11:13:07] [data] Done shuffling 183177554 sentences to temp files
[2019-08-14 11:22:06] Training started
[2019-08-14 11:22:06] [memory] Reserving 305 MB, device gpu0
[2019-08-14 11:22:07] [memory] Reserving 305 MB, device gpu2
[2019-08-14 11:22:07] [memory] Reserving 305 MB, device gpu1
[2019-08-14 11:22:07] [memory] Reserving 305 MB, device gpu3
[2019-08-14 11:22:07] Loading model from /media/wangxiuwan/marian/examples/transformer/back_dataset/model_zhen/model.npz
[2019-08-14 11:22:10] [memory] Reserving 305 MB, device cpu0
[2019-08-14 11:22:10] [memory] Reserving 76 MB, device gpu0
[2019-08-14 11:22:10] [memory] Reserving 76 MB, device gpu1
[2019-08-14 11:22:10] [memory] Reserving 76 MB, device gpu2
[2019-08-14 11:22:10] [memory] Reserving 76 MB, device gpu3
[2019-08-14 11:22:10] [memory] Reserving 305 MB, device gpu3
[2019-08-14 11:22:10] [memory] Reserving 305 MB, device gpu2
[2019-08-14 11:22:10] [memory] Reserving 305 MB, device gpu1
[2019-08-14 11:22:10] [memory] Reserving 305 MB, device gpu0
[2019-08-14 11:27:43] Ep. 1 : Up. 66000 : Sen. 16,738,027 : Cost 3.13637638 : Time 1916.10s : 5001.61 words/s : L.r. 1.4771e-04
[2019-08-14 11:33:18] Ep. 1 : Up. 67000 : Sen. 17,173,219 : Cost 3.11160016 : Time 335.53s : 28835.85 words/s : L.r. 1.4660e-04
[2019-08-14 11:38:55] Ep. 1 : Up. 68000 : Sen. 17,606,919 : Cost 3.10447025 : Time 337.14s : 28981.41 words/s : L.r. 1.4552e-04
[2019-08-14 11:44:31] Ep. 1 : Up. 69000 : Sen. 18,040,301 : Cost 3.09355903 : Time 336.26s : 28664.33 words/s : L.r. 1.4446e-04
[2019-08-14 11:50:07] Ep. 1 : Up. 70000 : Sen. 18,465,808 : Cost 3.09279132 : Time 335.46s : 28678.63 words/s : L.r. 1.4343e-04
[2019-08-14 11:50:07] Saving model weights and runtime parameters to /media/wangxiuwan/marian/examples/transformer/back_dataset/model_zhen/model.npz.orig.npz
[2019-08-14 11:50:11] Saving model weights and runtime parameters to /media/wangxiuwan/marian/examples/transformer/back_dataset/model_zhen/model.npz
[2019-08-14 11:50:14] Saving Adam parameters to /media/wangxiuwan/marian/examples/transformer/back_dataset/model_zhen/model.npz.optimizer.npz
[2019-08-14 11:50:28] Saving model weights and runtime parameters to /media/wangxiuwan/marian/examples/transformer/back_dataset/model_zhen/model.npz.best-ce-mean-words.npz
[2019-08-14 11:50:31] [valid] Ep. 1 : Up. 70000 : ce-mean-words : 1.93185 : new best
[2019-08-14 11:50:37] Saving model weights and runtime parameters to /media/wangxiuwan/marian/examples/transformer/back_dataset/model_zhen/model.npz.best-perplexity.npz
[2019-08-14 11:50:40] [valid] Ep. 1 : Up. 70000 : perplexity : 6.90226 : new best
[2019-08-14 11:53:29] Saving model weights and runtime parameters to /media/wangxiuwan/marian/examples/transformer/back_dataset/model_zhen/model.npz.best-translation.npz
[2019-08-14 11:53:32] [valid] Ep. 1 : Up. 70000 : translation : 26.8 : new best
[2019-08-14 11:59:09] Ep. 1 : Up. 71000 : Sen. 18,895,607 : Cost 3.08209920 : Time 542.52s : 17812.32 words/s : L.r. 1.4241e-04
[2019-08-14 12:04:48] Ep. 1 : Up. 72000 : Sen. 19,330,863 : Cost 3.07790542 : Time 338.44s : 28752.45 words/s : L.r. 1.4142e-04
[2019-08-14 12:08:34] Error: Error reading from file '.'
[2019-08-14 12:08:34] Error: Aborted from marian::io::InputFileStream& marian::io::getline(marian::io::InputFileStream&, std::__cxx11::string&) in /media/wangxiuwan/marian/src/common/file_stream.h:218
[CALL STACK]
[0x5b3f82]
[0x5b49f5]
[0x5a58cf]
[0x51638d]
[0x5171cb]
[0x517bae]
[0x43fab9]
[0x7f57f98d9a99] + 0xea99
[0x439142]
[0x440ee1]
[0x468d04]
[0x7f57f93f9c80] + 0xb8c80
[0x7f57f98d26ba] + 0x76ba
[0x7f57f8b5f41d] clone + 0x6d