Training requires too much RAM #285

Open
stweil opened this issue Oct 2, 2021 · 6 comments
Labels: performance (Concerns the computational efficiency)

stweil (Contributor) commented Oct 2, 2021

Training with a large number of lines requires a huge amount of RAM (52 GiB for 375,000 lines). Loading the samples into memory contributes only a small part of this. Most of the memory is used by the data processors (CenterNormalizerProcessor, FinalPreparation, BidiTextProcessor, ...). It looks like the memory used by these processors is not released in later steps. Maybe that can be changed by throwing away data which is no longer used.

Reducing the memory requirements is especially important for calamari-cross-fold-train which currently has to be restricted to a subset of folds even on a large server with 128 GiB of RAM.
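
Here is a minimal sketch of how the per-step memory growth can be attributed (psutil is assumed to be available; the wrapped callables are placeholders, not Calamari's API):

import os
import psutil

def rss_mib():
    # resident set size of the current process, in MiB
    return psutil.Process(os.getpid()).memory_info().rss / 2**20

def run_step(name, fn, *args, **kwargs):
    # run one pipeline step and report how much the RSS grew while it ran
    before = rss_mib()
    result = fn(*args, **kwargs)
    print(f"{name}: +{rss_mib() - before:.0f} MiB (now {rss_mib():.0f} MiB in total)")
    return result

Wrapping each data processor call like this shows which step still holds on to memory after it has finished.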

ChWick (Member) commented Oct 2, 2021

Do you use data augmentation?

There is a preload flag that can disable preloading the files into RAM:

--train.preload=False
--val.preload=False
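
For example (with the rest of the training arguments left out):

calamari-train ... --train.preload=False --val.preload=False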

stweil (Contributor, Author) commented Oct 2, 2021

No, I don't use data augmentation at the moment. I saw the preload flag, but I think that preloading is normally a good thing, as otherwise the data would have to be loaded again and again for each epoch. Am I wrong?

I think the central point is not preloading, but the intermediate data kept by the data processors. But of course I still don't know the internals of Calamari, so I might be wrong with that assumption.
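
Just to make that trade-off explicit, a minimal sketch with a hypothetical load_line() reader (not Calamari's API): preloading pays the loading cost once but keeps every sample in RAM, while lazy loading re-reads the files on every epoch.

def preloaded_dataset(paths, load_line):
    samples = [load_line(p) for p in paths]   # loaded once, held in RAM for the whole run
    def epoch():
        yield from samples
    return epoch

def lazy_dataset(paths, load_line):
    def epoch():
        for p in paths:                       # re-read (and re-processed) on every epoch
            yield load_line(p)
    return epoch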

jacoborrje commented
I have had the same issue as @stweil. I tried running calamari-cross-fold-train with multiple folds on a larger dataset, but with only one process in parallel. While monitoring the available free RAM, I could see it decrease noticeably with each fold that was trained. Eventually the computer ran out of free RAM and the process froze. Is this expected behaviour, or could it (as stweil mentions) be because the memory used by the data processors is not released?

As mentioned, reducing the memory requirements would be highly useful for training multi-fold models on larger datasets.
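
For what it's worth, this is how the decreasing RAM can be logged alongside the training (a sketch assuming psutil is installed; it is not part of Calamari):

import threading
import psutil

def log_free_ram(stop_event, interval_s=30.0):
    # print the available system RAM every interval_s seconds
    while not stop_event.is_set():
        free_gib = psutil.virtual_memory().available / 2**30
        print(f"available RAM: {free_gib:.1f} GiB")
        stop_event.wait(interval_s)

stop = threading.Event()
threading.Thread(target=log_free_ram, args=(stop,), daemon=True).start()
# ... start calamari-cross-fold-train here, then call stop.set()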

andbue (Member) commented Oct 26, 2021

Since I have not run any larger trainings recently, I have no experience with this memory leak myself. Digging through the code of cross-fold-train, I found that the trainings are run in separate processes, even if they are started from the same thread when max_parallel_models == 1. Could it be that the OS somehow does not free the memory even after a training process ends?
A workaround could be to put process.kill() at the end of utils.multiprocessing.run. Since I've seen the "Error: Process finished with code..." message quite often, the kill() should probably come before that line.
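
Roughly what I mean, as a sketch (the surrounding code only paraphrases the structure of utils.multiprocessing.run, it is not a copy; the suggestion is just the placement of kill()):

import subprocess

def run(command, verbose=False):
    process = subprocess.Popen(command, stdout=subprocess.PIPE)
    for line in process.stdout:
        if verbose:
            print(line.decode(), end="")
    returncode = process.wait()
    process.kill()   # suggested workaround: kill the child before reporting its exit code
    if returncode != 0:
        raise Exception(f"Error: Process finished with code {returncode}")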

stweil (Contributor, Author) commented Oct 26, 2021

I don't think that there is a memory leak. The data processors require a lot of memory, and I wonder whether that memory is still needed after a processor has done its job. Maybe it is sufficient to reset some Python variables which hold the data of a processor.
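
A sketch of what "resetting some Python variables" could look like (the apply() call is a placeholder, not Calamari's actual interface):

import gc

def run_pipeline(processors, samples):
    data = samples
    for i, proc in enumerate(processors):
        data = proc.apply(data)    # placeholder for the real processing call
        processors[i] = None       # drop the reference so the processor's buffers can be collected
        gc.collect()               # give the garbage collector a chance to return the memory
    return data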

andbue (Member) commented Oct 26, 2021

You're right, killing the processes would only remedy the additional problems @jacoborrje encountered.

bertsky added the performance label on Oct 2, 2024