lmdb.Error There is not enough space on the disk. #1209
This is a Windows-specific issue, explained in NVIDIA/DIGITS#206, with a workaround in NVIDIA/DIGITS#209. The solution there is to start with a small …
The above commit implements the logic in NVIDIA/DIGITS#209. Let me know if the issue still exists!
Thanks!
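The "start small and grow on demand" workaround from NVIDIA/DIGITS#209 can be sketched roughly as below. This is not tensorpack's actual implementation; the function name, the doubling factor, and the retry limit are illustrative assumptions. The `MapFullError` name matches the py-lmdb API, which raises it when a write exceeds the current `map_size`.

```python
# Hedged sketch of the NVIDIA/DIGITS#209 workaround: begin with a modest
# map_size and double it whenever a write fails because the map is full.
# On Windows, LMDB reserves the full map_size on disk up front, so a huge
# default map_size fails immediately; a small, growing map avoids that.

def write_with_growing_map(do_write, initial_map_size=10 * 1024 ** 2,
                           set_map_size=None, max_doublings=30):
    """Call do_write(); on a 'map full' error, enlarge the map and retry.

    do_write:     callable performing the LMDB write; raises an exception
                  whose type name contains 'MapFull' when the map is full
                  (py-lmdb raises lmdb.MapFullError).
    set_map_size: callable(new_size) that enlarges the environment's map,
                  e.g. env.set_mapsize(new_size) with py-lmdb.
    """
    map_size = initial_map_size
    for _ in range(max_doublings):
        try:
            return do_write()
        except Exception as e:
            if "MapFull" not in type(e).__name__:
                raise  # unrelated error: propagate
            map_size *= 2
            if set_map_size is not None:
                set_map_size(map_size)
    raise RuntimeError("map_size grew past the retry limit")
```

With py-lmdb this would wrap the transaction that writes a batch of records, calling `env.set_mapsize()` between attempts.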
I don't have a Windows machine to test on, but could you check whether a similar issue arises when you use …
Hi, thank you for the support. I was following the "Efficient DataFlow" tutorial on Friday morning and had no issues with LMDBSerializer.load(), LocallyShuffleData(), or BatchData() back then, even on an LMDB file of several gigabytes. I did hit an error with PrefetchData(), but I don't remember the details, and I don't have access right now. Monday is the Memorial Day holiday, so I'll check again on Tuesday. I noticed there are many limitations to using Windows, such as being unable to use ZMQ. For what it's worth, I don't want to use Windows as a development environment either, but I'm forced to use Windows machines and those are the only ones I'm given. Again, thanks for the great support.
Hi. I will post a separate issue.
If you're asking about an unexpected problem whose root cause you do not know,
use this template. PLEASE DO NOT DELETE THIS TEMPLATE, FILL IT:
If you already know the root cause of your problem,
feel free to delete everything in this template.
1. What you did:
(1) If you're using examples, what's the command you run:
(2) If you're using examples, have you made any changes to the examples? Paste
git status; git diff
here:
(3) If not using examples, tell us what you did:
It's always better to copy-paste what you did than to describe it.
Please try to provide enough information to let others reproduce your issue.
Without reproducing the issue, we may not be able to investigate it.
I'm trying to create an LMDB file by following the "Efficient DataFlow" tutorial in Tensorpack.
I was given a dataset with a CSV file with columns in [frame, xmin, xmax, ymin, ymax, class_id] for training an object detection model.
Initially, I was using a reduced version of the file with 300 entries (extracted from a much larger set) for internal development and debugging. But when I tried to create an LMDB file with LMDBSerializer.save(), following the tensorpack tutorial, I got the error "lmdb.Error: train_small.lmdb: There is not enough space on the disk".
I had more than a terabyte of storage left, so I reduced the CSV file to only 10 entries (3 distinct images), but got the same error.
I will attach the code zip file here.
wow.zip
2. What you observed:
(1) Include the ENTIRE logs here:
It's always better to copy-paste what you observed instead of describing it.
It's always better to paste as much as possible, although sometimes a partial log is OK.
Tensorpack typically saves stdout to its training log.
If stderr is relevant, you can run a command with
my_command 2>&1 | tee logs.txt
to save both stdout and stderr to one file.
Traceback (most recent call last):
File "debug2.py", line 111, in
LMDBSerializer.save(df, 'train_small.lmdb')
File "C:\Users\dps42\AppData\Local\Continuum\miniconda3\envs\dps42_dev\lib\site-packages\tensorpack\dataflow\serialize.py", line 52, in save
meminit=False, map_async=True) # need sync() at the end
lmdb.Error: train_small.lmdb: There is not enough space on the disk.
(2) Other observations, if any:
For example, CPU/GPU utilization, output images, tensorboard curves, if relevant to your issue.
3. What you expected, if not obvious.
If you expect higher speed, please read
http://tensorpack.readthedocs.io/tutorial/performance-tuning.html
before posting.
If you expect certain accuracy, only in one of the two conditions can we help with it:
(1) You're unable to reproduce the accuracy documented in tensorpack examples.
(2) It appears to be a tensorpack bug.
Otherwise, how to train a model to certain accuracy is a machine learning question.
We do not answer machine learning questions and it is your responsibility to
figure out how to make your models more accurate.
Since there were only 10 entries in the CSV file and only 3 distinct images, I shouldn't see the message that "There is not enough space on the disk."
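One likely explanation (per NVIDIA/DIGITS#206) is that on Windows, LMDB reserves the full `map_size` on disk as soon as the environment is opened, so a very large default `map_size` can trigger this error regardless of how small the dataset is. A hedged sketch of sizing the map from the data instead; the 2x headroom factor and minimum are illustrative assumptions, not tensorpack's actual values:

```python
# Hedged sketch: choose an LMDB map_size proportional to the data rather
# than a huge fixed default. On Windows the whole map_size is allocated
# on disk when the environment opens, so oversizing it fails immediately.

def estimate_map_size(total_data_bytes, headroom=2.0,
                      minimum=10 * 1024 ** 2):
    """Pick a map_size large enough for the data plus some headroom."""
    return max(minimum, int(total_data_bytes * headroom))

# With py-lmdb this would then be passed at open time, e.g.:
#   env = lmdb.open("train_small.lmdb", subdir=False,
#                   map_size=estimate_map_size(sum_of_record_sizes))
# where sum_of_record_sizes is the serialized size of your 10 entries.
```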
4. Your environment:
Paste the output of this command:
python -c 'import tensorpack.tfutils as u; print(u.collect_env_info())'
If this command failed, tell us your version of Python/TF/tensorpack.
You can install Tensorpack master by
pip install -U git+https://github.com/ppwwyyxx/tensorpack.git
and see if your issue is already solved.
If you're not using tensorpack under a normal command line shell (e.g.,
using an IDE or jupyter notebook), please retry under a normal command line shell.
Include relevant hardware information, e.g. number of GPUs used for training, amount of RAM.
Windows 10. I don't think a GPU was being used at the time.
You may often want to provide extra information related to your issue, but
at the minimum please try to provide the above information accurately to save effort in the investigation.