
Loading HDF5 / memory data > 2 GB #1470

Open
LawBow opened this issue Nov 23, 2014 · 6 comments

LawBow commented Nov 23, 2014

I got a crash when loading HDF5 data whose size is 1.87 GB. Here is the screen output.
Another file, 500 MB in size, loads successfully. I don't know what the problem is here.

I1123 14:42:48.992930 27345 hdf5_data_layer.cpp:77] Number of HDF5 files: 1
F1123 14:42:49.390960 27345 blob.cpp:72] Check failed: data_ 
*** Check failure stack trace: ***
    @     0x7fabc8316b7d  google::LogMessage::Fail()
    @     0x7fabc8318c7f  google::LogMessage::SendToLog()
    @     0x7fabc831676c  google::LogMessage::Flush()
    @     0x7fabc831951d  google::LogMessageFatal::~LogMessageFatal()
    @           0x46f599  caffe::Blob<>::mutable_cpu_data()
    @           0x530d98  caffe::hdf5_load_nd_dataset<>()
    @           0x49a2cf  caffe::HDF5DataLayer<>::LoadHDF5FileData()
    @           0x49979c  caffe::HDF5DataLayer<>::LayerSetUp()
    @           0x52a483  caffe::Net<>::Init()
    @           0x52c51d  caffe::Net<>::Net()
    @           0x55845e  caffe::Solver<>::InitTrainNet()
    @           0x558a2b  caffe::Solver<>::Init()
    @           0x558ec5  caffe::Solver<>::Solver()
    @           0x431b78  caffe::GetSolver<>()
    @           0x42d387  train()
    @           0x427a7b  main
    @     0x7fabc569176d  (unknown)
    @           0x42bd9d  (unknown)
Aborted (core dumped)
@bwahlgreen

I am experiencing the exact same error - will be happy to share further details if necessary!

@wangzheallen

Same problem here; also hoping for an answer.

@shelhamer (Member)

Given the current types, Caffe blobs are capped at 2 GB (although the cap could be raised). Until the HDF5DataLayer learns to prefetch (#1584 (comment)) so that memory use stays constant, individual h5 files need to fit within the blob limit. If you split the h5 and list each split file in your source list .txt, Caffe will iterate through them one by one.

Divide the original h5 data along the num / batch dimension (the first) so that each h5 file is < 2 GB; a sketch of such a split follows.
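A minimal sketch of that split (not from the thread; it assumes h5py, and the dataset names "data" and "label" as well as all file names are hypothetical, so match them to your own layer setup):

```python
# Hypothetical splitter: shard train.h5 along the first (num) axis so each
# shard stays under the ~2 GB blob cap, then write the source .txt that the
# HDF5DataLayer reads. Tune SHARD_ROWS so each shard is comfortably < 2 GB.
import h5py

SRC = "train.h5"      # assumed input file
SHARD_ROWS = 10000    # assumed shard size along the num dimension

paths = []
with h5py.File(SRC, "r") as f:
    num = f["data"].shape[0]
    for i, start in enumerate(range(0, num, SHARD_ROWS)):
        stop = min(start + SHARD_ROWS, num)
        path = "train_%02d.h5" % i
        with h5py.File(path, "w") as out:
            # Slicing reads only this shard into memory, not the whole file.
            out.create_dataset("data", data=f["data"][start:stop])
            out.create_dataset("label", data=f["label"][start:stop])
        paths.append(path)

# The HDF5 data layer's source is a .txt listing one h5 path per line;
# Caffe iterates through the listed files one by one.
with open("train_h5_list.txt", "w") as txt:
    txt.write("\n".join(paths) + "\n")
```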

@shelhamer changed the title from "Error when loading HDF5 data (check failed)" to "Loading HDF5 data > 2 GB" on Jan 31, 2015

pannous commented Feb 11, 2015

Wow, good timing.
I ran into this issue and tried h5repart -m1g, which didn't work (!?).
Splitting with a small script solved the problem, thanks!
PS: be careful, the h5 has to be < 2 GB uncompressed!
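A hedged helper for that last point (my own sketch, not from the thread): a compressed .h5 can look small on disk while its loaded payload still exceeds the cap, so check the uncompressed dataset sizes.

```python
# Print each dataset's in-memory (uncompressed) size; compression can make
# the on-disk file look small while the loaded blob still exceeds 2 GB.
import h5py

def uncompressed_sizes(path):
    def report(name, obj):
        if isinstance(obj, h5py.Dataset):
            gib = obj.size * obj.dtype.itemsize / 1024.0 ** 3
            print("%s: %.2f GiB" % (name, gib))
    with h5py.File(path, "r") as f:
        f.visititems(report)

uncompressed_sizes("train.h5")  # hypothetical file name
```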

@snehashis-roy

Thank you for the info. I am a newbie in Caffe. Can anyone please tell me how to compute the Caffe blob size? I have a data layer with size = 80000x3x35x35. Does the 2 GB blob size limit apply to the product of all dimensions (i.e. 80000*3*35*35)? Also, can you please tell me how to use multiple h5 files? Do I add them line by line in the train.txt?
Thank you for the help.
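For what it's worth, a back-of-envelope check (my own arithmetic, assuming float32 data and that the cap from shelhamer's comment applies to total bytes); and yes, per that comment, multiple h5 files are listed one per line in the source .txt:

```python
# 80000x3x35x35 float32 blob: product of all dimensions times 4 bytes/element.
n, c, h, w = 80000, 3, 35, 35
elements = n * c * h * w            # 294,000,000 elements
gib = elements * 4 / 1024.0 ** 3    # ~1.10 GiB for float32, under the 2 GB cap
print(elements, gib)
```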


duygusar commented Dec 26, 2017

@shelhamer I write my HDF5 files in chunks; I have tried files as small as 60 MB each and decreased the batch size (down to 4), and I am -still- getting the same error! I am using a Titan X. More details: #6133
