int->size_t to support large datasets more than 2G instances #2473
Conversation
Excited to see this patch pass the Travis tests... I'm running into the same issue!
tags
TAGS
*.tags
cscope.*
This should not be part of this patch.
Thanks for the PR @buaaliyi, and for the reviewing efforts @flx42. I think we do eventually want to increase the blob size limit. A couple of comments:
Why not use …
This patch has been updated based on @flx42's comments. Thank you @jeffdonahue and @flx42 for your advice. Let me do a further check to fix the other places constrained by the current blob max size, besides MemoryDataLayer.
force-pushed from 1552c20 to 2581f18 (compare)
I need this same change. I had just filed #3159 in error because I did not search properly, and will close it if I can. I would resolve this by using ssize_t (a signed counterpart to size_t) everywhere int is currently used to hold a size but a special negative sentinel value is needed, forgoing the highest bit in exchange; wherever unsigned int is used, size_t can be substituted. This should be a straightforward change, at least for g++ on Linux, though it will probably touch most files. I can prepare a CPU-tested change set for a pull request if one of the developers is willing to consider it.
When I was trying to use Caffe to train my large dataset (billions of instances), I found that the class SyncedMemory uses the type size_t to allocate memory, while blob.count_ and blob.capacity_ are of type int. As a result, the allocation size was cut off below 2 GB, and my experiment failed due to pointer overflow.
This patch changes the size-related types from int to size_t, which guarantees the correct size on 64-bit machines even when the dataset holds more than 2 billion elements.
Thanks for the review.