Reduce extract_features memory footprint, and fix bug for > 1 exported features. #1175
Conversation
Per mlapin's issue, I also added a commit to give each feature to be extracted its own WriteBatch instance.
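Roughly, the per-feature batching looks like this (a sketch only, not the exact diff; `num_features` and `MakeFeatureBatches` stand in for the tool's actual variables):

```cpp
#include <vector>
#include "leveldb/write_batch.h"

// Sketch: one WriteBatch per exported feature blob, so writes destined for
// different feature databases are never accumulated in a single shared batch.
std::vector<leveldb::WriteBatch*> MakeFeatureBatches(int num_features) {
  std::vector<leveldb::WriteBatch*> feature_batches(num_features);
  for (int i = 0; i < num_features; ++i) {
    feature_batches[i] = new leveldb::WriteBatch();
  }
  return feature_batches;
}
```

Each batch is then written to, and flushed against, its own database instead of every feature sharing one batch.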
As mentioned in #1158, the problem could be fixed in more than one way. Maybe some of the devs will comment on whether that should be fixed at all and which option is better.
@mlapin Your comment doesn't seem directly related to this PR. Am I missing something?
@jyegerlehner You are right, please ignore the comment. No reason to clutter this PR with yet another change.
```cpp
const int kMaxKeyStrLength = 100;
char key_str[kMaxKeyStrLength];
vector<Blob<float>*> input_vec;
vector<int> image_indices(num_features, 0);

const int DB_BATCH_SIZE = 100;
```
Not all users want the same value. What about adding a command line argument for it?
Force-pushed 4f671c0 to 05d5d4b
@kloudkl Command line arg added.
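Roughly, the change amounts to something like the following (a sketch; the helper name, argument position, and default are illustrative rather than the exact code):

```cpp
#include <cstdlib>

// Sketch: if the user supplies one argument beyond the required positional
// arguments, treat it as the DB write-batch size; otherwise default to 100.
int ParseDbBatchSize(int argc, char** argv, int num_required_args) {
  if (argc > num_required_args) {
    return std::atoi(argv[num_required_args]);
  }
  return 100;  // default proposed in this PR
}
```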
I see the database abstraction layer PR that was merged fixed the bug, though it created a new one. I'll close this and resuscitate the selectable DB batch size in a new PR.
I found that, in what seems like a "normal" scenario, extract_features was being killed by the kernel, presumably due to running out of memory, after the machine spent a while running very slowly with lots of VM swapping. Prior to this PR, the leveldb::WriteBatch instance stores up 1000 images' worth of features in memory before committing them to the file system, which can be a lot of memory. Changing the DB write-batch size to 100 allowed extract_features to complete without any errors, with top showing it peaking at about 45% memory. This PR changes the DB write batch size to 100.
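For reference, the write pattern after the change is essentially the following (a sketch under my own naming, not the file's exact code; key and value construction are stand-ins):

```cpp
#include <string>
#include "leveldb/db.h"
#include "leveldb/write_batch.h"

const int kDbBatchSize = 100;  // was 1000 before this change

// Sketch: buffer at most kDbBatchSize key/value pairs in the WriteBatch,
// then flush it to disk and clear it, so memory use stays bounded.
void WriteFeatureBatches(leveldb::DB* db, int num_images) {
  leveldb::WriteBatch batch;
  int buffered = 0;
  for (int i = 0; i < num_images; ++i) {
    std::string key = std::to_string(i);  // stand-in for the real key string
    std::string value = "";               // stand-in for the serialized feature Datum
    batch.Put(key, value);
    if (++buffered == kDbBatchSize) {
      db->Write(leveldb::WriteOptions(), &batch);
      batch.Clear();
      buffered = 0;
    }
  }
  if (buffered > 0) {
    db->Write(leveldb::WriteOptions(), &batch);  // flush the final partial batch
  }
}
```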
A bit of arithmetic on the scenario I had:
The machine has 8 GB of RAM (admittedly light). I was writing 4 feature blobs, each roughly the same size. One of them was 1024 channels x 12 height x 12 width x 4 bytes/float = 589,824 bytes of feature data per image. With the batch size at 1000, it stores up 1000 of those in the leveldb::WriteBatch, so about 0.59 GB for that blob alone. There were 4 feature blobs (num_features) being exported, so on the order of 2.4 GB buffered before a commit. And I think leveldb may double-buffer what you store; in any case, with other memory being consumed, it was over the limit.
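A quick back-of-the-envelope check of those numbers (just arithmetic, not part of the tool):

```cpp
#include <cstdio>

int main() {
  const long long bytes_per_image  = 1024LL * 12 * 12 * 4;       // 589,824 (~0.59 MB)
  const long long per_blob_at_1000 = bytes_per_image * 1000;     // ~0.59 GB per feature blob
  const long long all_blobs_at_1000 = per_blob_at_1000 * 4;      // ~2.4 GB for 4 blobs
  const long long all_blobs_at_100  = bytes_per_image * 100 * 4; // ~0.24 GB with batch size 100
  std::printf("%lld %lld %lld %lld\n", bytes_per_image, per_blob_at_1000,
              all_blobs_at_1000, all_blobs_at_100);
  return 0;
}
```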
Admittedly this isn't a very thorough solution to the problem; I'm merely pushing the tripwire a ways (10x) further out. The only downside I see is that more frequent disk writes may be slightly less efficient, which doesn't seem significant compared to crashing.