
Reduce extract_features memory footprint, and fix bug for > 1 exported features. #1175

Closed

Conversation

jyegerlehner (Contributor) commented:

I found that in what seems like a "normal" scenario, extract_features was being killed by the kernel, presumably due to out-of-memory, after the machine had spent a while very slow with heavy VM swapping. Prior to this PR, the leveldb::WriteBatch instance stores up 1000 images' worth of features in memory before committing them to the file system, which can be a lot of memory. Changing the DB write batch size to 100 allowed extract_features to complete without errors, with top showing extract_features peaking at about 45% memory. This PR changes the DB write batch size to 100.
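A minimal sketch of the batching pattern at issue, assuming a plain LevelDB setup; the identifiers here (WriteFeatures, kDbBatchSize) are illustrative, not verbatim from extract_features.cpp:

#include <leveldb/db.h>
#include <leveldb/write_batch.h>
#include <cstdio>
#include <string>

// Write one dataset's features, committing every kDbBatchSize images
// so the WriteBatch never buffers the whole dataset in RAM.
void WriteFeatures(leveldb::DB* db, int num_images,
                   const std::string& feature_bytes) {
  const int kDbBatchSize = 100;  // was 1000 before this PR
  leveldb::WriteBatch batch;
  char key[32];
  for (int i = 0; i < num_images; ++i) {
    std::snprintf(key, sizeof(key), "%d", i);
    batch.Put(key, feature_bytes);  // buffered in memory until Write()
    if ((i + 1) % kDbBatchSize == 0) {
      db->Write(leveldb::WriteOptions(), &batch);
      batch.Clear();  // release the buffered entries
    }
  }
  db->Write(leveldb::WriteOptions(), &batch);  // flush any remainder
}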

A bit of arithmetic on the scenario I had:
A machine with 8 GB of RAM (admittedly light). I was writing 4 feature blobs, each roughly the same size. One of them was 1024 channels x 12 height x 12 width x 4 bytes/float = 589,824 feature bytes per image. With the batch size at 1000, it stores up 1000 of those in the leveldb::WriteBatch, so about 0.59 GB. There were 4 feature blobs (num_features) being exported, so roughly 2.4 GB staged across batches. And I think leveldb may double-buffer what you store, and in any case, with other memory being consumed, it was over the limit.
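For concreteness, the same arithmetic written out as C++ constants (a sketch; the 1000-image batch and four blobs are the figures from the scenario above):

// Back-of-envelope memory estimate for the scenario described above.
constexpr long long kBytesPerImage = 1024LL * 12 * 12 * 4;   // 589,824 (~0.59 MB)
constexpr long long kBytesPerBatch = kBytesPerImage * 1000;  // ~0.59 GB per feature blob
constexpr long long kBytesAllBlobs = kBytesPerBatch * 4;     // ~2.36 GB for 4 blobs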

Admittedly this isn't a very thorough solution to the problem. I'm merely pushing the tripwire a ways (10x) further out. I don't see any downside to making the change other than since we have more disk operations maybe slightly less efficient. But that doesn't seem significant compared to crashing.

jyegerlehner changed the title from "Extract features in smaller leveldb batches so we don't crash (as often)" to "Reduce extract_features memory footprint, and fix bug for > 1 exported features." on Oct 1, 2014
jyegerlehner (Contributor, Author) commented:

Per mlapin's issue, I also added a commit that gives each feature to be extracted its own WriteBatch instance (a sketch of the idea follows).
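A hedged sketch of that change; the helper name and the use of std::unique_ptr are my own, not the PR's actual diff:

#include <leveldb/write_batch.h>
#include <memory>
#include <vector>

// One WriteBatch per exported feature blob, so each blob's entries
// are committed to its own database instead of sharing one batch.
std::vector<std::unique_ptr<leveldb::WriteBatch>> MakeBatches(
    int num_features) {
  std::vector<std::unique_ptr<leveldb::WriteBatch>> batches;
  for (int i = 0; i < num_features; ++i) {
    batches.emplace_back(new leveldb::WriteBatch);
  }
  return batches;
}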

mlapin commented Oct 8, 2014:

As mentioned in #1158, convert_imageset and extract_features use different key values. This may or may not be the desired behavior, but the way extract_features creates key values can easily lead to bugs, e.g. when the extracted features are later consumed in Python, where the LevelDB is iterated by key in lexicographic order (so an unpadded "10" sorts before "2").

The problem could be fixed by:

  • either using the same keys as in the backend;
  • or adding leading zeros to the current index values (sketched after this comment).

Maybe some of the devs will comment on whether this should be fixed at all and, if so, which option is better.
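For the zero-padding option, a minimal sketch; the 10-digit width is my assumption (wide enough for any non-negative 32-bit index), not something specified in the thread:

#include <cstdio>
#include <string>

// Zero-padded keys sort numerically under LevelDB's default
// lexicographic ordering: "0000000002" < "0000000010" < "0000000100".
std::string MakeKey(int image_index) {
  char key_str[16];
  std::snprintf(key_str, sizeof(key_str), "%010d", image_index);
  return std::string(key_str);
}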

jyegerlehner (Contributor, Author) commented:

@mlapin Your comment doesn't seem directly related to this PR. Am I missing something?

mlapin commented Oct 10, 2014:

@jyegerlehner You are right, please ignore the comment. No reason to clutter this PR with yet another change.

// Buffer used to build each image's database key.
const int kMaxKeyStrLength = 100;
char key_str[kMaxKeyStrLength];
vector<Blob<float>*> input_vec;
// One running image index per exported feature blob.
vector<int> image_indices(num_features, 0);

// Number of images' features buffered per leveldb::WriteBatch
// before committing to disk (reduced from 1000 by this PR).
const int DB_BATCH_SIZE = 100;
kloudkl (Contributor) commented on the diff above:
Not all users want the same value. What about adding a command line argument for it?

jyegerlehner force-pushed the extract-features-in-smaller-batches branch from 4f671c0 to 05d5d4b on October 14, 2014
jyegerlehner (Contributor, Author) commented:

@kloudkl Command line arg added.
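For reference, one way such an option could look, assuming gflags as used elsewhere in Caffe's tools; the flag name db_batch_size is hypothetical, and the PR may have used a positional argument instead:

#include <gflags/gflags.h>

// Hypothetical flag sketch, not the PR's actual interface.
DEFINE_int32(db_batch_size, 100,
    "Number of images' features to buffer per leveldb::WriteBatch "
    "before committing to disk.");
// In the extraction loop, FLAGS_db_batch_size would replace the
// hard-coded DB_BATCH_SIZE constant.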

jyegerlehner (Contributor, Author) commented:

I see the database abstraction layer PR that was merged fixed the bug, though it created a new one. I'll close this and resuscitate the selectable DB batch size in a new PR.
