
bugfix regarding #100 #103

Merged: 1 commit, Feb 13, 2014

Conversation

Yangqing (Member)

The bugfix for #100: when checking blobs_lr, also check the size of the layer's parameter blobs (blobs().size()): if it is nonzero, we need to do backpropagation.

TODO: maybe add a regression test to rule out future bugs. Also, the Init() function is growing quite big now.
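
A rough, standalone sketch of the check described above (the stub types stand in for Caffe's LayerParameter and Layer; this illustrates the logic, not the merged diff):

#include <cstddef>
#include <vector>

// Illustrative stand-ins for Caffe's LayerParameter and Layer.
struct StubLayerParam { std::vector<float> blobs_lr; };
struct StubLayer { std::vector<int> param_blob_sizes; };

// Decide whether a layer needs backpropagation: if blobs_lr is given,
// backprop when any entry is nonzero; if it is absent but the layer has
// parameter blobs (blobs().size() != 0), default to doing backprop.
bool NeedsBackward(const StubLayerParam& param, const StubLayer& layer) {
  if (!param.blobs_lr.empty()) {
    for (std::size_t j = 0; j < param.blobs_lr.size(); ++j) {
      if (param.blobs_lr[j] != 0.f) return true;
    }
    return false;
  }
  return !layer.param_blob_sizes.empty();
}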

Yangqing merged commit 0b3f9c8 into BVLC:master on Feb 13, 2014
sguada (Contributor) commented Feb 20, 2014

@Yangqing Due to the change in #100 and #103 to the default value of blobs_lr, during test and deploy the network now assumes it needs to do backward propagation (reserving more memory) even when it is not going to do it, unless one sets blobs_lr=0 for all the layers with parameters.

kloudkl (Contributor) commented Feb 21, 2014

Because both the Forward and Backward methods use bottom_vecs_ and top_vecs_, I'm afraid there is no way to save memory.

template <typename Dtype>
const vector<Blob<Dtype>*>& Net<Dtype>::ForwardPrefilled() {
  for (int i = 0; i < layers_.size(); ++i) {
    // LOG(ERROR) << "Forwarding " << layer_names_[i];
    layers_[i]->Forward(bottom_vecs_[i], &top_vecs_[i]);
  }
  return net_output_blobs_;
}

template <typename Dtype>
const vector<Blob<Dtype>*>& Net<Dtype>::Forward(
    const vector<Blob<Dtype>*> & bottom) {
  // Copy bottom to internal bottom
  for (int i = 0; i < bottom.size(); ++i) {
    net_input_blobs_[i]->CopyFrom(*bottom[i]);
  }
  return ForwardPrefilled();
}

template <typename Dtype>
Dtype Net<Dtype>::Backward() {
  Dtype loss = 0;
  for (int i = layers_.size() - 1; i >= 0; --i) {
    if (layer_need_backward_[i]) {
      Dtype layer_loss = layers_[i]->Backward(
          top_vecs_[i], true, &bottom_vecs_[i]);
      loss += layer_loss;
    }
  }
  return loss;
}

Yangqing (Member, Author)

There is no concern about memory consumption as long as you do not invoke backward(). Note that we probably do want backward function calls during deploy time (e.g. gradient as saliency).

All the memory chunks are lazily allocated, which is one of the beauties of caffe: if you don't use the cpu, no cpu memory is allocated; if you don't use the gpu, no gpu memory is allocated; if you don't run backward, no diff is allocated.
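
To make the lazy-allocation point concrete, here is a minimal sketch of the idea; LazyBuffer and SketchBlob are made-up names rather than Caffe's actual classes. A buffer is only materialized the first time it is touched, so a forward-only run never pays for diffs.

#include <cstddef>
#include <iostream>
#include <memory>

// A buffer that costs nothing until the first time it is touched.
class LazyBuffer {
 public:
  explicit LazyBuffer(std::size_t count) : count_(count) {}
  float* mutable_data() {
    if (!data_) {
      data_.reset(new float[count_]());  // first touch: allocate and zero
    }
    return data_.get();
  }
  bool allocated() const { return static_cast<bool>(data_); }
 private:
  std::size_t count_;
  std::unique_ptr<float[]> data_;
};

// A blob keeps separate buffers for data and diff; a forward-only pass
// touches data but never diff, so the diff buffer is never allocated.
struct SketchBlob {
  LazyBuffer data;
  LazyBuffer diff;
  explicit SketchBlob(std::size_t count) : data(count), diff(count) {}
};

int main() {
  SketchBlob blob(1 << 20);
  blob.data.mutable_data();  // "forward": touches data only
  std::cout << "data allocated: " << blob.data.allocated()
            << ", diff allocated: " << blob.diff.allocated() << std::endl;
  return 0;
}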


kloudkl (Contributor) commented Feb 21, 2014

The lazy beauties lie in SyncedMemory::to_cpu and SyncedMemory::to_gpu.
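
For reference, a rough structural sketch of the head-state pattern that to_cpu()/to_gpu() implement; this is simplified, CPU-only, and not the actual Caffe source.

#include <cstddef>
#include <cstdlib>
#include <cstring>

// Memory whose backing storage appears only on first access, tracked by a
// "head" state that records where the freshest copy lives.
class SyncedMemorySketch {
 public:
  enum Head { UNINITIALIZED, HEAD_AT_CPU, HEAD_AT_GPU, SYNCED };

  explicit SyncedMemorySketch(std::size_t size)
      : cpu_ptr_(nullptr), size_(size), head_(UNINITIALIZED) {}
  ~SyncedMemorySketch() { std::free(cpu_ptr_); }

  // Accessors route through to_cpu(), so memory appears only when used.
  const void* cpu_data() { to_cpu(); return cpu_ptr_; }
  Head head() const { return head_; }

 private:
  void to_cpu() {
    if (head_ == UNINITIALIZED) {
      cpu_ptr_ = std::malloc(size_);      // allocated only on first access
      std::memset(cpu_ptr_, 0, size_);
      head_ = HEAD_AT_CPU;
    } else if (head_ == HEAD_AT_GPU) {
      // A real to_cpu() would copy device memory back here and mark SYNCED.
      head_ = SYNCED;
    }
  }
  // A to_gpu() counterpart would mirror this logic for the device pointer.

  void* cpu_ptr_;
  std::size_t size_;
  Head head_;
};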

sguada (Contributor) commented Feb 21, 2014

@Yangqing thanks for the clarification; it seemed to me that it was using more memory, but you are right that it is not.
Do you know how to deallocate all the memory when running the matcaffe wrapper? I always get a core dump when I exit matlab, and I think it is due to that.

rbgirshick (Contributor)

I haven't experienced a core dump when exiting matlab after using the matcaffe wrapper. It might be a good idea to check if the segfault is related to one of the modifications you added and then debug that.


sguada (Contributor) commented Feb 25, 2014

@rbgirshick I have double-checked with the new #132 and don't get the core dump any more. I guess it was probably because my branch was in a mixed state. But if you get any, just let me know.
