
The data layer is also plagued with the uninitialized prefetch rng core dump bug #553

Closed
kloudkl opened this issue Jun 27, 2014 · 2 comments

@kloudkl
Contributor

kloudkl commented Jun 27, 2014

In #508, I thought that the DataLayer and the WindowDataLayer were free of this issue. I was wrong.

In the DataLayer, tools/test_net.bin core dumped again at:

if (mirror && layer->PrefetchRand() % 2) {

As before, the trouble was caused by the complex condition that decides whether to initialize prefetch_rng_. Maybe such a condition is not necessary at all:

const bool prefetch_needs_rand = (phase_ == Caffe::TRAIN) &&
      (this->layer_param_.data_param().mirror() ||
       this->layer_param_.data_param().crop_size());
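
For illustration, here is a minimal, self-contained sketch of the failure mode. The names (DataLayerSketch, SetUp, PrefetchRand, Prefetch) are hypothetical stand-ins, not the actual Caffe code, but the control flow follows the two snippets above: guarded initialization at setup, unguarded use in the prefetch path.

#include <memory>
#include <random>

struct DataLayerSketch {
  std::unique_ptr<std::mt19937> prefetch_rng_;  // stays null unless SetUp creates it

  void SetUp(bool is_train, bool mirror, int crop_size) {
    // Guarded initialization, mirroring the condition quoted above.
    const bool prefetch_needs_rand = is_train && (mirror || crop_size > 0);
    if (prefetch_needs_rand) {
      prefetch_rng_.reset(new std::mt19937(std::random_device{}()));
    }
  }

  unsigned int PrefetchRand() {
    // Dereferences the RNG unchecked; crashes when it was never created.
    return (*prefetch_rng_)();
  }

  void Prefetch(bool mirror) {
    // The prefetch path consults the RNG whenever mirroring is enabled,
    // regardless of phase, which is the mismatch behind the core dump.
    if (mirror && PrefetchRand() % 2) {
      // would flip the image here
    }
  }
};

int main() {
  DataLayerSketch layer;
  // TEST phase (is_train == false) with mirror on: the guard skips the
  // RNG, but Prefetch() still calls it, so this dereferences null.
  layer.SetUp(/*is_train=*/false, /*mirror=*/true, /*crop_size=*/0);
  layer.Prefetch(/*mirror=*/true);
  return 0;
}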

I am not opening a new PR to fix this directly because these bugs reflect more fundamental design defects in the data layers. The logic there has become too intricate to reason about, develop, and debug intuitively. When the ImageDataLayer and the WindowDataLayer were written, many lines were copied directly from the DataLayer, so it is not surprising that they suffer from the same segmentation fault. The duplicated code may also harbor other complicated issues that have not been exposed yet.

In the past, several PRs, including #407, have tried to separate the data IO and processing modules, but unfortunately none of them finished the job. It is very important to refactor the data layers as soon as possible so that more data sources and formats can be added with little to no overhead.

@jeffdonahue
Contributor

I don't think I've personally seen this behavior, but if you send a PR that removes the prefetch_needs_rand logic from all the data layers and just always initializes the RNG, I'd be in favor of merging it. I wrote that logic, but I agree that it's difficult to maintain and reason about, and probably isn't worth the small savings.
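
A minimal sketch of that simplification, replacing SetUp in the hypothetical layer above (again, the names are stand-ins, not the actual Caffe code):

  void SetUp(bool /*is_train*/, bool /*mirror*/, int /*crop_size*/) {
    // Unconditional initialization: one unused generator per layer is a
    // negligible cost next to the crash the guard can cause.
    prefetch_rng_.reset(new std::mt19937(std::random_device{}()));
  }

With the RNG always present, PrefetchRand() can never dereference a null pointer, whatever combination of phase, mirror, and crop_size is configured.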

@shelhamer
Member

This was fixed in the DataLayer by #954, which relies on Caffe::phase() when the data transformer is created. A fix for the WindowDataLayer is in my follow-up that refactors the other data layers.
