
Refactor data_transform to allow datum, cv::Mat and Blob transformation #1070

Merged: 17 commits merged into BVLC:dev from sguada:move_data_mean on Oct 4, 2014

Conversation

@sguada (Contributor) commented Sep 11, 2014

Move data_mean into the data_transform class to facilitate data transformation. This way data layers don't need to hold it and pass it around all the time.
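
For readers skimming the diff, a hedged sketch of the design (simplified; names follow the PR loosely and may differ from the merged code):

#include "caffe/blob.hpp"
#include "caffe/proto/caffe.pb.h"
#include "caffe/util/io.hpp"

// Sketch: the transformer owns the mean, so a data layer only constructs it
// from its transform_param and calls Transform() on each datum.
template <typename Dtype>
class DataTransformer {
 public:
  explicit DataTransformer(const TransformationParameter& param)
      : param_(param) {
    if (param_.has_mean_file()) {
      BlobProto blob_proto;
      ReadProtoFromBinaryFileOrDie(param_.mean_file().c_str(), &blob_proto);
      data_mean_.FromProto(blob_proto);  // the mean now lives here
    }
  }
  void Transform(const int batch_item_id, const Datum& datum,
                 Dtype* transformed_data);

 protected:
  TransformationParameter param_;
  Blob<Dtype> data_mean_;  // formerly held and passed around by data layers
};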

@sguada (Contributor, Author) commented Sep 12, 2014

To prepare for the transformation layers of #569, this modifies #954 so that the data_mean is managed by the transformer.
Could you take a look @shelhamer @jeffdonahue?

@@ -161,6 +161,18 @@ void WindowDataLayer<Dtype>::DataLayerSetUp(const vector<Blob<Dtype>*>& bottom,
// label
(*top)[1]->Reshape(batch_size, 1, 1, 1);
this->prefetch_label_.Reshape(batch_size, 1, 1, 1);

// data mean
if (this->layer_param_.window_data_param().has_mean_file()) {
Contributor:

Why should the WindowDataLayer be handled specially?

Contributor (Author):

Because the transformation done by WindowDataLayer is more complicated and specific to that layer.
But it could be abstracted later on.

Member:

The WindowDataLayer is quite specific, so how it's coded can be left out of the standardization for now. It was annoying enough to make me defer data layer refactoring before. However, it should have the same transform_param interface, this->layer_param_.transform_param().has_mean_file(), since there's no reason for the user to know the difference when defining their model.

const int size = datum.channels() * datum.height() * datum.width();
if (data_mean_.count() < size) {
data_mean_.Reshape(1, datum.channels(), datum.height(), datum.width());
LOG(INFO) << "Transform without mean";
Contributor:

I don't think I understand what's going on here -- why reshape the mean inside Transform?

Contributor (Author):

Just to cover the case where there was no mean_file in the parameters, and therefore the size was not known a priori.

Sergio


Contributor:

But Reshape doesn't actually imresize the mean; it will just zero it out, no? When might you want this behavior?
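
To spell out the behavior under discussion, a sketch of the lazy no-mean path (this assumes, per the exchange above, that a freshly reshaped blob reads back as zeros on first access):

// Sketch: if no mean_file was configured, size the mean blob to match the
// datum on first use. Reshape does not imresize anything; the new memory
// reads back as zeros, so subtracting this "mean" is effectively a no-op.
const int size = datum.channels() * datum.height() * datum.width();
if (data_mean_.count() < size) {
  data_mean_.Reshape(1, datum.channels(), datum.height(), datum.width());
  LOG(INFO) << "Transform without mean";
}
const Dtype* mean = data_mean_.cpu_data();  // all zeros in the no-mean case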

@sguada changed the title from "Move data_mean into data_transformer" to "Refactor data_transform to allow datum, cv::Mat and Blob transformation" on Sep 19, 2014
@sguada (Contributor, Author) commented Sep 20, 2014

By the way, during my tests I verified that LMDB is 10-15% faster than LevelDB; however, in most situations it doesn't matter, since the computation takes longer than the prefetching.

@jeffdonahue (Contributor) commented

@sguada let me know when you're done rebasing and I'll review it.

@sguada (Contributor, Author) commented Sep 20, 2014

I think it's done; once Travis passes you can review it. Thanks.

Sergio

@jeffdonahue (Contributor) commented

Er, I actually meant to post that on the matcaffe PR, but I'll take a look at this one as well.

@bhack (Contributor) commented Sep 20, 2014

@sguada Do you plan to remove cropping and mirroring from this code in #569? If not, this code slows down Mat copying, especially for single-channel continuous Mats; using "at" addressing is the slowest access method. @shelhamer I think this suffers from the OS X OpenCV/CUDA clang issue.

@sguada (Contributor, Author) commented Sep 20, 2014

@bhack I think your optimization for continuous cv::Mat in #1068 could be incorporated here while still allowing cropping and mirroring. Or it could be done in two steps: first transform the cv::Mat into a Blob, then use the Blob-to-Blob Transform to do the rest (although a bit less efficient, it might be cleaner).
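
For context, a minimal sketch of the kind of fast copy under discussion (MatToBlobData is a hypothetical name, not #1068's actual code, and cropping/mirroring are omitted): for a continuous Mat, walking each row through a raw pointer avoids the per-element overhead of cv::Mat::at<>.

#include <opencv2/core/core.hpp>

// Hypothetical helper: copy an interleaved 8-bit cv::Mat into a blob's
// planar CxHxW layout using row pointers instead of cv::Mat::at<>.
template <typename Dtype>
void MatToBlobData(const cv::Mat& img, Dtype* blob_data) {
  const int channels = img.channels();
  const int height = img.rows;
  const int width = img.cols;
  for (int h = 0; h < height; ++h) {
    const uchar* row = img.ptr<uchar>(h);  // one pointer fetch per row
    for (int w = 0; w < width; ++w) {
      for (int c = 0; c < channels; ++c) {
        blob_data[(c * height + h) * width + w] =
            static_cast<Dtype>(row[w * channels + c]);
      }
    }
  }
}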

}
if (Caffe::mode() == Caffe::GPU) {
#ifndef CPU_ONLY
CUDA_CHECK(cudaEventElapsedTime(&elapsed_microseconds_, start_gpu_,
Contributor (Author):

cudaEventElapsedTime only measures milliseconds, so the result should be multiplied by 1000 to get microseconds.
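
Spelled out as a sketch (variable names assumed from the snippet above; stop_gpu_ is inferred):

// cudaEventElapsedTime reports a float number of milliseconds, so a
// microsecond figure needs an explicit conversion.
float elapsed_ms = 0.0f;
CUDA_CHECK(cudaEventSynchronize(stop_gpu_));
CUDA_CHECK(cudaEventElapsedTime(&elapsed_ms, start_gpu_, stop_gpu_));
elapsed_microseconds_ = elapsed_ms * 1000.0f;  // ms -> us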

Member:

Is this note resolved elsewhere?

@bhack (Contributor) commented Sep 20, 2014

@sguada this template seems interesting; it's not exactly the Blob case, but it could be adapted.

@BlGene (Contributor) commented Sep 21, 2014

Hi guys,

I added my transformations as OpenCV affine transforms. See here. This has the benefit of flexibly handling things like rotations that are hard to do otherwise. If something like this is mergeable in this context, I would be happy to clean it up a bit.

BR, max

@bhack (Contributor) commented Sep 21, 2014

Could be interesting, but we need to think about how to organize and add multiple transformations, and how to expose the different transformation parameters in the proto.

@BlGene (Contributor) commented Sep 21, 2014

I would very much like prefetch_data_ to be promoted to a vector (see here); maybe that's something that could be added here. In my case it's so that I can output images at different scales. Could that go here, or should I make it a separate pull request?

@bhack (Contributor) commented Sep 21, 2014

We need to handle the whole OpenCV transformation pipeline (when the user specifies a series of transformations) before the Mat is copied into the blob.

@sguada (Contributor, Author) commented Sep 21, 2014

@BlGene I think many transformations are easier to do with cv::Mat, and those could come later in another PR. I think we should settle on the basics and then add more transformations later.

Here are some thoughts about it:

  • For the current transformations (scale, mean subtraction, cropping and mirroring) we can use the current code for datum.
  • For new transformations (rotation, rescaling, ...) we could transform the datum into a cv::Mat, apply those transformations, and then transform the cv::Mat into a Blob (see the sketch below).
  • To have images at multiple scales, I would create another data layer that has multiple tops and could have a vector of prefetch_data. But mixing this with the current data layers would be a bit difficult.

@bhack do you want to commit an optimized function to copy a cv::Mat into a Blob<Dtype>? Or do you want me to extract that from your code? This could be a protected member of transform_data.
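
A rough sketch of the two-step route from the second bullet (all names hypothetical; assumes an 8-bit, non-float datum, and reuses the MatToBlobData sketch above):

#include <opencv2/core/core.hpp>
#include <opencv2/imgproc/imgproc.hpp>

// Hypothetical pipeline: Datum -> cv::Mat -> OpenCV transform -> blob data.
template <typename Dtype>
void TransformViaMat(const Datum& datum, float angle, Dtype* blob_data) {
  const int channels = datum.channels();
  const int height = datum.height();
  const int width = datum.width();
  // 1. Reorder the datum's planar CxHxW bytes into an interleaved HxWxC Mat.
  cv::Mat img(height, width, CV_8UC(channels));
  const std::string& data = datum.data();
  for (int c = 0; c < channels; ++c)
    for (int h = 0; h < height; ++h)
      for (int w = 0; w < width; ++w)
        img.ptr<uchar>(h)[w * channels + c] =
            static_cast<uchar>(data[(c * height + h) * width + w]);
  // 2. Apply an OpenCV-side transformation, e.g. a rotation about the center.
  cv::Mat rotated;
  cv::Mat rotation = cv::getRotationMatrix2D(
      cv::Point2f(width / 2.0f, height / 2.0f), angle, 1.0);
  cv::warpAffine(img, rotated, rotation, img.size());
  // 3. Copy the result into the blob's planar layout.
  MatToBlobData(rotated, blob_data);
}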

@shelhamer (Member) commented

@BlGene for multi-scale output / an image pyramid I took the approach of keeping prefetch_data_ the same and then assigning each scale in the pyramid its own top blob. That is, you can load the original image once, then transform it as many times as you like and feed it into multiple tops.
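
A sketch of that approach (hypothetical layer code, not from this PR; scales_ and original_img_ are illustrative, and MatToBlobData is the sketch from earlier):

#include <opencv2/imgproc/imgproc.hpp>

// Hypothetical forward body: one fetched image, one resized copy per scale,
// each written to its own top blob. prefetch_data_ itself stays single.
for (size_t i = 0; i < scales_.size(); ++i) {
  cv::Mat scaled;
  cv::resize(original_img_, scaled, cv::Size(), scales_[i], scales_[i]);
  MatToBlobData(scaled, (*top)[i]->mutable_cpu_data());
}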

@sguada (Contributor, Author) commented Oct 3, 2014

@shelhamer ready for merge

@shelhamer (Member) commented

Awesome, thanks Sergio! I made two tiny changes:

  • Include the mean value example as comments in the standard CaffeNet prototxt. As a follow-up we could include it as a stage sometime (but I think the solver proto needs a train_stage first).
  • Fix the unit8 -> uint8 spelling.

sguada added a commit that referenced this pull request Oct 4, 2014: Refactor data_transform to allow datum, cv::Mat and Blob transformation
@sguada merged commit 0ba046b into BVLC:dev on Oct 4, 2014
@sguada (Contributor, Author) commented Oct 4, 2014

@shelhamer thanks for the review and the changes.

We probably want to move to mean_value models and forget about the mean_file.
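
For illustration, per-channel mean values in a transform_param look like this in prototxt (the numbers are the usual CaffeNet BGR means, shown only as an example):

transform_param {
  crop_size: 227
  mirror: true
  # per-channel means (B, G, R) instead of a mean_file
  mean_value: 104
  mean_value: 117
  mean_value: 123
}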

@shelhamer (Member) commented

@sguada agreed about the migration to mean values.

@mlapin mentioned this pull request Oct 7, 2014
@shelhamer mentioned this pull request Oct 10, 2014
mitmul pushed a commit to mitmul/caffe that referenced this pull request Oct 11, 2014: Refactor data_transform to allow datum, cv::Mat and Blob transformation
RazvanRanca pushed a commit to RazvanRanca/caffe that referenced this pull request Nov 4, 2014: Refactor data_transform to allow datum, cv::Mat and Blob transformation
@Mingcong commented Nov 6, 2014

@sguada Could you give an example of feeding a cv::Mat into Caffe? I plan to use Caffe to process continuous streaming data, such as video.

@sguada deleted the move_data_mean branch on November 28, 2014.