
Refactor LayerParameter into per-layer Parameter messages #219

Merged
shelhamer merged 60 commits into BVLC:dev on Mar 28, 2014

Conversation

jeffdonahue
Contributor

Addresses #208. This changes the way net protos are specified. Before:

layers {
  layer {
    name: "conv1"
    type: "conv"
    num_output: 96
    kernelsize: 11
    stride: 4
    weight_filler {
      type: "gaussian"
      std: 0.01
    }
    bias_filler {
      type: "constant"
      value: 0.
    }
    blobs_lr: 1.
    blobs_lr: 2.
    weight_decay: 1.
    weight_decay: 0.
  }
  bottom: "data"
  top: "conv1"
}

After:

layers {
  name: "conv1"
  type: CONVOLUTION
  bottom: "data"
  top: "conv1"
  blobs_lr: 1
  blobs_lr: 2
  weight_decay: 1
  weight_decay: 0
  convolution_param {
    num_output: 96
    kernel_size: 11
    stride: 4
    weight_filler {
      type: "gaussian"
      std: 0.01
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
}

The most notable change is that there is now a message, e.g. convolution_param, for each layer type that has its own parameters. I also turned the layer types from a string into an enum to catch spelling errors etc. at compile time.
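For illustration, here is a rough sketch (not code from this PR; the helper function is made up) of how a layer might read its typed parameters through the generated protobuf accessors, using the field names from the example above:

#include <glog/logging.h>
#include "caffe/proto/caffe.pb.h"

// Hypothetical helper: reads conv settings from the new per-layer message.
void SetUpConvolution(const caffe::LayerParameter& param) {
  CHECK_EQ(param.type(), caffe::LayerParameter_LayerType_CONVOLUTION);
  const caffe::ConvolutionParameter& conv = param.convolution_param();
  LOG(INFO) << "conv layer " << param.name() << ": "
            << conv.num_output() << " outputs, kernel "
            << conv.kernel_size() << ", stride " << conv.stride();
  // ... allocate the weight/bias blobs and apply the fillers ...
}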

This should be fully backward compatible with old model protos - the Net::Net(const string& param_file) constructor will attempt to load the param_file using the new proto specification, but if that fails will instead try to load it with the old ("v0") proto specification and then convert it to the new format at runtime, printing a warning that you should convert your nets to the new format.
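To make the fallback concrete, a rough sketch under assumed names (ParseV0AndUpgrade is a placeholder for the v0 parse-and-convert path, not necessarily the real identifier):

#include <string>

#include <glog/logging.h>
#include <google/protobuf/text_format.h>

#include "caffe/proto/caffe.pb.h"

// Placeholder for the v0 path described above (parse with the old schema,
// then convert to the new format); the real function names may differ.
bool ParseV0AndUpgrade(const std::string& text, caffe::NetParameter* param);

// Sketch of the fallback: try the new schema first, then the v0 schema.
bool LoadNetParam(const std::string& text, caffe::NetParameter* param) {
  if (google::protobuf::TextFormat::ParseFromString(text, param)) {
    return true;  // already in the new format
  }
  LOG(WARNING) << "Could not parse as a new-format NetParameter; trying the "
               << "deprecated v0 format. Please upgrade your net with "
               << "tools/upgrade_net_proto.";
  param->Clear();
  return ParseV0AndUpgrade(text, param);
}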

There is also a script, tools/upgrade_net_proto.cpp, which converts a v0-formatted proto to the new format and saves the result to a file.

Both the converter script and the Net constructor first pass the v0-formatted proto through a function UpgradeV0Padding that turns padding layers into pad-aware conv layers (assuming their output feeds into conv layers; it errors if not), and then through another function that upgrades to the new proto format.
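The padding fold-in itself can be illustrated with a simplified standalone version that uses plain structs instead of the real v0 proto messages (whose exact fields aren't shown here); the real code errors out via LOG(FATAL) rather than throwing:

#include <map>
#include <stdexcept>
#include <string>
#include <vector>

struct SimpleLayer {
  std::string name, type, bottom, top;
  int pad = 0;  // set on "padding" (source) and "conv" (destination) layers
};

// Copy every non-padding layer through; when a padding layer is found,
// remember its pad value, drop it, and attach the pad to the consuming conv.
std::vector<SimpleLayer> FoldPaddingIntoConv(const std::vector<SimpleLayer>& in) {
  std::map<std::string, int> pad_of_blob;     // padded blob name -> pad amount
  std::map<std::string, std::string> source;  // padded blob -> original blob
  std::vector<SimpleLayer> out;
  for (const SimpleLayer& layer : in) {
    if (layer.type == "padding") {
      pad_of_blob[layer.top] = layer.pad;
      source[layer.top] = layer.bottom;
      continue;  // the padding layer itself is removed
    }
    SimpleLayer copy = layer;
    if (pad_of_blob.count(layer.bottom)) {
      if (layer.type != "conv") {
        throw std::runtime_error("padding layer not followed by a conv layer");
      }
      copy.pad = pad_of_blob[layer.bottom];
      copy.bottom = source[layer.bottom];  // bypass the removed padding layer
    }
    out.push_back(copy);
  }
  return out;
}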

@shelhamer
Member

This is awesome! I like that 0def06d kicks it off with style.

This also addresses #170.

@sguada could you try the upgrade tool on some of your models, just to double-check?

@sguada
Contributor

sguada commented Mar 17, 2014

@jeffdonahue great job!

Although so far, in a clean directory, it doesn't compile:
src/caffe/layers/hdf5_data_layer.cpp:80:23: error: ‘batchsize’ was not declared in this scope

I guess that's due to the changes: other layers that assumed some params were available in LayerParameter no longer have them.

I will look into the code with more care, once I can compile it.

@jeffdonahue
Contributor Author

Sorry about that, forgot to commit the last set of files I had to change after rebasing to get this to compile. It should compile with the commit I just pushed.

Other than @sguada and anyone else making sure this works with their existing models, I still want to try a couple things out before we consider merging this - e.g. make sure finetuning an old model still works.

@sguada
Contributor

sguada commented Mar 17, 2014

Great, @jeffdonahue, now the code compiles and passes all tests except the hdf5_data_layer one:

[  FAILED  ] 2 tests, listed below:
[  FAILED  ] HDF5DataLayerTest/0.TestRead, where TypeParam = float
[  FAILED  ] HDF5DataLayerTest/1.TestRead, where TypeParam = double

@shelhamer I think with this one and #209 we should merge them first into a different branch, and make sure everything works well before merging into dev.

@jeffdonahue
Contributor Author

Whoops, the HDF5 test failure was a merge conflict problem, I think. It passes with the latest commit.

@shelhamer
Member

@sguada I agree it is important to test how major PRs combine. You can check out a PR as a branch for testing with hub. You can then do merges and such with other branches to see how they integrate.

Once we have tried the joint merges ourselves I think we can merge to dev and continue to test before the release to master.

@sguada
Contributor

sguada commented Mar 17, 2014

Totally agree, @shelhamer. I'm using hub, as you showed me before, to do that.

What I meant is that we usually test PRs in isolation, and that is fine, but these two big ones will affect any other PR on the way. So we can either merge the other PRs first and then adjust #219 and #209, or we will have to adjust all other pending PRs to the new formats.

@shelhamer
Member

I vote for these core changes #219 and #209 going in first and other PRs rebasing on them. Individually they should have only a little work to do.

@jeffdonahue, that should be less work for you so we can do our testing then merge this.


@jeffdonahue
Contributor Author

Works for me (obviously) but am also happy to rebase and tweak on any other PRs you want to merge before.

@shelhamer
Member

The back-merge of historical PRs to master into dev caused some conflicts (most likely in caffe.proto I'd imagine).

@jeffdonahue
Contributor Author

Understood if we're still waiting to merge, but FYI this and #209 (forward pass loss) are now rebased on dev.

} else {
LOG(FATAL) << "Unknown layer name: " << type;
case LayerParameter_LayerType_NONE:
LOG(FATAL) << "Layer " << name << " has unspecified type.";
Contributor

break;

Contributor Author

Should be fine, LOG(FATAL) causes a crash.
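For context, the pattern in question looks roughly like this sketch (abridged, not the actual factory code); LOG(FATAL) aborts the process, so the case cannot fall through and a trailing break would be dead code:

#include <glog/logging.h>
#include "caffe/proto/caffe.pb.h"

// Sketch of the layer-factory dispatch discussed above (names abridged).
void CheckLayerType(const caffe::LayerParameter& param) {
  switch (param.type()) {
    case caffe::LayerParameter_LayerType_NONE:
      LOG(FATAL) << "Layer " << param.name() << " has unspecified type.";
      // no break: LOG(FATAL) aborts before control could fall through
    case caffe::LayerParameter_LayerType_CONVOLUTION:
      // ... construct the corresponding layer ...
      break;
    default:
      LOG(FATAL) << "Unknown layer type: " << param.type();
  }
}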

@sergeyk
Contributor

sergeyk commented Mar 20, 2014

Looks great, I think we should go ahead and merge.
One thing: we should have a documentation page explaining the layer parameters.
Jeff, could you contribute the first such document, for convolution layers?
I'd put it in docs/layers.md; all layers will be briefly documented there.

@shelhamer
Member

#245's window data layer params will need to be added and the padding layer should be dropped, as discussed offline (Jeff's auto-conversion makes the deprecation unnecessary).

@sguada
Contributor

sguada commented Mar 21, 2014

@jeffdonahue please add name to blobs params

@jeffdonahue
Contributor Author

@sguada huh? I will rebase this, add window data layer, do a bit of testing (beyond the existing unit tests), and merge - hopefully all today.

@sguada
Contributor

sguada commented Mar 22, 2014

Sorry Jeff, I meant to say that Blobs could have a name as a parameter, which is defined by the layer. It would make them a bit easier to manage.

message BlobProto {
  optional int32 num = 1 [default = 0];
  optional int32 channels = 2 [default = 0];
  optional int32 height = 3 [default = 0];
  optional int32 width = 4 [default = 0];
  repeated float data = 5 [packed = true];
  repeated float diff = 6 [packed = true];
  optional string name = 7 [default = ""];
}
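For example, a layer could then look up a saved blob by name when copying weights (hypothetical sketch; FindBlobByName is made up and the name field only exists in this proposal):

#include <string>

#include "caffe/proto/caffe.pb.h"

// Hypothetical use of the proposed name field: find a saved blob by name
// instead of relying on its position within the layer.
const caffe::BlobProto* FindBlobByName(const caffe::LayerParameter& layer,
                                       const std::string& name) {
  for (int i = 0; i < layer.blobs_size(); ++i) {
    if (layer.blobs(i).name() == name) {
      return &layer.blobs(i);
    }
  }
  return NULL;  // no blob with that name
}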

@sguada
Contributor

sguada commented Mar 22, 2014

Does the upgrade_net_proto also upgrade the prototxt files? Or only the network params in the proto file?

cropsize: 227
}
name: "data"
type: IMAGE_DATA
Contributor

The IMAGE_DATA params are missing here.

@jeffdonahue
Contributor Author

@sguada great catch on the missing IMAGE_DATA params - I just added support for that layer type and I guess I missed its parameters. I'll fix that and step through every other layer type (again) to make sure I didn't miss anything else.

Aside from that, this actually needs a bit more work: I need to go through all the tools/*.cpp to make them all backwards compatible with V0 net param files by calling the Net constructor which takes a filename string. (So far I only changed train_net's entry point; from grepping ReadProtoFromTextFile and ReadProtoFromBinaryFile there are several more.)
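For reference, a sketch of what that change to a tool looks like (illustrative names only; the filename constructor is the one described at the top of this PR):

#include <string>

#include "caffe/net.hpp"
#include "caffe/proto/caffe.pb.h"
#include "caffe/util/io.hpp"

// Before (sketch): the tool parses the file itself, so only new-format
// protos load.
void RunToolOld(const std::string& net_proto_file) {
  caffe::NetParameter net_param;
  caffe::ReadProtoFromTextFile(net_proto_file.c_str(), &net_param);
  caffe::Net<float> net(net_param);
  // ... use net ...
}

// After (sketch): the filename constructor falls back to the v0 schema and
// upgrades it at load time, so old net protos keep working.
void RunToolNew(const std::string& net_proto_file) {
  caffe::Net<float> net(net_proto_file);
  // ... use net ...
}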

Does the upgrade_net_proto also upgrade the prototxt files? Or only the network params in the proto file.

Not sure what you mean by this - the changes to examples/feature_extraction/imagenet_val.prototxt and the other prototxts show what the upgrade_net_proto script does (they were all done automatically just by running upgrade_net_proto, except that the script doesn't retain comments, so I manually re-added a couple of those).

@sguada
Contributor

sguada commented Mar 22, 2014

@jeffdonahue, I think we should provide the script to transform the prototxts and snapshots, but I don't think we should change all the tools/*.cpp to be aware of the changes. A net should be able to be constructed given network params, although you can add the option to read it from a file.

Yes, I meant whether the changes to imagenet_val.prototxt and the others were automatic or manual.

@shelhamer
Member

@sguada I'm confused why you don't want to upgrade the tools. @jeffdonahue's new ReadProto* methods transparently load the old or new schema as needed and convert. One can still construct a net from a NetParameter otherwise loaded however you'd like.

I agree with Jeff's plan in #219 (comment) for finishing this.

@sguada
Contributor

sguada commented Mar 22, 2014

Hi Jeff,
I was thinking that conversion is done outside of the network so it doesn't need to know about it. So there are two options. Choose the one you think will be easier to maintain in the future.


@shelhamer
Member

Almost perfect–but there's some chaff to clean up:

1322fa3 adds symlinks to caffe model params and solver state and `examples/imagenet/finetune*.sh` scripts that I think are accidental. Please rewrite that commit so that these aren't versioned.

@shelhamer
Member

I've read over everything and tests pass so I'll merge as soon as those files are dropped.

Sweet work Jeff!

@jeffdonahue
Contributor Author

Whoops, great catch - got lazy and git added that whole directory and (apparently) didn't inspect the results closely. History rewritten to exclude those files. Thanks for reviewing and all your help along the way, Evan!

shelhamer added a commit that referenced this pull request Mar 28, 2014
Refactor LayerParameter into per-layer Parameter messages and add tools for seamless proto upgrade.
@shelhamer merged commit 06bef17 into BVLC:dev Mar 28, 2014
@shelhamer
Member

[image: layer cake]

Lovely layers.

@jeffdonahue deleted the refactor-layerparam-proto branch March 28, 2014 08:01