All-in-one nets #734

Merged: 4 commits merged into BVLC:dev from all-in-one-net on Jul 29, 2014
Conversation

jeffdonahue (Contributor)

This PR lets you specify a phase, stage, and/or level in each of your layers to indicate whether or not those layers should be available. The most obvious application of this is to insert "phase: TRAIN" or "phase: TEST" into layers that are only used in one phase or the other, so that you need not repeat the common layers; e.g., a data layer which should have a different source at train vs. test time, or the accuracy layer, which can only be used in the test net (although #686 will fix this whenever someone reviews that ;).

The level parameter, an int, allows you to turn off different layers all at once (all those that have level < the net's setting), e.g. to do layerwise training. The stage parameter, a string suggested by @shelhamer (which generalizes the level parameter in a sense), allows you to create arbitrary groups of layers. I'm not sure I made the best decisions with this interface design, so feel free to make suggestions.

The train_net, test_net, train_net_param, and test_net_param should all work as before.
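
For illustration, a minimal sketch of a single net definition carrying both data layers, each tagged with a phase (the source paths are hypothetical, and this early per-layer phase field is replaced by include/exclude rules later in this PR):

```
layers {
  name: "mnist"
  type: DATA
  top: "data"
  top: "label"
  data_param {
    source: "mnist-train-leveldb"  # hypothetical path
    batch_size: 64
  }
  phase: TRAIN  # layer only present in the train net
}
layers {
  name: "mnist"
  type: DATA
  top: "data"
  top: "label"
  data_param {
    source: "mnist-test-leveldb"  # hypothetical path
    batch_size: 100
  }
  phase: TEST  # layer only present in the test net
}
```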

@jeffdonahue (Contributor, Author)

Whoops, I screwed up by overwriting deploy prototxts (at least mnist.prototxt, which I combined mnist_{train,test}.prototxt into) -- will fix that later. Which reminds me, I haven't come up with a good solution for deploy prototxts (which is why they weren't also merged with the train/test nets). So this can be considered WIP.

@shelhamer (Member)

Oh sweet! Sorry I dithered on this for so long. I'll review soon and think about deploy nets.

One could add optional input fields for phase/level/stage.


@shelhamer (Member)

Solves #57 and completes the model definition improvements for 1.0.


@jeffdonahue (Contributor, Author)

In the last four commits, I refactored this to make it somewhat more flexible. I created new proto messages NetState and NetStateRule. NetState contains a phase, level, and any number of stages, and NetStateRule has parallel fields (phase, min_level, max_level, stages) specifying rules on NetState's fields. Each NetStateRule can be thought of as a conjunction (logical AND): ALL of its fields must hold for the NetState for the rule to pass. For disjunctions (logical OR), one specifies multiple NetStateRules, which is possible because each layer now has a repeated NetStateRule enable and a repeated NetStateRule disable.

In any particular layer, you can specify only one of enable or disable. If neither is specified, the layer is always enabled. If one or more enable rules are specified, the layer is disabled by default and enabled only if the NetState meets at least one of the rules. If one or more disable rules are specified, the layer is enabled by default and disabled only if the NetState meets at least one of the rules.
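
A sketch of those semantics (field names as described in this comment; enable/disable are renamed include/exclude before merge, and the stage string here is made up):

```
layers {
  name: "accuracy"
  type: ACCURACY
  bottom: "fc8"
  bottom: "label"
  top: "accuracy"
  # Rules are ORed together; the fields within one rule are ANDed.
  enable { phase: TEST min_level: 1 }  # TEST phase AND net level >= 1
  enable { stages: "check" }           # OR a NetState carrying stage "check"
}
```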

To handle deploy nets, I also added a new INPUT layer type, which should act exactly like an actual net input*, and an extra NetState field, bool solver. The default value of solver is false, but the Solver itself will pass a NetState with solver = true. This way, outside code using the Net class in deployment environments doesn't need to be altered to pass a special parameter to ignore solver-specific layers; rather, the solver itself is the "special case" and has the burden of passing in the special flag. (Note, however, that the solver can still be run on existing train/test prototxts; it might only become necessary to add enable: { solver: true } (or false) to a layer if you want to write a net prototxt that works for train, test, AND deploy.)

*This might not be true if the existing inputs are handled specially somewhere outside of Net::Init, e.g. in the Python wrapper. Are they?
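
A sketch of that flag as described here (note this design is dropped a few comments below):

```
layers {
  name: "loss"
  type: SOFTMAX_LOSS
  bottom: "fc8"
  bottom: "label"
  # Enabled only when the Net is constructed by the Solver, which passes
  # a NetState with solver = true; deploy-time Net users get the default false.
  enable { solver: true }
}
```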

@jeffdonahue (Contributor, Author)

The only other thing I'd like to do before this gets merged (modulo any revisions due to reviewer comments) is add a few NetTests.

@jeffdonahue (Contributor, Author)

Oh right... the solver: true thing doesn't really work either, because there are a bunch of uses of Net elsewhere in the tools that don't use the solver but still rely on the data being provided by the network (e.g. test_net, extract_features, etc.). Obviously I could go through and pass solver: true from all those scripts as well, but that seems a bit hacky.

Maybe what I'll end up doing is removing the bool solver from NetState, thereby requiring users to pass a "stage" if they want to distinguish between "solving time" and "deployment time". Since people may rely on the ability to run the existing deploy nets without any changes, I would also revert the existing deploy nets and leave them as is.
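
For instance (hypothetical stage name), a custom stage could mark solver-only layers, with each tool then responsible for passing that stage when it builds the net:

```
layers {
  name: "mnist"
  type: DATA
  top: "data"
  top: "label"
  # Hypothetical: enabled only when the caller's NetState includes stage "solve".
  enable { stages: "solve" }
}
```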

@shelhamer self-assigned this Jul 29, 2014
@jeffdonahue (Contributor, Author)

I revised this to exclude the INPUT layer and to remove the "solver" parameter from the NetState(Rule). This is good to go as far as I know, but it only (conveniently) solves part of the overall net consolidation issue: the need to create separate train and test nets. Technically one could use the custom "stages" provided by this PR to further differentiate "solver" vs. "deploy" nets, but that would require every existing tool to correctly set the stage.

One possible solution would be to do something along the lines of @sguada's suggestion in #57, with an "include_net" in the proto. Then you'd make three separate files: a "main net" file that has conv1-fc8; a "solver net" file that includes the main net and also has leveldb layers and loss/accuracy; and a "deploy net" file that also includes the main net and has inputs and a softmax prediction output. That solves the problem of redundancy among files, but doesn't eliminate the annoyance of having to work with many files. Still better than nothing, though. But I won't do that in this PR; it's an orthogonal change, and this PR is still useful without it (especially for those of us whose primary caffe workflow is to train/test a net for a minute, check the accuracy, and throw it away).
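
A rough sketch of that (not implemented in this PR) layout, with hypothetical file names and syntax:

```
# solver_net.prototxt -- hypothetical include_net syntax per @sguada's #57 suggestion
include_net: "main_net.prototxt"  # shared conv1-fc8 definition
layers {
  name: "mnist"
  type: DATA
  top: "data"
  top: "label"
}
layers {
  name: "loss"
  type: SOFTMAX_LOSS
  bottom: "fc8"
  bottom: "label"
}
```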

@jeffdonahue (Contributor, Author)

btw, I'd slightly prefer this one be reviewed/merged before #686, despite the ordering of the PRs. (Also, this is easily the 'safer' of the two changes, in that it's pretty much an optional proto thing.) But it's probably just 5-10 minutes' worth of extra rebase conflict resolution work if you'd prefer the opposite, Evan, so no big deal either way.

@shelhamer (Member)

I'll review this first. Thanks for letting me know your preference.


@shelhamer (Member)

re: #734 (comment) this is a nice step in combining the definitions, so let's review and merge as-is.

To follow up after this PR, one of us should:

  • bring back the INPUT layer you introduced, as I will be happy to finally do away with the oddball input fields and make everything a layer
  • make the tools understand deploy nets

<< "Neither train_net nor train_net_param were specified.";
LOG(INFO) << "Creating training net from file: " << param_.train_net();
net_.reset(new Net<Dtype>(param_.train_net()));
LOG(INFO) << "Creating training net specified in train_net_param.";
@shelhamer (Member)
Although the logging is precise, the two cases for the logic are just inline net_param prototxt vs. net file, and the (train_)net_* cases could be combined for conciseness. Do what you like. I take it back: with all the possible nets there could be when combining definitions, it's best to be clear.

@shelhamer (Member)

Nice schema in 1cf797825528cb39a8dc43cbe0e9ef592c610939 -- thanks for the good inline documentation of the Net/NetState/NetStateRule fields.

@shelhamer (Member)

Ok this all looks good to me. Decide if you want to change any names then merge.

Thanks Jeff! This does make the workflow of defining and training a whole host of nets much neater.

@jeffdonahue (Contributor, Author)

Thanks for the thorough review, Evan! I changed the name of FilterParam to FilterNet and enable/disable to include/exclude, as you suggested online & offline. I also added a bunch of unit tests, which sadly made this PR a net increase in lines of code :(
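
With the final names, the earlier sketches become (a sketch of the merged interface; paths hypothetical):

```
layers {
  name: "mnist"
  type: DATA
  top: "data"
  top: "label"
  data_param {
    source: "mnist-train-leveldb"  # hypothetical path
    batch_size: 64
  }
  include { phase: TRAIN }
}
layers {
  name: "accuracy"
  type: ACCURACY
  bottom: "fc8"
  bottom: "label"
  top: "accuracy"
  include { phase: TEST }
}
```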

Will merge after Travis.

jeffdonahue added a commit that referenced this pull request Jul 29, 2014
@jeffdonahue merged commit 16b7b25 into BVLC:dev Jul 29, 2014
@jeffdonahue deleted the all-in-one-net branch July 29, 2014 20:55
@shelhamer mentioned this pull request Jul 30, 2014
@shelhamer mentioned this pull request Aug 7, 2014
mitmul pushed a commit to mitmul/caffe that referenced this pull request Sep 30, 2014
RazvanRanca pushed a commit to RazvanRanca/caffe that referenced this pull request Nov 4, 2014
@ih4cku mentioned this pull request May 19, 2016