Allow self-contained development of Caffe layers #1896

Open
hannes-brt opened this issue Feb 18, 2015 · 8 comments

@hannes-brt

Judging from the "Development" page on the wiki, it seems that the only way to implement new Caffe layers is to fork the entire Caffe repository and implement the layer in the same manner as the standard layers. Only the actual implementation of the layer is self-contained in its own file, while the declarations go into the standard header files and the layer registration goes into the caffe.proto file which involves finding an unused ID for the layer.

As far as I can see there are several problems with this approach:

  • Researchers routinely develop new kinds of network layers and with the current approach they need to keep all of their custom layers in the same repository. Often a developer will work on several new layers simultaneously and it would make development of several layers much easier if they could be developed in a self-contained manner in separate repositories.
  • Developers who implement new types of layers will want to publish them on their own terms or often not publish them at all. If it is necessary to fork Caffe for every addition, then developers will have to always publish a complete fork, which might lead to a proliferation of Caffe forks.
  • When a developer of custom layers makes a fork of Caffe they will have to continuously merge new commits back into their own fork. When people publish their own forks and don't continuously merge new code, then there will be many Caffe forks that are out of sync and incompatible with each other. Most obviously, the official Caffe project might assign the protobuffer ID a developer has picked for their layer to a different layer which will create a continuous series of merge conflicts for the layer developer.
  • A user who would like to use custom layers created by different developers will have to integrate all of those layers into a single Caffe fork themselves which will routinely create more merge conflicts that the user has to resolve.

Suggestions:
I think it should be possible to design a system in which developers create new layers completely separately from Caffe, compile them separately, and then register each layer in a configuration file to make Caffe aware of its existence. Ideally, this would not require recompiling Caffe every time a new layer is added.
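For illustration, the "register a layer by name" part of this suggestion can be sketched as a small registry. This is a hedged sketch in Python rather than Caffe's C++, and every name in it (`LAYER_REGISTRY`, `register_layer`, `create_layer`, `Scale2x`) is hypothetical, not an existing Caffe API.

```python
from typing import Callable, Dict

# Hypothetical registry mapping a layer type name to the class implementing it.
LAYER_REGISTRY: Dict[str, type] = {}


def register_layer(name: str) -> Callable[[type], type]:
    """Class decorator that records a layer class under a string name."""
    def decorator(cls: type) -> type:
        if name in LAYER_REGISTRY:
            raise ValueError(f"layer {name!r} is already registered")
        LAYER_REGISTRY[name] = cls
        return cls
    return decorator


def create_layer(name: str, **params):
    """Look up a registered layer by name and instantiate it."""
    try:
        cls = LAYER_REGISTRY[name]
    except KeyError:
        raise KeyError(f"unknown layer type {name!r}") from None
    return cls(**params)


# A layer defined in its own file only needs the decorator to become visible.
@register_layer("Scale2x")
class Scale2x:
    """Toy layer that doubles every input value."""
    def forward(self, values):
        return [2 * v for v in values]
```

With something like this, the core only ever calls `create_layer("Scale2x")` and never needs a central list of layer classes to be edited by hand.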
An obvious issue that will have to be solved is that caffe.proto must dynamically issue layer IDs based on availability. Possibly this could be solved by generating caffe.proto dynamically from a script and a templating engine to issue dynamic layer IDs. Using this approach a developer would create a short file with all the content that needs to go into caffe.proto, register this file with Caffe and then Caffe will use a script to stitch all those snippets together and issue a unique ID to every registered layer.
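The stitching step described above could look roughly like the following. This is only a sketch under assumptions: the snippet format (a dict with `message_type`, `field_name`, and `message_body`), the function name `stitch_layer_fields`, and the starting ID are all made up for illustration, and the output is a fragment rather than a complete caffe.proto.

```python
from typing import Dict, List


def stitch_layer_fields(snippets: List[Dict[str, str]], first_free_id: int) -> str:
    """Assign consecutive field IDs to per-layer snippets and emit proto text.

    Each snippet is a hypothetical dict with three keys:
      message_type  - name of the layer's parameter message
      field_name    - name of the field that would hold it in the layer message
      message_body  - the full `message ... { ... }` definition as text
    """
    names = [s["field_name"] for s in snippets]
    if len(set(names)) != len(names):
        raise ValueError("duplicate field names in snippets")

    fields, messages = [], []
    # Sort by field name so the same set of snippets always gets the same IDs.
    ordered = sorted(snippets, key=lambda s: s["field_name"])
    for field_id, snip in enumerate(ordered, start=first_free_id):
        fields.append(
            f"  optional {snip['message_type']} {snip['field_name']} = {field_id};"
        )
        messages.append(snip["message_body"].strip())

    header = "// generated: fields to place inside the layer message"
    return "\n".join([header] + fields + [""] + messages) + "\n"


example = stitch_layer_fields(
    [
        {
            "message_type": "ZetaParameter",
            "field_name": "zeta_param",
            "message_body": "message ZetaParameter { optional float gain = 1; }",
        },
        {
            "message_type": "AlphaParameter",
            "field_name": "alpha_param",
            "message_body": "message AlphaParameter { optional int32 size = 1; }",
        },
    ],
    first_free_id=1000,
)
print(example)
```

One caveat of this sketch: the IDs depend on which snippets are registered, so adding a layer can renumber the others. Since protobuf identifies fields by number in its binary format, files serialized under the old numbering would then be read incorrectly.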

I think separating development of layers from Caffe itself will give a big boost to the community and spur innovation around Caffe and deep learning. It will allow people to develop new layers (and solvers) without needing to deal with the rest of the Caffe code base, then distribute them on their own terms and receive proper credit for their work. Caffe could publish an index of community-developed modules on its website, allowing users to easily find modules for their needs and giving developers a venue to showcase their work.

@bhack
Contributor

bhack commented Feb 18, 2015

Take a look at #1849 (comment)

@shelhamer
Member

Thanks for the reasoned and detailed post @hannes-brt. We have witnessed the same problems, and @jeffdonahue actually has a proof-of-concept arrangement that decentralizes layer development and assigns layer message IDs through hashing. This is done automatically, so one can simply collect layers and then run make.
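To make the hashing idea concrete, here is a minimal sketch. It is not @jeffdonahue's proof-of-concept, which is not shown in this thread. The function names, the choice of SHA-1, and the `min_id` floor of 1000 are all assumptions for illustration. The only outside facts used are protobuf's field-number limits: the maximum is 2^29 - 1, and 19000 through 19999 are reserved.

```python
import hashlib
from typing import Dict, Iterable

MAX_FIELD_NUMBER = 2**29 - 1      # largest field number protobuf allows
RESERVED = range(19000, 20000)    # reserved for the protobuf implementation


def layer_field_id(layer_name: str, min_id: int = 1000) -> int:
    """Derive a stable field number from a layer name.

    `min_id` is a hypothetical floor that leaves low numbers free for
    hand-assigned fields.
    """
    digest = hashlib.sha1(layer_name.encode("utf-8")).digest()
    value = int.from_bytes(digest[:8], "big")
    span = MAX_FIELD_NUMBER - min_id + 1
    field_id = min_id + value % span
    if field_id in RESERVED:
        # Shift past the reserved block; the result stays far below the maximum.
        field_id += len(RESERVED)
    return field_id


def assign_field_ids(layer_names: Iterable[str]) -> Dict[str, int]:
    """Map each layer name to its hashed ID, failing loudly on a collision."""
    ids: Dict[str, int] = {}
    owners: Dict[int, str] = {}
    for name in layer_names:
        fid = layer_field_id(name)
        if fid in owners and owners[fid] != name:
            raise ValueError(f"ID collision: {name!r} and {owners[fid]!r} -> {fid}")
        owners[fid] = name
        ids[name] = fid
    return ids
```

Because the ID depends only on the layer's name, two developers who never coordinate still get the same number for the same layer, and adding a layer does not renumber existing ones. Collisions are unlikely but possible, which is why the sketch checks for them instead of assuming uniqueness.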

I agree that reducing the friction for layer development is key to keeping up the pace of progress.

> It will allow people to develop new layers (and solvers) without needing to deal with the rest of the Caffe code base, then distribute them on their own terms and receive proper credit for their work.

Decoupling different parts of development is certainly good, but I'm not sure I follow your point about credit. Version control keeps attribution, and by contributing to a framework I think authors receive the credit they deserve by having their work reach an established community. Of course, everyone should still package and release their own work as they see fit!

@bhack
Contributor

bhack commented Feb 20, 2015

I'm really not sure that every layer can be encapsulated. How will we handle layers that require infrastructural changes? With this distributed vision, the structural PRs that such layers depend on would probably never get accepted. See, for example, the filter layer, whose current design was driven by BVLC feedback in #1482.

@hannes-brt
Author

@shelhamer I think both models can exist simultaneously. Distributing a Caffe layer, solver, etc. outside of the official distribution allows an author to choose the license themselves and allows the work to be cited separately. On the other hand, inclusion in the official distribution is the easiest way to reach the most users. Different authors may have different needs and preferences. More importantly, I think deep learning is becoming more diverse, and there will be layers and components for specialized tasks that do not necessarily need to be included in the official distribution.

@shelhamer
Member

@hannes-brt agreed! Decoupling development to this degree only adds freedom and convenience.

@bhack it's true that reducing coupling doesn't make development independent as this is still a framework. All the same it's better to do away with all the bookkeeping we can -- like having to assign layer message IDs.

@matthieudelaro

I also think that there should be a "Caffe Layer Zoo". It's becoming really annoying to develop layers without being able to share them easily, and to spend 1h+ each time I want to use layers developed in various forks of Caffe. This is the one thing that makes me consider other frameworks.

According to #1167 and #1270, it seems like layers are separated enough now, so I'm considering working on a layer zoo. Is there an existing implementation effort I could contribute to?

@shelhamer
Member

See #5294 for one approach to this.

@wadefelix

I implemented one by first inserting the module-defined proto into caffe.proto, without using param_str.
