Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Caffemodel snapshots with shared weights don't have multiple copies #2946

Closed
wants to merge 51 commits into from
Closed

Caffemodel snapshots with shared weights don't have multiple copies #2946

wants to merge 51 commits into from

Conversation

danielgordon10
Copy link

The owners of the shared weights keep the parameters around, but the non-owners don't keep additional copies.

@ronghanghu
Copy link
Member

Personally I believe this is exactly the desired behavior: the non-owners shouldn't keep additional copies.

@danielgordon10
Copy link
Author

They do keep additional copies. This is fixing that.

@jeffdonahue
Copy link
Contributor

This should have been done in #2836; is there something this does that that didn't do?

@danielgordon10
Copy link
Author

I didn't realize HDF5 was the new default. This doesn't do anything more, but it does equivalently what's done for the HDF5s.

@jeffdonahue
Copy link
Contributor

Ah, sorry, I didn't realize #2836 didn't make the analogous change for proto snapshots.

I'm not sure about adding an is_owner argument to the Blob::ToProto method -- the concept of owned vs. shared blobs is a property of the Net, so is_owner seems overly specific for the Blob class. If anything, I think the argument should just be write_data (since there's already a write_diff). But I'd prefer we not add an argument and just leave the logic in Net, changing Net::ToProto to work the same way as Net::ToHDF5. (Ideally more of the common logic would be factored out from both methods.)

@danielgordon10
Copy link
Author

Those are good comments, but I'm not sure I agree. For one, I think is_owner is more a property of the layer than of the net itself. I would agree that the blobs should not be responsible for the property. However, if we rename is_owner to write_data, it's pretty obvious to me that write_data and write_diff are nearly identical in form and function. Thus, I think they should be treated equivalently.

Secondly, each object should be responsible for the logic of serializing itself. To me that means layer should understand the concept of ownership, and should relay that information to the blobs. Pulling the logic into net would be a mistake that would cause the layer to not be able to encapsulate all of its desired functionality.

If anything, I would say it would be more clear to remove the param_owners_ property from net and to put it into each layer individually.

@longjon
Copy link
Contributor

longjon commented Aug 21, 2015

Thanks for opening this PR; I've just run into the same issue.

I do agree with @jeffdonahue here, in particular:

  • It's a bit odd to have a function with an option to (more or less) do nothing; is there in fact a case where we should write diff but not data?
  • Layers are functions, params are free variables over which optimization is performed. Having layers own params is probably a mistake in the first place; see Treat bottoms and params more uniformly? #1474. Note, e.g., that the Solver updates params directly without going through any Layer calls. Layers probably should not be thought of as "owning" their params, class membership notwithstanding.
  • One way or another, the ToProto code must work in an analogous way to the ToHDF5 code.

Also, there seems to be a long irrelevant commit history here... please do squash your changes into a single commit (which should be fine here, although a few logically separated commits is also fine).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants