On-the-fly net resizing, without reallocation (where possible) #594
Conversation
Does this mean to partially resolve #557? The reshapable net still requires the images of a batch to be of the same size. To be more general, the involved blobs would have to reshape themselves for each image on demand. Therefore the reshaping should not be initiated by the Net or the Layer.
@kloudkl, I believe this is orthogonal to both #108 and #557. Let me clarify. #108 is a particular kind of layer, where reshaping is applied to data during a forward pass; this is about reshaping between passes, applied to entire nets. #557, if I understand correctly, is about data layers that read variously sized images from disk, but still produce fixed-size top blobs. This patch does not address that issue, and actually cannot be used with data layers. It's really meant for data supplied through the Python or matlab wrapper (or custom C++ code). One might imagine a data layer that produces variously sized top blobs, fed into a network with blobs without definite sizes. That should be a straightforward extension of this patch, but I won't do that right away.
@longjon thanks, it is a nice PR; you even cleaned up the code quite a bit, especially all the temporary data :) I will check it with the Matlab wrapper. #557 is only meant to allow different-size images as inputs, but data layers will still produce fixed-size blobs, so it is complementary to this PR. @kloudkl #108 is a layer, and therefore doesn't change sizes dynamically as this PR does.
```diff
@@ -272,6 +272,11 @@ struct CaffeNet {
     return output_blob_names;
   }

+  void reshape(int num, int channels, int height, int width) {
+    net_->input_blobs()[0]->Reshape(num, channels, height, width);
+  }
```
If there is more than one input blob, how could it be resized?
For the Python wrapper, I just implemented the common case where one wants to resize the first input blob (usually there is only one input blob, or the second blob is a fixed-size label). If people feel it should be included, I can implement the general case, or it can be done in a future PR.
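For illustration, a minimal sketch of what that general case might look like, in the same wrapper style as the diff above; `reshape_input`, its index parameter, and the bounds checks are my own assumptions, not part of this patch:

```cpp
// Hypothetical generalization (not in this PR): reshape any input blob
// by index instead of hard-coding input_blobs()[0].
void reshape_input(int index, int num, int channels, int height, int width) {
  CHECK_GE(index, 0) << "input blob index must be non-negative";
  CHECK_LT(index, static_cast<int>(net_->input_blobs().size()))
      << "net has no input blob with this index";
  net_->input_blobs()[index]->Reshape(num, channels, height, width);
}
```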
@longjon nice job on a long-wished-for feature. I'm a little uncomfortable about the restriction to a single (resizable) input blob; I'd vote for the general case now instead of later. Since Caffe does DAGs, I'd rather not have some state where some features are only supported for certain classes of models.
(@shelhamer) Here's my plan for this PR:
I expect/hope all that will happen by the end of next week. [ha!]
(Force-pushed from 4278286 to c01f07a.)
@longjon please resurrect and complete this as outlined in #594 (comment). This'll settle plenty of workarounds for varying inputs, aspect ratios, and sequences.
@shelhamer, yes, I'll rebase this soon and add the promised features, which I already use quite a bit. There is one design issue that has kept me from hastily pushing this forward, but I'm ready with a proposed solution, coming soon.
(Force-pushed from bfead64 to 4c259d0.)
Rebased! As mentioned above, I've also switched to an implementation where Reshape is supported for all layers. (Of course, layers with parameters will error out if you try to reshape to a size incompatible with their parameters.) The layer interface is changed slightly: to make use of net reshaping, one resizes the input blobs and then calls Net::Reshape; it is up to the user to keep the resulting shapes sensible. Reshaping can be done in pycaffe with the same interface as in C++. This is ready for (re-)review.
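For concreteness, a hedged sketch of the C++ usage this implies, following the Net::Reshape flow described in the PR summary below; the file names are placeholders, and the surrounding Net API calls are assumptions about caffe of this era rather than code from the patch:

```cpp
#include "caffe/net.hpp"

// Sketch: run a trained net at a new input size without rebuilding it.
// "deploy.prototxt" and "net.caffemodel" are placeholder paths.
void run_at_new_size() {
  caffe::Net<float> net("deploy.prototxt");
  net.CopyTrainedLayersFrom("net.caffemodel");
  // First resize the input blob manually ("breaking" the net momentarily)...
  net.input_blobs()[0]->Reshape(1, 3, 384, 512);
  // ...then call Reshape with no arguments to propagate the new shape
  // through every layer, bottom-up.
  net.Reshape();
  net.ForwardPrefilled();
}
```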
(Force-pushed from 4c259d0 to ec8a49c.)
This is in keeping with BVLC#742.
This allows nets to be reshaped very quickly (essentially for free) as long as sufficient memory has been allocated. Calling Blob::Reshape in order to free up memory becomes impossible; however, this is not a normal use case (and deleting blobs does free memory).
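A simplified sketch of the idea this commit message describes; the member names approximate caffe's Blob, and the body is my paraphrase rather than the patch itself:

```cpp
// Reshape only reallocates when the new element count exceeds what has
// already been allocated; shrinking reuses the old storage and never frees.
template <typename Dtype>
void Blob<Dtype>::Reshape(int num, int channels, int height, int width) {
  num_ = num; channels_ = channels; height_ = height; width_ = width;
  count_ = num_ * channels_ * height_ * width_;
  if (count_ > capacity_) {  // grow: allocate fresh, larger storage
    capacity_ = count_;
    data_.reset(new SyncedMemory(capacity_ * sizeof(Dtype)));
    diff_.reset(new SyncedMemory(capacity_ * sizeof(Dtype)));
  }
  // When count_ <= capacity_ this is effectively a no-op, so reshaping
  // an already-large-enough blob is essentially free.
}
```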
Note that calling Reshape when no reshape is necessary should be effectively a no-op, so this is not a performance regression.
This will make it possible to add reshaping to cuDNN layers.
Thanks Jon! This is not only a long-awaited improvement but a model PR, with an orderly and clear description and history.
It seems that this PR does not support the matlab wrapper, am I right? @longjon
As far as I know, support for manual reshaping of the input has not been added to the matlab wrapper, though it might or might not work fine with layers that produce tops of various sizes. @sguada, do you know the full story?
On-the-fly net resizing, without reallocation (where possible)
This PR allows nets to change their input sizes in-place, reusing allocated memory for blobs and buffers. This allows, for example, running the same net on inputs of different sizes without rebuilding it.

`Net` gets a new method `Reshape` that provides this functionality. One first resizes input blobs manually, then calls `Net::Reshape` with no arguments. (This has the unfortunate property that one has to momentarily "break" the net before calling `Reshape`, but it avoids the awkwardness of typing `Reshape` to accept a vector of 4-tuples.) `Net::Reshape` just calls `Layer::Reshape` for each layer in turn, bottom-up. Since reshaping doesn't make sense for all layers, and layers may constrain acceptable new sizes, `Layer::Reshape` is opt-in; only layers that implement it can be reshaped. (Neuron layers can all be reshaped, most of them in a trivial way, so an implementation of `NeuronLayer::Reshape` is provided.)

Note that reshaping is intended only for cases where the existing parameters can continue to be used without modification. Reshaping is not intended for use with data layers. This PR provides reshapability for essentially only the layers needed for a Krizhevsky-style net.
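To illustrate the opt-in pattern for the trivial neuron case, a sketch along these lines; the exact layer interface of this era is approximated, so treat the signature as an assumption rather than the PR's code:

```cpp
// Trivial reshape for a neuron layer: the top blob simply mirrors
// whatever new shape the bottom blob has taken on.
template <typename Dtype>
void NeuronLayer<Dtype>::Reshape(const vector<Blob<Dtype>*>& bottom,
                                 vector<Blob<Dtype>*>* top) {
  (*top)[0]->Reshape(bottom[0]->num(), bottom[0]->channels(),
                     bottom[0]->height(), bottom[0]->width());
}
```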
Many layers use internal buffers, which are sometimes implemented as `Blob`s, sometimes `shared_ptr<Blob>`s, and sometimes `SyncedMemory`s. This PR uniformizes some of these to be just `Blob`s in order to simplify implementation. As far as I can tell, no `shared_ptr`s were removed that needed to be `shared_ptr`s; someone let me know if this is not true (@jeffdonahue?)

This PR includes a simple implementation of part of #355 in 33959e5f9a4c4342ee797fca71d607d74f792483. Unlike #355, the implementation is entirely within `Blob::Reshape`, and does not touch `SyncedMemory`. The disadvantage is that it doesn't call `realloc` when enlarging blobs; this could be added in a later patch if desired. (This PR does not address sharing blobs between train and test nets.)

Although I haven't used it a lot yet, this PR is fully-baked: tests are included, it's usable from Python, it builds everything with `-Wall -Werror`, and passes tests and lint. There is a clean, linear history; if the reader feels overwhelmed by the changes, I suggest reading it one commit at a time (in commit order, not github order).