add resnet50 example #3266
Conversation
Not sure if that's still active, but #2793 is on ResNet as well. |
@giorgiop yes, I've checked the issue you mentioned, but I think these 2 scripts are not the same. My commit is exactly what Kaiming He published on his GitHub; actually, I have converted the pretrained caffemodel provided by Kaiming He to a Keras h5 file. Once I finish my tests, I'll update this script so that people can get a pre-trained resnet50 model directly from the Keras examples. |
This script gets a 10/10 score in PEP8 checks on my computer, but.....
In that case I think you should provide the Keras weights, and in your script demonstrate how to load the pre-trained weights and run inference on some images. It would be a great addition. |
One thing to consider would be to provide two versions of the weights file: one for Theano and one for TensorFlow, since they differ: https://github.com/fchollet/keras/wiki/Converting-convolution-kernels-from-Theano-to-TensorFlow-and-vice-versa |
@fchollet Ok, I'll do it as soon as possible~ |
@fchollet Hey, I have finished my test on this script and it works well; here is a screenshot from my IPython session. In the comments at the top of this script, I included the address where people can download the pretrained h5 file. This weight file is only for the TensorFlow backend for now. I tried to convert it to a Theano backend version but failed (the weights could be loaded, but the test result is incorrect). Maybe you can just merge this PR and I'll keep trying. And, since Gist is blocked by the Great Firewall in China, the converted weights were uploaded to the Baidu cloud drive. Maybe someone could download them from Baidu and then upload them to Gist. I think the Chinese words on the Baidu cloud drive are more or less annoying to those who speak English. Thank you! I'm going to fix the endless PEP8 problems... |
Can you clarify what you did and what went wrong? The only difference between Theano and TensorFlow is the fact that TensorFlow uses flipped kernels (because it does correlation, not convolution) in |
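The "flipped kernels" point above can be illustrated in plain NumPy. This is a hedged sketch, not the Keras API: flipping means reversing both spatial axes of a 2D kernel, and a correlation with the flipped kernel computes the same output as a convolution with the original kernel.

```python
import numpy as np

# Hypothetical helper for illustration (not a Keras function):
# "flipping" a 2D kernel reverses both spatial axes.
def flip_kernel(w):
    return w[::-1, ::-1]

k = np.arange(9).reshape(3, 3)
flipped = flip_kernel(k)
assert flipped[0, 0] == k[2, 2]                        # corners swap
assert np.array_equal(flip_kernel(flipped), k)         # flip is its own inverse
```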
Btw the link you provide does not work; the page says "Uh oh, the page you visited no longer exists." |
@fchollet Good afternoon, here's how I transfer tf weights to th weights:

```python
from keras import backend as K
from keras.utils.np_utils import convert_kernel
import h5py

f_th = h5py.File('thresnet50.h5', 'w')
f_tf = h5py.File('resnet50.h5', 'r')
for k in f_tf.keys():
    grp = f_th.create_group(k)  # create a group for each layer
    if k[:3] == 'res' or k[:4] == 'conv':  # which means it is a conv layer
        # for conv layers, call convert_kernel to transfer the weights to th
        grp.create_dataset('weights', data=convert_kernel(f_tf[k]['weights'][:]))
    else:
        grp.create_dataset('weights', data=f_tf[k]['weights'][:])  # else just keep it as-is
    grp.create_dataset('bias', data=f_tf[k]['bias'][:])  # store the bias term
f_th.close()
f_tf.close()
```

Basically the th weights are just a copy of the tf weights, with the only exception that for conv layers the weights are converted by convert_kernel. After this transformation, I switch the backend to Theano and load the transferred weights, but the prediction result is incorrect: both test images were predicted to be "n02443485 black-footed ferret, ferret, musterla nigrips". I have to admit that I didn't spend much time on transferring the weights; maybe I should be more careful. Perhaps there will be some good news when you wake up tomorrow. BTW, I can visit the link and download the weights normally. If it doesn't work for you, I'll upload it elsewhere; Microsoft OneDrive could be a good choice. Thank you~ |
Hi @fchollet, a piece of good news and a bad one. The bad news is that I used the code in https://github.com/fchollet/keras/wiki/Converting-convolution-kernels-from-Theano-to-TensorFlow-and-vice-versa and then used save_weights like this:

```python
from keras import backend as K
from keras.utils.np_utils import convert_kernel
import res_net50
import h5py

model = res_net50.get_resnet50()
model.load_weights('tf_resnet50.h5')
for layer in model.layers:
    if layer.__class__.__name__ in ['Convolution1D', 'Convolution2D']:
        original_w = K.get_value(layer.W)
        converted_w = convert_kernel(original_w)
        K.set_value(layer.W, converted_w)
model.save_weights('th_resnet50.h5')
```

This is the easiest solution I could figure out, but after running this script, switching the backend to Theano, and loading the 'th_resnet50.h5' we just generated, the test result is still not correct ('n02443485 black-footed ferret, ferret, musterla nigrips' for both test images). Perhaps the difference between th and tf is bigger than we expected, and I think this difference could be the source of many unexpected bugs. I've updated the links; now you can download the weights from Google Drive. |
@@ -0,0 +1,218 @@
'''This script demonstrates how to build the resnet50 architecture |
File should be renamed to resnet_50
General issues with the PR:
Specifically, for the docstring:
Also it would be best to understand why weights are not convertible. Every operation is unit-tested to yield the same result in both Theano and TensorFlow (see backend tests), modulo the weight conversion operation. It should be impossible for a combination of identical operations to yield different results. Most likely an issue with your conversion code. |
@fchollet Got it, I'll fix the problems you mentioned soon. BTW Google/Facebook/Twitter and a lot of other websites are not accessible in China because they are blocked by the hateful "Great Firewall". I know it is crazy, but it did happen. |
@MoyanZitto cool, thank you. Ideally we'd have a way to host model files that isn't Google drive or Baidu. Maybe AWS S3. |
@fchollet Fixed some issues, though I'm not sure whether there are still grammar mistakes in the script (really sorry for my limited English). If it's not too much trouble, you may modify this script as you like. I noticed that conv layers get the default dim_ordering from K.image_dim_ordering(), so I simply use K.set_image_dim_ordering('tf') to change the dim order; does that work? Although we can visit AWS S3 in China, the speed is very slow... so it's better to retain the Baidu drive link. |
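What the dim-ordering setting changes can be shown with plain NumPy arrays. A hedged illustration, with made-up shapes: the same batch of 224x224 RGB images is laid out differently under each ordering, and moving between layouts is a transpose of the same data.

```python
import numpy as np

# 'th' ordering: (batch, channels, rows, cols)
img_th = np.zeros((1, 3, 224, 224))
# 'tf' ordering: (batch, rows, cols, channels)
img_tf = np.zeros((1, 224, 224, 3))

# The two layouts hold identical data in a different axis order.
assert np.transpose(img_th, (0, 2, 3, 1)).shape == img_tf.shape
```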
return out
def conv_block(input_tensor, nb_filter, stage, block, kernel_size=3): |
No need to pass "stage" and "block". They are not used.
I'd rather see conv_block(input_tensor, kernel_size, filters, stride=2)
That's not enough. Image dim ordering is hard-coded in several places of your code, such as when you do merges or when you load input data. |
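The point about merges being dim-ordering dependent can be made concrete. A hedged NumPy illustration with made-up shapes: a channel-wise concatenation must use axis 1 under 'th' ordering but the last axis under 'tf' ordering, so the axis cannot be hard-coded.

```python
import numpy as np

# 'th' ordering: channels on axis 1
a_th = np.zeros((1, 64, 56, 56))
b_th = np.zeros((1, 64, 56, 56))
merged_th = np.concatenate([a_th, b_th], axis=1)
assert merged_th.shape == (1, 128, 56, 56)

# 'tf' ordering: channels on the last axis
a_tf = np.zeros((1, 56, 56, 64))
b_tf = np.zeros((1, 56, 56, 64))
merged_tf = np.concatenate([a_tf, b_tf], axis=-1)
assert merged_tf.shape == (1, 56, 56, 128)
```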
@fchollet Thank you very much for pointing out these mistakes! You are so kind to do so. These (ugly) names come from Kaiming He's Caffe model, see http://ethereon.github.io/netscope/#/gist/db945b393d40bfa26006 BTW, I don't see anything about dim_ordering in 'merge'. Could you make that clearer? |
@fchollet And, users can set dim_ordering now. I offer both 'tf' dim_ordering weights for acceleration and 'th' dim_ordering weights for compatibility (if they want to use this script and their own 'th' dim_ordering code jointly). The links are given at the top of the code. I think "dim_ordering" is just how the input image is organized; it should have nothing to do with the shape of the weights of conv layers. Perhaps we should cut off the dependency between the input dim_ordering and the shape of conv layers. In that case a single version of the weights could be loaded into a model no matter what the image dim_ordering is. Hope this script gets merged soon~~~ it feels really good to be a Keras contributor! |
Not quite true. Kernels have to be transposed. Also the output of the The reason why your code appears to run properly is actually that you are setting the dim ordering via So it appears to me that your support of tf dim ordering isn't correct. For the sake of merging your PR quickly, let's give it up. Please only support th dim ordering (i.e. what you were doing initially). I'll add tf support later on myself, which will involve converting the weights and isn't quite easy. Otherwise, the code does look better now, congrats. |
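The transposition mentioned above is separate from the spatial flip discussed earlier in the thread. A hedged NumPy sketch, with illustrative helper names (not Keras API): a conv kernel in Theano layout (out_channels, in_channels, rows, cols) becomes TensorFlow layout (rows, cols, in_channels, out_channels) via a transpose.

```python
import numpy as np

# Hypothetical helper: reorder kernel axes from 'th' to 'tf' layout.
def th_to_tf_layout(w_th):
    # (out, in, rows, cols) -> (rows, cols, in, out)
    return np.transpose(w_th, (2, 3, 1, 0))

w_th = np.random.rand(64, 3, 7, 7)
w_tf = th_to_tf_layout(w_th)
assert w_tf.shape == (7, 7, 3, 64)
# Same data, different axis order: element (o, i, r, c) moves to (r, c, i, o).
assert w_tf[1, 2, 0, 5] == w_th[5, 0, 1, 2]
```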
Never mind my previous post, it seems I misread your code. Let me check it out again. |
Ok, LGTM. Thanks for the valuable contribution! |
This is a Keras implementation of Kaiming He's residual network (50 layers).
The layers have been properly named so that it is easy for anyone who wants to load the pretrained weights converted from Kaiming He's caffemodel file.