Improve and polish pycaffe #816
@longjon please take a look. In 3cac223 I added the preprocessing option dicts as members on the C++ side; let me know what you think.
@longjon a97a41b settles #525:
I want to break the interface by changing
Yes please. That interface in particular seems okay to break because it should go away when we finally have unified input preprocessing. (I'll have a closer look at the rest of this soon.)
@longjon please review and merge. Further input preprocessing optimization can be a follow-up. I'd like to get the preprocessing fix in and released to master soon, since it's not the nicest bug.
Note the build is fine -- Travis just times out downloading CUDA.
    else:
        # ndimage interpolates anything but more slowly.
        scale = tuple(np.array(new_dims)
                      / np.array(im.shape[:2], dtype=np.float32))
Not sure what dtype=np.float32 is doing here... won't it just be promoted to float64 anyway? E.g.,
$ ipython --no-banner
In [1]: import numpy as np
In [2]: (np.array(5) / np.array(4, dtype=np.float32)).dtype
Out[2]: dtype('float64')
Right. This doesn't belong and I'll drop it.
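For reference, a minimal sketch of the promotion being discussed, applied to the array expression from the diff rather than to 0-d scalars. The sizes `new_dims` and `im_shape` are made-up values for illustration, not taken from the PR.

```python
import numpy as np

# Hypothetical sizes, chosen only for illustration.
new_dims = (227, 227)      # target (height, width)
im_shape = (360, 480, 3)   # H x W x K input shape

# Integer array divided by a float32 array: NumPy promotes the result to
# float64, so the float32 cast does not survive the division.
with_cast = np.array(new_dims) / np.array(im_shape[:2], dtype=np.float32)

# The same expression with an explicit float64 divisor, for comparison.
with_float64 = np.array(new_dims) / np.array(im_shape[:2], dtype=np.float64)

print(with_cast.dtype, with_float64.dtype)
```

One caveat, stated as an aside rather than as the author's reasoning: under Python 2 without `from __future__ import division`, `/` on two integer arrays floor-divides, so some float cast may still be wanted there even though the float32 precision itself is lost.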
On Monday, August 4, 2014, longjon notifications@github.com wrote:
In python/caffe/io.py:
@@ -40,7 +43,18 @@ def resize_image(im, new_dims, interp_order=1):
     Give
     im: resized ndarray with shape (new_dims[0], new_dims[1], K)
     """
-    return skimage.transform.resize(im, new_dims, order=interp_order)
+    if im.shape[-1] == 1 or im.shape[-1] == 3:
+        # skimage is fast but only understands {1,3} channel images in [0, 1].
+        im_min, im_max = im.min(), im.max()
+        im_std = (im - im_min) / (im_max - im_min)
+        resized_std = resize(im_std, new_dims, order=interp_order)
+        resized_im = resized_std * (im_max - im_min) + im_min
+    else:
+        # ndimage interpolates anything but more slowly.
+        scale = tuple(np.array(new_dims)
+                      / np.array(im.shape[:2], dtype=np.float32))
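The range handling in the skimage branch of the quoted hunk can be sketched in isolation: normalize to [0, 1], resize, then map back to the original range. This is a rough illustration, not the PR's code. `nearest_resize` is a hypothetical pure-NumPy stand-in for `skimage.transform.resize`, and the constant-image guard is an addition of this sketch, because `im_max - im_min` is zero for a flat image.

```python
import numpy as np

def nearest_resize(im, new_dims):
    """Hypothetical stand-in resizer (nearest neighbour) for H x W x K arrays."""
    rows = np.arange(new_dims[0]) * im.shape[0] // new_dims[0]
    cols = np.arange(new_dims[1]) * im.shape[1] // new_dims[1]
    return im[rows][:, cols]

def resize_preserving_range(im, new_dims, resize_fn=nearest_resize):
    """Resize via a [0, 1]-only resizer while keeping the input's value range."""
    im_min, im_max = im.min(), im.max()
    if im_max == im_min:
        # Guard added in this sketch: a constant image would divide by zero.
        shape = tuple(new_dims) + im.shape[2:]
        return np.full(shape, im_min, dtype=np.float32)
    im_std = (im - im_min) / (im_max - im_min)              # map to [0, 1]
    resized_std = resize_fn(im_std, new_dims)
    resized_im = resized_std * (im_max - im_min) + im_min   # map back
    return resized_im.astype(np.float32)                    # preserve type

# A 2 x 2 x 3 image with values from -50 to 60, upsampled to 4 x 4.
im = np.arange(12, dtype=np.float32).reshape(2, 2, 3) * 10 - 50
out = resize_preserving_range(im, (4, 4))
print(out.shape, out.dtype, out.min(), out.max())
```

Passing the resizer as `resize_fn` keeps the sketch free of the skimage dependency; a real resizer with the same `(im, new_dims)` call shape could be swapped in.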
Looks good except as noted. There were a couple kinda awkward things I noticed that existed before this PR:
In addition, I'm not totally sure why we have
These things said, I'm happy to merge any PR that is a strict improvement, and I'd rather merge something incremental sooner than make everything just right eventually.
define `Net.{mean, input_scale, channel_swap}` on the boost::python side so that the members always exist. drop ugly initialization logic.
With the right input processing, the actual image classification output is sensible.
- filter visualization example's top prediction is "tabby cat"
- net surgery fully-convolutional output map is better
Fix incorrect class names too.
Rebased to address comments and hold off on integrating @petewarden's changes -- they will be included in a follow-up once this fix is in. @longjon please take a last look and merge.
- reorder channels (for instance color to BGR)
- subtract mean
- transpose dimensions to K x H x W
Should raw scale be noted here?
- load an image as [0,1] single / np.float32 according to Python convention
- fix input scaling during preprocessing:
  - scale input for preprocessing by `raw_scale` e.g. to map an image to [0, 255] for the CaffeNet and AlexNet ImageNet models
  - scale feature space by `input_scale` after mean subtraction
  - switch examples to raw scale for ImageNet models
  - fix BVLC#525
- preserve type after resizing.
- resize 1, 3, or K channel images with special casing between skimage.transform (1 and 3) and scipy.ndimage (K) for speed
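The input scaling described in this commit message can be sketched roughly as follows. This is an illustration under stated assumptions, not the pycaffe implementation. The function name `preprocess_sketch`, its keyword arguments, and the mean values are hypothetical, and the order (raw scale, channel swap, mean subtraction, input scale, transpose) is one reading of the commit message and the docstring list; the real code may transpose at a different point.

```python
import numpy as np

def preprocess_sketch(im, raw_scale=None, channel_swap=None,
                      mean=None, input_scale=None):
    """Hypothetical helper: H x W x K image in [0, 1] -> K x H x W float32."""
    out = np.asarray(im, dtype=np.float32).copy()
    if raw_scale is not None:
        out *= raw_scale                      # e.g. map [0, 1] to [0, 255]
    if channel_swap is not None:
        out = out[:, :, list(channel_swap)]   # e.g. RGB -> BGR
    if mean is not None:
        out -= mean                           # mean broadcasts over H x W x K
    if input_scale is not None:
        out *= input_scale                    # scale feature space after mean
    return out.transpose((2, 0, 1))           # to K x H x W

# A pure-red 2 x 2 RGB image in [0, 1]; the BGR mean values are made up.
im = np.zeros((2, 2, 3), dtype=np.float32)
im[..., 0] = 1.0
bgr_mean = np.array([104.0, 117.0, 123.0], dtype=np.float32)
blob = preprocess_sketch(im, raw_scale=255, channel_swap=(2, 1, 0),
                         mean=bgr_mean, input_scale=1.0)
print(blob.shape, blob[:, 0, 0])
```

Keeping `raw_scale` and `input_scale` as separate steps mirrors the distinction in the commit message: the first puts the image in the range the mean was computed in, the second rescales the mean-subtracted features.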
…-examples Improve and polish pycaffe
…ttrs-examples Improve and polish pycaffe
add console output with human-readable labels and a grayscale flag to classify.py courtesy of @petewarden's "Add MNIST support to classify.py script" #735
to be continued...
Net.raw_scale, doing mean subtraction, and then input scaling