Add support for tf.nn.depth_to_space lambda #492

CNugteren · 2020-05-17T16:00:52Z

Added support for tf.nn.depth_to_space as a Keras lambda, mapped to the ONNX DepthToSpace operator.

The two added test-cases visualised with Netron:

Channels first (NCHW): the channel dimension is reduced from 4 to 4/(2*2) = 1, and the width and height dimensions are increased by a factor of 2:

Channels last (NHWC): the channel dimension is reduced from 8 to 8/(2*2) = 2, and the width and height dimensions are increased by a factor of 2:
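
The shape arithmetic in the two captions above can be sketched as a small helper. This is only an illustration: `depth_to_space_shape` is a hypothetical name, and the 3x3 spatial size is an assumption, since the actual test shapes are not stated here.

```python
def depth_to_space_shape(shape, block_size, data_format="NHWC"):
    """Return the output shape of depth_to_space for a 4-D input shape."""
    if data_format == "NCHW":
        n, c, h, w = shape
    elif data_format == "NHWC":
        n, h, w, c = shape
    else:
        raise ValueError(f"unsupported data_format: {data_format}")
    if c % (block_size ** 2) != 0:
        raise ValueError("channel count must be divisible by block_size ** 2")

    # Channels shrink by block_size ** 2, height and width grow by block_size.
    out_c = c // (block_size ** 2)
    out_h, out_w = h * block_size, w * block_size
    if data_format == "NCHW":
        return (n, out_c, out_h, out_w)
    return (n, out_h, out_w, out_c)


# Assumed 3x3 spatial size, with the channel counts from the captions.
print(depth_to_space_shape((1, 4, 3, 3), 2, "NCHW"))  # (1, 1, 6, 6)
print(depth_to_space_shape((1, 3, 3, 8), 2, "NHWC"))  # (1, 6, 6, 2)
```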

Although the visualisations look correct as far as I can see, the NHWC test-case fails with the following error in ONNX Runtime (ORT):

onnxruntime.capi.onnxruntime_pybind11_state.Fail:
[ONNXRuntimeError] : 1 : FAIL : Node:lambda/DepthToSpace Output:lambda
[ShapeInferenceError] Can't merge shape info.
Both source and target dimension have values but they differ.
Source=1 Target=8 Dimension=1

This is my first contribution to keras-onnx (and first ONNX contribution in general), so please review carefully. And I'm also asking for help w.r.t. the second test-case. Or is that a bug in ORT?

CLAassistant · 2020-05-17T16:00:58Z

All committers have signed the CLA.

jiafatom · 2020-05-17T18:39:18Z

Thanks for your contribution. There is a test failure for your case, a dimension mismatch. This is because the specs of TensorFlow DepthToSpace and ONNX DepthToSpace may differ. It is not a simple plug-in: you need to understand both the TF and the ONNX behavior and use ONNX ops to construct the TF op correctly, so it may need some additional manipulation such as a Transpose.

CNugteren · 2020-05-17T18:41:07Z

Thanks @jiafatom. Yes, of course I noticed the newly introduced failed test, see also my message :-) I'll dig deeper and see if I can figure out what needs to be done, thanks.

(sorry for closing/re-opening, pressed the wrong button)

jiafatom · 2020-05-17T18:41:37Z

> Thanks @jiafatom. Yes, of course I noticed the newly introduced failed test, see also my message :-)

There is an example of using ONNX DepthToSpace here; you can see that we add a few ops to finish the conversion.

CNugteren · 2020-05-18T18:39:02Z

TensorFlow documentation for the NHWC case (the failing one):

It is useful to consider the operation as transforming a 6-D Tensor. e.g. for data_format = NHWC, Each element in the input tensor can be specified via 6 coordinates, ordered by decreasing memory layout significance as: n,iY,iX,bY,bX,oC (where n=batch index, iX, iY means X or Y coordinates within the input image, bX, bY means coordinates within the output block, oC means output channels). The output would be the input transposed to the following layout: n,iY,bY,iX,bX,oC
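
The 6-D description quoted above can be written out in NumPy as two reshapes and one transpose. This is a minimal sketch, not the converter code from this PR; `depth_to_space_nhwc` is a hypothetical helper name.

```python
import numpy as np


def depth_to_space_nhwc(x, block_size):
    """NumPy sketch of the NHWC depth_to_space layout described above."""
    n, h, w, c = x.shape
    out_c = c // (block_size ** 2)
    # 6-D view of the input: n, iY, iX, bY, bX, oC
    tmp = np.reshape(x, [n, h, w, block_size, block_size, out_c])
    # Transpose to the output layout: n, iY, bY, iX, bX, oC
    tmp = np.transpose(tmp, [0, 1, 3, 2, 4, 5])
    # Merge (iY, bY) and (iX, bX) into the enlarged height and width.
    return np.reshape(tmp, [n, h * block_size, w * block_size, out_c])


# One pixel with 4 channels becomes a 2x2 block with 1 channel.
x = np.arange(4).reshape(1, 1, 1, 4)
y = depth_to_space_nhwc(x, 2)
print(y[0, :, :, 0])
```

For this input the four channel values 0..3 end up row by row in the 2x2 output block.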

If we number n,iY,iX,bY,bX,oC as n=0, iY=1, iX=2, bY=3, bX=4, oC=5, then n,iY,bY,iX,bX,oC becomes the permutation 0,1,3,2,4,5. That indeed doesn't seem to match the ONNX documentation for the CRD case:

b, c, h, w = x.shape
tmp = np.reshape(x, [b, c // (blocksize ** 2), blocksize, blocksize, h, w])
tmp = np.transpose(tmp, [0, 1, 4, 2, 5, 3])
y = np.reshape(tmp, [b, c // (blocksize ** 2), h * blocksize, w * blocksize])

It does after inserting a transpose [0, 3, 1, 2] at the start and [0, 2, 3, 1] at the end. But if we implemented that, why use the ONNX DepthToSpace operator at all, since essentially it is just two reshapes and a single transpose?
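
The transpose-wrapping idea can be checked numerically. The sketch below assumes the DCR and CRD reference formulas from the ONNX DepthToSpace documentation and compares both, wrapped in the [0, 3, 1, 2] and [0, 2, 3, 1] transposes, against the NHWC layout from the TensorFlow documentation. All function names are hypothetical, and the input shape is chosen arbitrarily.

```python
import numpy as np


def onnx_depth_to_space(x, blocksize, mode="DCR"):
    """NCHW reference formulas, as given in the ONNX operator documentation."""
    b, c, h, w = x.shape
    oc = c // (blocksize ** 2)
    if mode == "DCR":
        tmp = np.reshape(x, [b, blocksize, blocksize, oc, h, w])
        tmp = np.transpose(tmp, [0, 3, 4, 1, 5, 2])
    else:  # CRD
        tmp = np.reshape(x, [b, oc, blocksize, blocksize, h, w])
        tmp = np.transpose(tmp, [0, 1, 4, 2, 5, 3])
    return np.reshape(tmp, [b, oc, h * blocksize, w * blocksize])


def tf_depth_to_space_nhwc(x, blocksize):
    """NHWC layout from the TensorFlow documentation."""
    n, h, w, c = x.shape
    oc = c // (blocksize ** 2)
    tmp = np.reshape(x, [n, h, w, blocksize, blocksize, oc])
    tmp = np.transpose(tmp, [0, 1, 3, 2, 4, 5])
    return np.reshape(tmp, [n, h * blocksize, w * blocksize, oc])


def nhwc_via_onnx(x, blocksize, mode):
    """NHWC -> NCHW, ONNX-style DepthToSpace, then NCHW -> NHWC."""
    nchw = np.transpose(x, [0, 3, 1, 2])
    out = onnx_depth_to_space(nchw, blocksize, mode)
    return np.transpose(out, [0, 2, 3, 1])


x = np.arange(1 * 2 * 3 * 8).reshape(1, 2, 3, 8)
expected = tf_depth_to_space_nhwc(x, 2)
matches_dcr = np.array_equal(nhwc_via_onnx(x, 2, "DCR"), expected)
matches_crd = np.array_equal(nhwc_via_onnx(x, 2, "CRD"), expected)
print(matches_dcr, matches_crd)
```

For this input the wrapped DCR variant reproduces the NHWC result while the wrapped CRD variant does not, which suggests that the transposes need to be combined with the DCR mode.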

So I've implemented TensorFlow's documentation directly, using 2 reshapes and a transpose for the NHWC case. For the NCHW case the test seems to pass already, so the ONNX DCR case seems to match TensorFlow's NCHW implementation.

CNugteren · 2020-05-18T20:26:49Z

I've 'fixed' some tests by skipping old TensorFlow versions and old opsets, but I still see this kind of failure in the CI builds:

E   tensorflow.python.framework.errors_impl.InvalidArgumentError:  Only NHWC data_format supported on CPU. Got NCHW
E   	 [[node lambda_1/DepthToSpace (defined at c:\miniconda\envs\py3.7\lib\site-packages\tensorflow_core\python\framework\ops.py:1751) ]] [Op:__inference_keras_scratch_graph_1818]
E

I don't have this issue on my test machine (also CPU only). Did you have similar issues in other tests and if so how do you normally circumvent them?

jiafatom · 2020-05-18T22:21:33Z

> I've 'fixed' some tests by skipping old TensorFlow versions and old opsets, but I still see this kind of failure in the CI builds:
>
> E   tensorflow.python.framework.errors_impl.InvalidArgumentError:  Only NHWC data_format supported on CPU. Got NCHW
> E   	 [[node lambda_1/DepthToSpace (defined at c:\miniconda\envs\py3.7\lib\site-packages\tensorflow_core\python\framework\ops.py:1751) ]] [Op:__inference_keras_scratch_graph_1818]
> E
>
> I don't have this issue on my test machine (also CPU only). Did you have similar issues in other tests and if so how do you normally circumvent them?

The failed CI build is on linux-conda_ci and win32-conda-ci, which uses standalone keras and tf backend (not tf.keras). So if you set TF_KERAS=0 at your local machine, can you reproduce this issue?

CNugteren · 2020-05-19T10:15:55Z

> The failed CI build is on linux-conda_ci and win32-conda-ci, which uses standalone keras and tf backend (not tf.keras). So if you set TF_KERAS=0 at your local machine, can you reproduce this issue?

No, it turned out that older versions of TF (pre 2.1.0) don't support NCHW mode for this op on the CPU. So I've added another skip in the tests.
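
A version-based skip of this kind might look roughly like the sketch below. It is an illustration only, not the test code from this PR: `version_tuple`, `TF_VERSION` and `DepthToSpaceTest` are hypothetical names, and the version string is hard-coded where a real test would read `tf.__version__`.

```python
import unittest


def version_tuple(version):
    """Turn '2.1.0' or '2.1.0-rc1' into a comparable tuple such as (2, 1, 0)."""
    parts = []
    for piece in version.split(".")[:3]:
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        parts.append(int(digits or 0))
    while len(parts) < 3:  # pad short versions such as '2.1'
        parts.append(0)
    return tuple(parts)


# Hypothetical stand-in; a real test would use tf.__version__ here.
TF_VERSION = "2.0.0"


class DepthToSpaceTest(unittest.TestCase):
    @unittest.skipIf(version_tuple(TF_VERSION) < (2, 1, 0),
                     "NCHW depth_to_space is not supported on CPU before TF 2.1.0")
    def test_nchw(self):
        # Placeholder body: the NCHW model would be built and converted here.
        pass
```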

CNugteren · 2020-05-19T11:02:41Z

keras2onnx/_builtin.py

+    block_size = node.get_attr('block_size')
+    oopb = OnnxOperatorBuilder(container, scope)
+    if _is_nhwc(node):
+        _, h, w, c = _cal_tensor_shape(node.inputs[0])


I still need some help here. I have now set n = -1 so that an unknown batch dimension is fine. But what if h and w are also unknown, i.e. their values are None? How would the reshapes below be done then?

jiafatom · 2020-05-19

Good point. This line has an issue when the tensor shape is unknown (dynamic). Search our code base for the cases where the result of _cal_tensor_shape is None. We need to handle that: add a Shape op after node.inputs[0] to get the dynamic shape, use a Slice op to get h and w, and concatenate them with the other dimensions to make the desired_shape. We have some examples in our code.

CNugteren · 2020-05-19

I had a look at an example and it is not trivial; the code quickly becomes unreadable because of all the extra nodes added. So I reconsidered using the actual DepthToSpace node, worked on that a bit, and made it work with two extra transposes. I've added a test case with unknown tensor sizes as well and that one now also passes :-) Could you have another look at the code? Thanks!
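
The Shape, Slice and Concat suggestion from this thread can be emulated in NumPy to see what the resulting desired_shape would contain. This is a rough sketch under the assumption that the channel count is known statically; it does not build any ONNX nodes, and `dynamic_reshape_target` is a hypothetical name.

```python
import numpy as np


def dynamic_reshape_target(x, block_size):
    """Build the 6-D reshape target for an NHWC input from its runtime shape."""
    shape = np.array(x.shape, dtype=np.int64)   # plays the role of the Shape op
    hw = shape[1:3]                             # Slice: runtime h and w
    c = int(shape[3])                           # assumed to be statically known
    batch = np.array([-1], dtype=np.int64)      # batch left for Reshape to infer
    block = np.array([block_size, block_size, c // block_size ** 2],
                     dtype=np.int64)
    return np.concatenate([batch, hw, block])   # Concat into desired_shape


x = np.zeros((2, 3, 5, 8), dtype=np.float32)
desired_shape = dynamic_reshape_target(x, 2)
tmp = np.reshape(x, tuple(int(d) for d in desired_shape))
print(desired_shape, tmp.shape)
```

The second reshape would need a similar construction for h * block_size and w * block_size, which illustrates why the graph grows quickly with this approach.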

jiafatom

lgtm, thanks for your contribution!

CNugteren added 2 commits May 17, 2020 17:49

Added test for the tf.nn.depth_to_space lambda

3289d66

Added support for the tf.nn.depth_to_space operator as ONNX DepthToSpace

5414adf

CNugteren closed this May 17, 2020

CNugteren reopened this May 17, 2020

Fixed implementation for NHWC case

c6bcb14

CNugteren added 2 commits May 18, 2020 20:55

Added a skip for older TensorFlow versions

04cb859

Added a skip for older Opset versions

ad0eed4

jiafatom reviewed May 18, 2020

CNugteren added 2 commits May 19, 2020 12:14

Addresses review comments: set op_version explicitly

dc5b397

Added skip fro NCHW for older TensorFlows than 2.1.0

981561a

CNugteren commented May 19, 2020

CNugteren added 2 commits May 19, 2020 20:47

Added a test case for unknown input shapes

877cd75

Reimplemented NHWC mode using DepthToSpace operator to support dynamic shapes

2631b37

jiafatom approved these changes May 19, 2020

jiafatom merged commit 717ea09 into onnx:master May 19, 2020
