Returning non-tensors from keras.layer.Layer #516

Closed
DavidLandup0 opened this issue Jul 17, 2023 · 8 comments

Comments

@DavidLandup0

DavidLandup0 commented Jul 17, 2023

In Keras 2.0, a keras.layers.Layer was allowed to return both tensors and non-tensors from its call().

Something like:

    def call(self, x):
        x = self.proj(x)
        shape = x.shape
        x = keras.ops.reshape(x, (-1, shape[1] * shape[2], shape[3]))
        x = self.norm(x)
        return x, shape[1], shape[2] # return tensor, original H and W

In the new API, this throws an error:

All operation outputs must be tensors. Operation <LayerName name=layer_name, built=True> returned a non-tensor. Non-tensor received: 56

Attempted a naive fix:

return x, keras.ops.convert_to_tensor(shape[1]), keras.ops.convert_to_tensor(shape[2])

But this seems to "unbuild" the layer:

Layer 'layer_name' looks like it has unbuilt state, but Keras is not able to trace the layer `call()` in order to build it automatically. You must implement the `def build(self, input_shape)` method on your layer. It should create all variables used by the layer (e.g. by calling `layer.build()` on all its children layers).

Using keras.ops.shape falls back to the backend implementation of shape(). With the TensorFlow backend, it's equivalent to tf.shape() and results in the same error.

Is there a supported way to return non-tensors alongside tensors from call() within a Layer? If not, how can shapes be returned as tensors in a multi-backend context?

Thanks!

@ianstenbit
Contributor

Hey @DavidLandup0, can you post a full repro for this?

I'm trying to reproduce it like this:

import keras_core as keras
import numpy as np

class OverlappingPatchingAndEmbedding(keras.layers.Layer):
    def __init__(self, out_channels=32, patch_size=7, stride=4, **kwargs):
        super().__init__(**kwargs)
        self.proj = keras.layers.Conv2D(
            filters=out_channels,
            kernel_size=patch_size,
            strides=stride,
            padding="same",
        )
        self.norm = keras.layers.LayerNormalization()

    def call(self, x):
        x = self.proj(x)
        # B, H, W, C
        shape = x.shape
        x = keras.ops.reshape(x, (-1, shape[1] * shape[2], shape[3]))
        x = self.norm(x)
        return x, shape[1], shape[2]

inputs = np.random.uniform(size=(1, 16, 16, 8))
model = keras.Sequential([OverlappingPatchingAndEmbedding()])
outputs = model(inputs)

print(inputs)
print(outputs)

And I'm not seeing any errors.

@DavidLandup0
Author

That's odd. Here's a clean Google Colab with the exact same code: https://colab.research.google.com/drive/196JtnB1joCRNWyMbDIX4aXvf0_a7MhKI?usp=sharing

@ianstenbit
Contributor

> That's odd. Here's a clean Google Colab with the exact same code: https://colab.research.google.com/drive/196JtnB1joCRNWyMbDIX4aXvf0_a7MhKI?usp=sharing

Ahh, interesting -- it looks like it's working with the JAX backend but not the TF backend. I'm looking into it to try to understand the root cause.

@DavidLandup0
Author

Thanks! It looks like there's a check that explicitly verifies that all returns are tensors and raises the error, around `for x in self.outputs:` in the Keras source.

Without digging too much into it, it looks like a design choice, but I don't know whether it was deliberate 😅

@ianstenbit
Contributor

> Without digging too much into it, it looks like a design choice

Indeed -- I think this was intentional. Maybe @fchollet can shed some light on why this is the case?

In the meantime, we can pass the resulting shapes as symbolic tensors as long as we specify the output spec of our layer. With this definition of the layer, this is now working for me across all backends:

import keras_core as keras
import numpy as np
from keras_core import KerasTensor

class OverlappingPatchingAndEmbedding(keras.layers.Layer):
    def __init__(self, out_channels=32, patch_size=7, stride=4, **kwargs):
        super().__init__(**kwargs)
        self.proj = keras.layers.Conv2D(
            filters=out_channels,
            kernel_size=patch_size,
            strides=stride,
            padding="same",
        )
        self.norm = keras.layers.LayerNormalization()
        self.stride = stride

    def call(self, x):
        x = self.proj(x)
        # B, H, W, C
        shape = x.shape
        x = keras.ops.reshape(x, (-1, shape[1] * shape[2], shape[3]))
        x = self.norm(x)
        return x, keras.ops.convert_to_tensor(shape[1]), keras.ops.convert_to_tensor(shape[2])

    def compute_output_spec(self, inputs):
        input_shape = inputs.shape
        # Spatial dims shrink by the stride; channels come from the projection.
        output_shape = (
            input_shape[0],
            (input_shape[1] // self.stride) * (input_shape[2] // self.stride),
            self.proj.filters,
        )
        output_spec = KerasTensor(output_shape, dtype=self.compute_dtype)
        return output_spec, KerasTensor([0]), KerasTensor([0])

@fchollet
Member

In all likelihood you don't actually need to do this and you can find an alternative solution. But if you really want to return shapes, then they must be tensors. The correct way to do this would be to use convert_to_tensor(ops.shape(x)). When doing this also make sure to implement compute_output_spec (like in Ian's example above) -- this is where the error you were seeing comes from.
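
For illustration, here is a minimal sketch of that suggestion applied to the layer above. The class name is illustrative, and the output spec assumes the input spatial dims divide evenly by the stride:

import keras_core as keras
from keras_core import KerasTensor

class PatchEmbeddingWithShape(keras.layers.Layer):
    def __init__(self, out_channels=32, patch_size=7, stride=4, **kwargs):
        super().__init__(**kwargs)
        self.proj = keras.layers.Conv2D(
            filters=out_channels,
            kernel_size=patch_size,
            strides=stride,
            padding="same",
        )
        self.norm = keras.layers.LayerNormalization()
        self.stride = stride

    def call(self, x):
        x = self.proj(x)
        # Static shape for the reshape, dynamic shape (as a tensor) for the returns.
        static_shape = x.shape
        dynamic_shape = keras.ops.convert_to_tensor(keras.ops.shape(x))
        x = keras.ops.reshape(
            x, (-1, static_shape[1] * static_shape[2], static_shape[3])
        )
        x = self.norm(x)
        return x, dynamic_shape[1], dynamic_shape[2]

    def compute_output_spec(self, inputs):
        # Declare the output structure so Keras doesn't have to trace call() symbolically.
        h = inputs.shape[1] // self.stride
        w = inputs.shape[2] // self.stride
        out = KerasTensor(
            (inputs.shape[0], h * w, self.proj.filters), dtype=self.compute_dtype
        )
        return out, KerasTensor(()), KerasTensor(())

The static shape only feeds the reshape; ops.shape + convert_to_tensor only feed the returned values, so everything the layer returns is a tensor.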

@ianstenbit
Contributor

I think we can close this as WAI (working as intended). My comment above shows how we can return constants from layers for use cases like this, but as a general rule this is an anti-pattern that should be avoided where possible.

@DavidLandup0
Author

Thank you both for the replies! @fchollet @ianstenbit
I'll probably go with pre-computing the H, W values instead of returning them from a layer. We probably won't need this feature - I just wasn't sure initially whether it was intended, since it was allowed in Keras 2.0.
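
A minimal sketch of that direction, assuming the stride and the input spatial dims are known up front (the class below is a variant of the earlier repro layer that returns only the tensor):

import keras_core as keras
import numpy as np

class OverlappingPatchingAndEmbedding(keras.layers.Layer):
    # Variant that returns only the tensor; H and W are derived outside the layer.
    def __init__(self, out_channels=32, patch_size=7, stride=4, **kwargs):
        super().__init__(**kwargs)
        self.proj = keras.layers.Conv2D(
            filters=out_channels,
            kernel_size=patch_size,
            strides=stride,
            padding="same",
        )
        self.norm = keras.layers.LayerNormalization()

    def call(self, x):
        x = self.proj(x)
        shape = x.shape
        x = keras.ops.reshape(x, (-1, shape[1] * shape[2], shape[3]))
        return self.norm(x)

stride = 4
inputs = np.random.uniform(size=(1, 16, 16, 8))

# Pre-compute the post-projection spatial dims from the input shape and stride
# (padding="same", so each spatial dim shrinks by the stride).
h = inputs.shape[1] // stride  # 4
w = inputs.shape[2] // stride  # 4

x = OverlappingPatchingAndEmbedding(stride=stride)(inputs)
# h and w stay available downstream without the layer having to return them.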
