Building the model with (None, None, 3) #11

Closed
sayakpaul opened this issue Oct 11, 2022 · 9 comments

@sayakpaul
Owner

The original MAXIM model was trained on 256x256x3 images, but that doesn't restrict it to 256x256x3 inputs. It can accept images of any resolution as long as both spatial dimensions (height and width) are divisible by 64.
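The divisibility constraint can be sketched in a few lines. This is only an illustration: the helper names `is_valid_resolution` and `padded_size` are hypothetical and not part of maxim-tf, and it assumes that padding up to the next multiple of 64 is an acceptable way to make an arbitrary image fit.

```python
def is_valid_resolution(height, width, multiple=64):
    """Return True if both spatial dimensions are divisible by `multiple`."""
    return height % multiple == 0 and width % multiple == 0


def padded_size(height, width, multiple=64):
    """Return the smallest (height, width) >= the input that is divisible by `multiple`."""
    def round_up(n):
        # Ceiling division, then scale back up to a multiple.
        return -(-n // multiple) * multiple

    return round_up(height), round_up(width)


print(is_valid_resolution(256, 256))  # True
print(is_valid_resolution(300, 500))  # False
print(padded_size(300, 500))          # (320, 512)
```

Height and width are handled independently, so non-square inputs such as 320x512 satisfy the constraint too.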

This is how the authors do it:

In our case, the model is built with layers.Input((256, 256, 3)):

inputs = keras.Input((input_resolution, input_resolution, 3))

If we use (None, None, 3), it throws:

Traceback (most recent call last):
  File "convert_to_tf.py", line 234, in <module>
    main(args)
  File "convert_to_tf.py", line 192, in main
    _, tf_model = port_jax_params(configs, args.ckpt_path)
  File "convert_to_tf.py", line 140, in port_jax_params
    tf_model = Model(**configs)
  File "/Users/sayakpaul/Downloads/maxim-tf/create_maxim_model.py", line 31, in Model
    outputs = maxim_model(inputs)
  File "/Users/sayakpaul/Downloads/maxim-tf/maxim/maxim.py", line 99, in apply
    height=h // (2 ** i),
TypeError: unsupported operand type(s) for //: 'NoneType' and 'int'

From the logs, it might seem obvious that we cannot build the Keras model with (None, None, 3) since there are calculations inside the model that require us to specify the spatial dimensions.
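To make the failure concrete: Keras reports an unspecified dimension as `None` in the static shape, so any Python-level arithmetic on it fails before the graph is even built. The snippet below is a minimal reproduction in plain Python, not the actual maxim.py code; `level_heights` and its `depth=3` default are assumptions chosen only to mirror the `h // (2 ** i)` expression in the traceback.

```python
def level_heights(h, depth=3):
    """Compute the height at each downsampling level, as in `h // (2 ** i)`."""
    return [h // (2 ** i) for i in range(depth)]


# With a concrete height, the per-level sizes are ordinary integers.
print(level_heights(256))  # [256, 128, 64]

# With an unknown height (None), the same expression raises the TypeError
# seen in the traceback.
try:
    level_heights(None)
except TypeError as err:
    print(err)  # unsupported operand type(s) for //: 'NoneType' and 'int'
```

Any fix therefore has to either supply concrete dimensions at build time or move these size computations off the static shape.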

Do you know of any way to mitigate this problem or any other approach?

@gustheman

@sayakpaul
Owner Author

sayakpaul commented Oct 11, 2022

One solution might be to re-initialize the model each time it receives a new input, using that input's spatial resolution, then load the weights and run inference. But that's extremely inefficient.

I have added extensive comments in run_eval.py script to show how to do this.
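One possible way to soften the cost of rebuilding, sketched here as an assumption rather than as what run_eval.py does, is to cache one model per resolution so that repeated sizes are only built once. `build_and_load` is a hypothetical stand-in for "build the Keras model for this resolution and load the weights"; it returns a plain dict here so the sketch runs without TensorFlow.

```python
from functools import lru_cache


def build_and_load(height, width):
    """Hypothetical stand-in: build the model for (height, width, 3) and load weights."""
    build_and_load.calls += 1
    return {"input_shape": (height, width, 3)}


build_and_load.calls = 0  # Counts how many times a model was actually built.


@lru_cache(maxsize=8)
def get_model(height, width):
    """Return a cached model for this resolution, building it on first use."""
    return build_and_load(height, width)


for shape in [(256, 256), (512, 512), (256, 256)]:
    model = get_model(*shape)

print(build_and_load.calls)  # 2
```

This only helps when the same resolutions recur; a stream of images that all differ in size would still pay the full rebuild cost every time.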

@gustheman
Collaborator

I've just tried create_maxim_model in a fresh environment and didn't get this error.
Can you give me some eval examples so I can test further?

@sayakpaul
Owner Author

sayakpaul commented Oct 16, 2022

Did you try changing the resolution accepted by keras.Input to (None, None, 3)?

This line of code:

inputs = keras.Input((input_resolution, input_resolution, 3))

@gustheman
Collaborator

Yes, it works:
```
m3 = Model(variant='M-2')
```

But when I pass input_resolution=512, I get:

Traceback (most recent call last):
  File "", line 1, in
  File "/home/jupyter/maxim-tf/create_maxim_model.py", line 33, in Model
    inputs = keras.Input((*input_resolution, 3))
TypeError: 'int' object is not iterable

maybe I'm doing something wrong?
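The error comes from the unpacking in `(*input_resolution, 3)`, which requires an iterable such as `(512, 512)` rather than a bare `512`. A small normalizing helper could accept both forms. This is a sketch only: `to_input_shape` is a hypothetical name, not something that exists in create_maxim_model.py.

```python
def to_input_shape(input_resolution, channels=3):
    """Accept an int (square image) or a (height, width) pair; return a shape tuple."""
    if isinstance(input_resolution, int):
        # Treat a single int as a square resolution.
        input_resolution = (input_resolution, input_resolution)
    height, width = input_resolution
    return (height, width, channels)


print(to_input_shape(512))           # (512, 512, 3)
print(to_input_shape((256, 320)))    # (256, 320, 3)
print(to_input_shape((None, None)))  # (None, None, 3)
```

The resulting tuple is what would be passed as the shape argument to `keras.Input`.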

@gustheman
Collaborator

I'll try more tomorrow and ping you when I start.

@sayakpaul
Owner Author

Sure. Let me know what you encounter. Maybe attach a Jupyter Notebook?

@sayakpaul
Owner Author

Hacked around this by introducing a dynamic_resize flag to run_eval.py.

@danwexler

Is this solution ideal? What would it require to natively support any sized image, perhaps with an independent X & Y resolution that is a multiple of 64? Do we need to retrain and re-export the model with (None, None, 3)?

I'm keen to help make this work in TFJS, as long as it works on arbitrary sized images without a big performance or quality hit. I've got a 4090 that I can dedicate to re-training, if needed, and I'm reasonably competent with TF/TFJS for inference.

> From the logs, it might seem obvious that we cannot build the Keras model with (None, None, 3) since there are calculations inside the model that require us to specify the spatial dimensions.

I've managed to adjust this sort of internal issue in the model before. I'll start poking around in the model code to see the resolution-dependent bits.

@sayakpaul
Owner Author

Changes are being made here: #24
