Error Loading Embeddings in Different Environment from Training #103

tbouchik · 2023-08-21T18:55:06Z

Describe the bug
When I train my autoencoder (AE) on a training server and then load it on a production server, calling the `embed` function raises an error. The same call works without issues on the training server.

To Reproduce
Steps to reproduce the behavior:

1. Train the AE on the training server.
2. Load the trained model on the production server.
3. Execute `autoencoder_model.embed(someTensorX)`.
4. See the error:
`Traceback (most recent call last):
  File "", line 1, in
  File "/usr/local/lib/python3.10/dist-packages/pythae/models/base/base_model.py", line 129, in embed
    return self(DatasetOutput(data=inputs)).z
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/pythae/models/ae/ae_model.py", line 76, in forward
    z = self.encoder(x).embedding
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "LOCAL_PATH_OF_TRAIN_SERVER_PYTHON_FILE_LOADING_PYTHAE_MODEL", line 40, in forward
TypeError: 'c' not supported between instances of 'NoneType' and 'NoneType'`

Expected behavior
I expect the model to embed the tensor without any errors, irrespective of the server it's being executed on.

Desktop Prod Server:
OS version:
No LSB modules are available.
Distributor ID: Ubuntu
Description: Ubuntu 22.04.2 LTS
Release: 22.04
Codename: jammy
Kernel version:
5.15.0-76-generic

Desktop Train Server:
OS version:
No LSB modules are available.
Distributor ID: Ubuntu
Description: Ubuntu 18.04.2 LTS (beaver-osp1-gendry X45)
Release: 18.04
Codename: bionic
Kernel version:
5.4.0-150-generic

Additional context
The error seems to be related to the path of the Python file on the training server, as indicated in the traceback. It appears that the training environment's path is somehow hardcoded into the model when it's saved, which might be causing the issue when trying to load the model in a different environment.
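
One possible reading of the traceback, sketched here under assumptions: if the custom encoder was serialized together with its compiled bytecode (as by-value picklers do), the restored `forward` keeps the file name it was compiled from. In that case the training-server path would appear in tracebacks on any machine without itself being the cause of the failure. The sketch below uses only the standard library's `marshal` module as a stand-in. The path `/train_server/my_encoder.py` and the toy `forward` are hypothetical, and this is not pythae's actual saving code.

```python
import marshal

# Hypothetical stand-in for the custom encoder's forward method.
SOURCE = "def forward(x):\n    return x + 1\n"

# Compile as if the file lived at a training-server path (made up for illustration).
TRAIN_PATH = "/train_server/my_encoder.py"
module_code = compile(SOURCE, TRAIN_PATH, "exec")

# Serialize the bytecode and restore it, as if on another machine.
payload = marshal.dumps(module_code)
restored = marshal.loads(payload)

# Executing the restored module code defines forward in this namespace.
namespace = {}
exec(restored, namespace)
forward = namespace["forward"]

# The restored function still reports the file name it was compiled from.
print(forward.__code__.co_filename)
print(forward(1))
```

Serialized bytecode is also specific to the Python version that produced it, so a mismatch between the two servers' Python versions would be worth ruling out first.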

clementchadebec · 2023-08-25T09:45:18Z

Hi @tbouchik,

Thanks for reporting this issue. It is a weird bug. Can you share your Python environments on the training server and the production one (`pip freeze`)? In particular, do you have the same version of Python on both servers?
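
If it helps, a small script along these lines could be run on both servers to compare the relevant versions side by side. It is only a sketch: the helper name `collect_env` is made up here, and the package list (`pythae`, `torch`, `cloudpickle`) is an assumption about what matters for this bug, so adjust it as needed.

```python
import platform
import sys
from importlib import metadata


def collect_env(packages=("pythae", "torch", "cloudpickle")):
    """Return the Python version and selected package versions as strings."""
    info = {
        "python": platform.python_version(),
        "implementation": sys.implementation.name,
    }
    for name in packages:
        try:
            info[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # The package is not installed in this environment.
            info[name] = "not installed"
    return info


# Print one "name: version" line per entry for easy diffing between servers.
for key, value in collect_env().items():
    print(f"{key}: {value}")
```

Diffing the two outputs should show quickly whether the Python version or any of these packages differs between the servers.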
