RuntimeError: mat1 and mat2 shapes cannot be multiplied when calling get_potential_energy() #728

Open
GUANGZChen opened this issue Dec 2, 2024 · 14 comments
Labels
bug Something isn't working

Comments

@GUANGZChen

Description

I encountered a RuntimeError while attempting to calculate the potential energy of an atomic system using the get_potential_energy() function in ASE with a MACE-based calculator. The error appears to stem from a matrix multiplication shape mismatch in mace.modules.blocks.
This potential was specifically trained for a CPU system. Notably, I previously trained a similar MACE potential using the same configuration, and it worked without issues. The only difference in this case is the inclusion of a larger dataset during training.

Error Traceback

RuntimeError Traceback (most recent call last)
~\AppData\Local\Temp\ipykernel_5224\1801253464.py in <module>
----> 1 a.get_potential_energy()

D:\anaconda3\lib\site-packages\ase\atoms.py in get_potential_energy(self, force_consistent, apply_constraint)
753 self, force_consistent=force_consistent)
754 else:
--> 755 energy = self._calc.get_potential_energy(self)
756 if apply_constraint:
757 for constraint in self.constraints:

D:\anaconda3\lib\site-packages\ase\calculators\abc.py in get_potential_energy(self, atoms, force_consistent)
22 else:
23 name = 'energy'
---> 24 return self.get_property(name, atoms)
25
26 def get_potential_energies(self, atoms=None):

D:\anaconda3\lib\site-packages\ase\calculators\calculator.py in get_property(self, name, atoms, allow_calculation)
536 self.atoms = atoms.copy()
537
--> 538 self.calculate(atoms, [name], system_changes)
539
540 if name not in self.results:

D:\anaconda3\lib\site-packages\mace\calculators\mace.py in calculate(self, atoms, properties, system_changes)
232 if self.model_type in ["MACE", "EnergyDipoleMACE"]:
233 batch = self._clone_batch(batch_base)
--> 234 node_e0 = self.models[0].atomic_energies_fn(batch["node_attrs"])
235 compute_stress = not self.use_compile
236 else:

D:\anaconda3\lib\site-packages\torch\nn\modules\module.py in _wrapped_call_impl(self, *args, **kwargs)
1551 return self._compiled_call_impl(*args, **kwargs) # type: ignore[misc]
1552 else:
-> 1553 return self._call_impl(*args, **kwargs)
1554
1555 def _call_impl(self, *args, **kwargs):

D:\anaconda3\lib\site-packages\torch\nn\modules\module.py in _call_impl(self, *args, **kwargs)
1560 or _global_backward_pre_hooks or _global_backward_hooks
1561 or _global_forward_hooks or _global_forward_pre_hooks):
-> 1562 return forward_call(*args, **kwargs)
1563
1564 try:

D:\anaconda3\lib\site-packages\mace\modules\blocks.py in forward(self, x)
144 self, x: torch.Tensor # one-hot of elements [..., n_elements]
145 ) -> torch.Tensor: # [..., ]
--> 146 return torch.matmul(x, self.atomic_energies)
147
148 def __repr__(self):

RuntimeError: mat1 and mat2 shapes cannot be multiplied (215x2 and 1x2)

Steps to Reproduce

1. Use a MACE-based calculator in ASE.
2. Define an atomic structure (ase.Atoms).
3. Attempt to compute the potential energy using get_potential_energy().

Expected Behavior

The function should compute the potential energy of the system without error.

Actual Behavior
A RuntimeError occurs, indicating a shape mismatch during a matrix multiplication in mace.modules.blocks.

It appears that the issue arises in the forward method of mace.modules.blocks, where torch.matmul is called on tensors with incompatible shapes (215x2 and 1x2).
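
The shape rule behind the message can be sketched in plain Python. This is an illustration only, not MACE or PyTorch code: `matmul_shape` and `transposed_2d` are hypothetical helpers that mimic the shape rule for a 2-D matrix times a 1-D or 2-D operand. The reading that the energies are stored as a `[1, 2]` array while the installed `forward` multiplies without a transpose is an assumption based on the two tracebacks in this thread (blocks.py line 146 here, and line 197 with `torch.atleast_2d(...).T` further down); the thread does not confirm it.

```python
from typing import Tuple

Shape = Tuple[int, ...]


def matmul_shape(left: Shape, right: Shape) -> Shape:
    """Result shape of a 2-D matrix times a 1-D or 2-D operand.

    Raises ValueError when the inner dimensions disagree.
    """
    if len(left) != 2 or len(right) not in (1, 2):
        raise ValueError("this sketch only handles 2-D @ 1-D and 2-D @ 2-D")
    if left[1] != right[0]:
        raise ValueError(f"shapes cannot be multiplied ({left} and {right})")
    # 2-D @ 1-D drops the last axis; 2-D @ 2-D keeps the right operand's columns.
    return (left[0],) if len(right) == 1 else (left[0], right[1])


def transposed_2d(shape: Shape) -> Shape:
    """Shape after promoting to 2-D and transposing: (n,) -> (n, 1), (h, n) -> (n, h)."""
    rows, cols = (1, shape[0]) if len(shape) == 1 else shape
    return (cols, rows)


one_hot = (215, 2)  # 215 atoms, 2 elements, as in the traceback above

# Energies stored as a 1-D vector of length 2: the plain product works.
print(matmul_shape(one_hot, (2,)))

# Energies stored as [1, 2] and transposed first: the product also works.
print(matmul_shape(one_hot, transposed_2d((1, 2))))

# Energies stored as [1, 2] and multiplied without a transpose: this fails.
try:
    matmul_shape(one_hot, (1, 2))
except ValueError as err:
    print("fails:", err)
```

If that reading is right, the model file and the installed code would disagree about how the atomic energies are laid out, which is why checking the MACE version on both the training and evaluation side is a reasonable first step.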

@ilyes319
Contributor

ilyes319 commented Dec 3, 2024

Can you make sure you are using the same MACE version to train and evaluate the models?
Please share your MACE version and PyTorch version.

@GUANGZChen
Author

Can you make sure you are using the same MACE version to train and evaluate the models? Please share your MACE version and PyTorch version.

Hi, thanks for your reply. I can confirm I am using the same MACE version for both, which is 0.3.7, and the PyTorch version is 2.2.1.

@ilyes319
Contributor

ilyes319 commented Dec 3, 2024

Can you please try updating to the latest version of MACE?

@GUANGZChen
Author

Can you please try updating to the latest version of MACE?

Thanks, I will try the latest version.

@GUANGZChen
Author

Can you please try updating to the latest version of MACE?

Hi, I have trained a new potential with the latest version and it still gives the same error: 'RuntimeError: mat1 and mat2 shapes cannot be multiplied (215x2 and 1x2)'

@ilyes319
Contributor

ilyes319 commented Dec 4, 2024

Are you using the model file without "compiled" in the file name?

@GUANGZChen
Author

Are you using the model file without "compiled" in the file name?

I tried both. For the compiled one, the error is: 'RuntimeError: expand(torch.FloatTensor{[215, 215]}, size=[215]): the number of sizes provided (1) must be greater or equal to the number of dimensions in the tensor (2)'

@ilyes319
Contributor

ilyes319 commented Dec 4, 2024

Can you send me your model at ib467@cam.ac.uk?

@ilyes319
Contributor

ilyes319 commented Dec 4, 2024

I have mace 0.3.9 installed locally and the model works fine. Can you double check that you have the right mace version locally (please do pip uninstall mace-torch).

@crispppp

crispppp commented Dec 5, 2024

Hello, I also ran into this problem recently. I first evaluated (with mace 0.3.9) a model trained with a previous mace version, using cueq, and the matrix shape error occurred. I then trained a new model with 0.3.9 and evaluated it again (both with cueq), and the same error occurs:
File "/opt/conda/lib/python3.11/site-packages/ase/atoms.py", line 777, in get_potential_energies
return self._calc.get_potential_energies(self)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/opt/conda/lib/python3.11/site-packages/ase/calculators/abc.py", line 27, in get_potential_energies
return self.get_property('energies', atoms)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/opt/conda/lib/python3.11/site-packages/ase/calculators/calculator.py", line 538, in get_property
self.calculate(atoms, [name], system_changes)
File "/opt/conda/lib/python3.11/site-packages/mace/calculators/mace.py", line 299, in calculate
out = model(
^^^^^^
File "/opt/conda/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1736, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/opt/conda/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1747, in _call_impl
return forward_call(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/opt/conda/lib/python3.11/site-packages/mace/modules/models.py", line 382, in forward
node_e0 = self.atomic_energies_fn(data["node_attrs"])[
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/opt/conda/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1736, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/opt/conda/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1747, in _call_impl
return forward_call(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/opt/conda/lib/python3.11/site-packages/mace/modules/blocks.py", line 197, in forward
return torch.matmul(x, torch.atleast_2d(self.atomic_energies).T)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
RuntimeError: mat1 and mat2 shapes cannot be multiplied (144x85 and 84x2)

@crispppp

crispppp commented Dec 5, 2024

After several tests, the problem seems to appear only when running a committee of MACE models with version 0.3.9, where a list of more than 2 models is passed to MACECalculator, regardless of whether the models were trained with the old version or the latest one. (Exactly 2 models sometimes works.)
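
One way to narrow this down might be to compare the atomic-energy table of every model in the committee before handing the list to the calculator. The sketch below is a hedged illustration: the attribute path `atomic_energies_fn.atomic_energies` is taken from the tracebacks in this thread, but `atomic_energy_shapes`, `shapes_agree`, and `fake_model` are hypothetical helpers, and the stand-in shapes are made up for the demonstration. Whether a shape disagreement between models is actually what triggers the 144x85 versus 84x2 error is an assumption, not something established here.

```python
from types import SimpleNamespace
from typing import List, Sequence, Tuple


def atomic_energy_shapes(models: Sequence[object]) -> List[Tuple[int, ...]]:
    """Collect the shape of each model's atomic-energy table."""
    return [tuple(m.atomic_energies_fn.atomic_energies.shape) for m in models]


def shapes_agree(models: Sequence[object]) -> bool:
    """True when every model in the committee reports the same table shape."""
    return len(set(atomic_energy_shapes(models))) <= 1


def fake_model(*shape: int) -> SimpleNamespace:
    """Hypothetical stand-in exposing only the attribute path used above."""
    table = SimpleNamespace(shape=shape)
    return SimpleNamespace(atomic_energies_fn=SimpleNamespace(atomic_energies=table))


# Made-up committee: two models agree, the third covers one more element.
committee = [fake_model(2, 84), fake_model(2, 84), fake_model(2, 85)]

print(atomic_energy_shapes(committee))
print(shapes_agree(committee))
```

With real models, the same two functions could be called on the loaded model objects in place of the stand-ins; how the models are loaded is left out because it is not shown in this thread.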

@ilyes319
Contributor

ilyes319 commented Dec 5, 2024

Hmm, very weird, though I don't think your two errors are the same. Can you please send the scripts and models needed to reproduce your error to my email: ib467@cam.ac.uk

@GUANGZChen
Author

I have mace 0.3.9 installed locally and the model works fine. Can you double check that you have the right mace version locally (please do pip uninstall mace-torch).

Hi, thanks for testing. I just did a quick check, and it seems the latest version is 0.3.8; I am not able to install 0.3.9.

@ilyes319
Contributor

ilyes319 commented Dec 9, 2024

Can you try again? Just do pip install mace-torch and check that it upgraded to 0.3.9.
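
A quick way to confirm which versions the running interpreter actually sees is to query the installed package metadata. This is a minimal sketch using only the standard library; `installed_version` is a hypothetical helper name, and the distribution names `mace-torch` and `torch` are the ones used with pip in this thread. It should be run in the same environment (for example the same Jupyter kernel) that raises the error, since a different environment may have a different install.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(distribution: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return version(distribution)
    except PackageNotFoundError:
        return None


# Report the versions visible to this interpreter.
for name in ("mace-torch", "torch"):
    print(name, installed_version(name) or "not installed")
```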

@ilyes319 added the bug label on Dec 12, 2024