Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[BUG] Water molecule coordinates are setting to (0,0,0) #427

Closed
ale94mleon opened this issue Aug 1, 2023 · 4 comments · Fixed by OpenBioSim/biosimspace#149 or OpenBioSim/biosimspace#154
Labels

Comments

@ale94mleon
Copy link

Describe the bug
I am trying to simulate a protein-ligand complex. For that I have to take into account crystal waters. However, when I try to combine the bss.Sysytem protein with the bss.System of the water the resulting gro file change the coordinates of the water molecules (test1). In addition, I see that the residue name in the topology also changes, from COF to SOL. Probably this issue is also related to #32. The curious thing is that if I create a combined configuration/topology (protein+water) and use the same procedure the coordinates are not lost (test2).

To Reproduce

import copy
import BioSimSpace as bss
from zipfile import ZipFile
import BioSimSpace as bss
import subprocess

print(bss.__version__)
# 2023.2.2

def run(command):
    process = subprocess.run(command, shell = True, executable = '/bin/bash', stdout=subprocess.PIPE, stderr=subprocess.PIPE, text = True)
    return process

with ZipFile('bug.zip', 'r') as zip:
    zip.extractall()

protein_non_water={
    'gro': 'bug/protein-amber14-all/protein_non_wt.gro',
    'top': 'bug/protein-amber14-all/protein_non_wt.top',}

protein_with_water={
    'gro': 'bug/protein-amber14-all-fussion/protein_with_wt.gro',
    'top': 'bug/protein-amber14-all-fussion/protein_with_wt.top',}

cofactor={
    'gro': 'inputs/crystal-water-tip3p/wt_cofactor.gro',
    'top': 'inputs/crystal-water-tip3p/wt_cofactor.top',

}
print("Initial:\n", run(f"grep COF {cofactor['gro']}").stdout)

prot_non_wt = bss.IO.readMolecules([protein_non_water['gro'],protein_non_water['top']])
crystal_water = bss.IO.readMolecules([cofactor['gro'],cofactor['top']])
all_system = copy.copy(prot_non_wt)
all_system += copy.copy(crystal_water)
bss.IO.saveMolecules('test1', all_system, ["GroTop", "Gro87"])
print("Protein + crystal waters:\n", run(f"grep COF test1.gro").stdout)

prot_with_wt = bss.IO.readMolecules([protein_with_water['gro'],protein_with_water['top']])
bss.IO.saveMolecules('test2', prot_with_wt, ["GroTop", "Gro87"])
print("Protein + crystal waters:\n", run(f"grep COF test2.gro").stdout)

Expected behavior
The coordinates do not change as well the residue name.

Input files
bug.zip

(please complete the following information):

  • OS: Linux (distribution)
    • cat /etc/*-release:
LSB_VERSION=1.4
DISTRIB_ID=Arch
DISTRIB_RELEASE=rolling
DISTRIB_DESCRIPTION="Arch Linux"
NAME="Arch Linux"
PRETTY_NAME="Arch Linux"
ID=arch
BUILD_ID=rolling
ANSI_COLOR="38;2;23;147;209"
HOME_URL="https://www.archlinux.org/"
DOCUMENTATION_URL="https://wiki.archlinux.org/"
SUPPORT_URL="https://bbs.archlinux.org/"
BUG_REPORT_URL="https://bugs.archlinux.org/"
LOGO=archlinux
  • Version of Python: 3.9
  • Version of BioSimSpace: 2023.2.2
  • I confirm that I have checked this bug still exists in the latest released version of BioSimSpace: no
@ale94mleon ale94mleon added the bug label Aug 1, 2023
@lohedges
Copy link
Member

lohedges commented Aug 1, 2023

Hello there,

Thanks for reporting. Could you please repost this issue at the BioSimSpace repository under the OpenBioSim organisation here. This is the new location for development work. This repository only exists in a legacy capacity to support outstanding feature branches that have not yet been ported.

I'm not sure why the coordinates are being set to zero, but I imagine that the residue name is being changed to match the naming convention used for waters by GROMACS, which is SOL. (We generally need to swap water topologies to match the specified format so that simulations can be run by different engines, e.g. if you load AMBER files and want to run with GROMACS.)

@lohedges
Copy link
Member

lohedges commented Aug 1, 2023

I needed to modify a few things in your script in order to get things to work. (Directory names were wrong.) After this, I see the same thing for the coordinates, but I don't see the residue names changing.

The coordinates are being parsed incorrectly because the formatting of your GRO file is incorrect for the crystal waters in isolation, but is fine when they are combined with the protein, i.e.:

this is fine...

tail -n10 bug/bug/protein-amber14-all-fussion/protein_with_wt.gro'
  290NME     H3 4626   0.196   0.917   4.422
 3071COF    OWc    1   2.542  -1.493   2.303
 3071COF    HWc    2   2.597  -1.415   2.274
 3071COF    HWc    3   2.588  -1.539   2.378
 3134COF    OWc    4   1.466  -1.072   1.608
 3134COF    HWc    5   1.489  -0.996   1.669
 3134COF    HWc    6   1.510  -1.156   1.640
 3160COF    OWc    7   1.561  -1.546   2.008
 3160COF    HWc    8   1.570  -1.448   1.991
 3160COF    HWc    9   1.509  -1.587   1.932
   5.33840   5.74670   5.62630

but this isn't...

tail -n11 bug/bug/crystal-water-tip3p/wt_cofactor.gro
9
 3071COF   OWc    1   2.542  -1.493   2.303
 3071COF   HWc    2   2.597  -1.415   2.274
 3071COF   HWc    3   2.588  -1.539   2.378
 3134COF   OWc    4   1.466  -1.072   1.608
 3134COF   HWc    5   1.489  -0.996   1.669
 3134COF   HWc    6   1.510  -1.156   1.640
 3160COF   OWc    7   1.561  -1.546   2.008
 3160COF   HWc    8   1.570  -1.448   1.991
 3160COF   HWc    9   1.509  -1.587   1.932
   5.00000   5.00000   5.00000

If you fix the formatting of the wt_cofactor.gro file then the coordinates are parsed correctly and you end up with the correct values in the output file. I do think that there should be a warning/error on read, though, since it should be clear to the user when something isn't read correctly. Once this issue has been reposted I'll open a related issue for Sire so that we can debug the error handling in the Gro87 parser. Annoyingly GROMACS itself has recently changed the way that certain files are parsed and is inconsistent between tools within the GROMACS suite. Some things that were previously parsed using whitespace as a delimiter now require precise fixed-width formatting. This could be why the file isn't being loaded correctly, i.e we don't just assume the correct number of records per line, but the right layout of those records.

After modifying the crystal water GRO file I get:

tail -n10 test1.gro
 3071COF    OWc 4627   2.542  -1.493   2.303
 3071COF    HWc 4628   2.597  -1.415   2.274
 3071COF    HWc 4629   2.588  -1.539   2.378
 3134COF    OWc 4630   1.466  -1.072   1.608
 3134COF    HWc 4631   1.489  -0.996   1.669
 3134COF    HWc 4632   1.510  -1.156   1.640
 3160COF    OWc 4633   1.561  -1.546   2.008
 3160COF    HWc 4634   1.570  -1.448   1.991
 3160COF    HWc 4635   1.509  -1.587   1.932
   5.33840   5.74670   5.62630

...and

tail -n10 test2.gro
 3071COF    OWc 4627   2.542  -1.493   2.303
 3071COF    HWc 4628   2.597  -1.415   2.274
 3071COF    HWc 4629   2.588  -1.539   2.378
 3134COF    OWc 4630   1.466  -1.072   1.608
 3134COF    HWc 4631   1.489  -0.996   1.669
 3134COF    HWc 4632   1.510  -1.156   1.640
 3160COF    OWc 4633   1.561  -1.546   2.008
 3160COF    HWc 4634   1.570  -1.448   1.991
 3160COF    HWc 4635   1.509  -1.587   1.932
   5.33840   5.74670   5.62630

These are identical, i,e,:

diff <(tail -n10 test1.gro) <(tail -n10 test2.gro)

returns nothing.

@ale94mleon
Copy link
Author

ale94mleon commented Aug 1, 2023

Sorry for the wrong naming. Ok, probably I made a mistake during the creation of the gro file. I made that one manually. The change in name is not in the gro file but in the topology. But I understand why sire changes the name of water inside the topology. It is fair enough. However, a problem raises in case we would like to give a different treatment to crystal water during minimization and equilibration (e.g. restraint them) or analysis.

@lohedges
Copy link
Member

lohedges commented Aug 9, 2023

Closing since this is now tracked in OpenBioSim/biosimspace#148.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants