Structure with 1 (or odd number) of Cysteine breaks add_salt_bridges #241

Closed
universvm opened this issue Nov 30, 2022 · 12 comments

@universvm

Describe the bug
Converting a structure that contains one cysteine (or any odd number of cysteines) causes graphein to raise a ValueError.

To Reproduce
Pass a PDB file containing a single cysteine to construct_graph with add_salt_bridges among the edge construction functions.

  File "{REDACTED}.conda/envs/3dtcr/lib/python3.8/site-packages/graphein/protein/graphs.py", line 587, in compute_edges
    func(G)
  File "{REDACTED}.conda/envs/3dtcr/lib/python3.8/site-packages/graphein/protein/edges/distance.py", line 753, in add_salt_bridges
    distmat = compute_distmat(salt_bridge_df)
  File "{REDACTED}.conda/envs/3dtcr/lib/python3.8/site-packages/graphein/protein/edges/distance.py", line 74, in compute_distmat
    eucl_dists.index = pdb_df.index
  File "{REDACTED}.conda/envs/3dtcr/lib/python3.8/site-packages/pandas/core/generic.py", line 5915, in __setattr__
    return object.__setattr__(self, name, value)
  File "pandas/_libs/properties.pyx", line 69, in pandas._libs.properties.AxisProperty.__set__
  File "{REDACTED}.conda/envs/3dtcr/lib/python3.8/site-packages/pandas/core/generic.py", line 823, in _set_axis
    self._mgr.set_axis(axis, labels)
  File "{REDACTED}.conda/envs/3dtcr/lib/python3.8/site-packages/pandas/core/internals/managers.py", line 230, in set_axis
    self._validate_set_axis(axis, new_labels)
  File "{REDACTED}.conda/envs/3dtcr/lib/python3.8/site-packages/pandas/core/internals/base.py", line 70, in _validate_set_axis
    raise ValueError(
ValueError: Length mismatch: Expected axis has 1 elements, new values have 0 elements

Expected behavior
It would be clearer if there were a check on the number of cysteines available, rather than a generic ValueError.

Desktop (please complete the following information):

  • OS: Linux
  • Python Version: 3.8
@universvm
Author

The same thing happens with add_aromatic_interactions.

@a-r-j
Owner

a-r-j commented Dec 17, 2022

Hi @universvm do you have an example PDB code / code to reproduce? I'm not fully convinced it's due to the Cysteines.

@universvm
Author

Hi @a-r-j

I think this happens because there is no distance between cysteines if there is only one cysteine, hence the "new values have 0 elements".

Have a look at these two files generated with ESMFold

Archive.zip
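
To illustrate the hypothesis in this comment only: a single point has no partner, so a pairwise distance computation over one residue yields nothing. The sketch below uses the standard library and a hypothetical helper `pairwise_distances`. It is not how graphein's compute_distmat works and it does not confirm the cause of the traceback.

```python
from itertools import combinations
from math import dist
from typing import Dict, Sequence, Tuple

Point = Tuple[float, float, float]


def pairwise_distances(coords: Sequence[Point]) -> Dict[Tuple[int, int], float]:
    """Return {(i, j): Euclidean distance} for every unordered pair of points."""
    return {
        (i, j): dist(coords[i], coords[j])
        for i, j in combinations(range(len(coords)), 2)
    }


if __name__ == "__main__":
    # One point: there are no pairs, so the result is empty.
    print(pairwise_distances([(0.0, 0.0, 0.0)]))
    # Two points: exactly one pair.
    print(pairwise_distances([(0.0, 0.0, 0.0), (3.0, 4.0, 0.0)]))
```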

@a-r-j
Owner

a-r-j commented Dec 18, 2022

Hmm, I'm struggling to reproduce this error on graphein 1.5.2

import graphein
import graphein.protein as gp

print(graphein.__version__)

config = gp.ProteinGraphConfig(
    edge_construction_functions=[gp.add_salt_bridges, gp.add_aromatic_interactions]
)

g = gp.construct_graph(pdb_path="1CYS.pdb", config=config)
print(g.nodes)

for u, v, d in g.edges(data=True):
    print(u, v, d)
    
gp.add_salt_bridges(g)

for u, v, d in g.edges(data=True):
    print(u, v, d)
    
gp.add_aromatic_interactions(g)

for u, v, d in g.edges(data=True):
    print(u, v, d)

@universvm
Author

Can you try with this sequence instead?
CIVRAPGRADMRF.pdb.zip

@a-r-j
Owner

a-r-j commented Dec 21, 2022

Runs fine for me 😕

@universvm
Author

This is the portion of my code I'm allowed to share:

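import graphein.protein as gp
from graphein.protein import (
    add_disulfide_interactions,
    add_dssp_feature,
    add_hydrogen_bond_interactions,
    add_ionic_interactions,
    add_peptide_bonds,
    add_salt_bridges,
    add_vdw_interactions,
    amino_acid_one_hot,
    construct_graph,
    hydrogen_bond_acceptor,
    hydrogen_bond_donor,
    meiler_embedding,
)

# Defined before graphein_params, which refers to it.
edge_labels = [{"peptide_bonds","hydrogen_bond","disulfide","ionic","vdw","salt_bridges"}]
graphein_params = {
    "edge_construction_functions": [
        add_peptide_bonds,
        add_hydrogen_bond_interactions,
        add_disulfide_interactions,
        add_ionic_interactions,
        add_vdw_interactions,
        add_salt_bridges,
    ],
    "edge_labels": edge_labels,
    "node_metadata_functions": [
        meiler_embedding,
        amino_acid_one_hot,
        hydrogen_bond_donor,
        hydrogen_bond_acceptor,
    ],
    "dssp_config": gp.DSSPConfig(),
}

# Assumed: the config is built from the params above (not in the shared excerpt).
config = gp.ProteinGraphConfig(**graphein_params)
# path_to_pdb is defined elsewhere (not in the shared excerpt).
g = construct_graph(config=config, pdb_path=str(path_to_pdb))
g = g.to_undirected()
# Add DSSP features - in the future this will be done in configs https://github.com/a-r-j/graphein/issues/239
g = add_dssp_feature(g, feature="phi")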

The code breaks before I get to the g = g.to_undirected()

@a-r-j
Owner

a-r-j commented Dec 23, 2022

I still can't reproduce this in a clean environment with any of the provided files:

# !pip install graphein
# !conda install -c salilab dssp
import graphein.protein as gp

#path_to_pdb = "CIVRAPGRADMRF.pdb"
#path_to_pdb = "2CYS.pdb"
path_to_pdb = "1CYS.pdb"

edge_labels = [{"peptide_bonds","hydrogen_bond","disulfide","ionic","vdw","salt_bridges"}]
graphein_params = {
    "edge_construction_functions": [
        gp.add_peptide_bonds,
        gp.add_hydrogen_bond_interactions,
        gp.add_disulfide_interactions,
        gp.add_ionic_interactions,
        gp.add_vdw_interactions,
        gp.add_salt_bridges,
    ],
    "edge_labels": edge_labels,
    "node_metadata_functions": [
        gp.meiler_embedding,
        gp.amino_acid_one_hot,
        gp.hydrogen_bond_donor,
        gp.hydrogen_bond_acceptor,
    ],
    "dssp_config": gp.DSSPConfig(),
}

config = gp.ProteinGraphConfig(**graphein_params)

g = gp.construct_graph(config=config, pdb_path=str(path_to_pdb))
g = g.to_undirected()
# Add DSSP features - in the future this will be done in configs https://github.com/a-r-j/graphein/issues/239
g = gp.add_dssp_feature(g, feature="phi")

@universvm Could you confirm your Python and graphein versions? I tested the above code with graphein 1.5.2 (and from main) on Python 3.9.

@universvm
Author

I'm using: graphein 1.5.2 and Python 3.8.13

Apologies, I'm currently away and don't have access to a good internet connection. I'll try it out again when I'm back. The issue seemed to occur only when I used add_salt_bridges and add_aromatic_interactions on sequences that had an odd number of either of the relevant residues.

Will try again in a week or so!

@universvm
Author

Hey @a-r-j, I am able to reproduce the bug in the original post with this structure and the code you wrote:
CAVSGSGQFYF.pdb.zip

@a-r-j
Owner

a-r-j commented Jan 2, 2023

I've managed to reproduce it too! The problem is that this was already fixed in #220, but the fix has not been pushed to PyPI yet.

I'll make a release this week with some new updates. In the meantime, installing from master should do the trick :)
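
If installing from master is not an option, one possible stopgap is to wrap an edge function so that the ValueError from the traceback is logged rather than propagated. This is only a sketch: `skip_on_value_error` is a hypothetical helper, not part of graphein, and it assumes edge functions are called as `func(G)` and modify the graph in place, as the traceback suggests. It would also hide any other ValueError the function raises, and a function that fails partway may leave the graph partially modified.

```python
import functools
import logging
from typing import Any, Callable

log = logging.getLogger(__name__)


def skip_on_value_error(edge_func: Callable[..., Any]) -> Callable[..., Any]:
    """Wrap an edge function so a ValueError is logged instead of raised."""

    @functools.wraps(edge_func)
    def wrapper(G: Any, *args: Any, **kwargs: Any) -> Any:
        try:
            return edge_func(G, *args, **kwargs)
        except ValueError as err:
            # Leave the graph as it is and record which function was skipped.
            log.warning("%s skipped: %s", edge_func.__name__, err)
            return None

    return wrapper


if __name__ == "__main__":
    def failing_edges(G):
        # Stand-in for an edge function that hits the reported error.
        raise ValueError("Length mismatch")

    logging.basicConfig(level=logging.WARNING)
    skip_on_value_error(failing_edges)({})
```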

@a-r-j
Owner

a-r-j commented Mar 18, 2023

Now resolved in 1.6.0

pip install graphein==1.6.0
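
To confirm that an environment has picked up the fixed release, something like the following could be used. It is a rough sketch using only the standard library. `parse_release`, `has_minimum` and `installed_version` are hypothetical helpers, and the comparison looks only at the leading numeric components of the version string, ignoring pre-release tags.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional, Tuple


def parse_release(v: str) -> Tuple[int, ...]:
    """Parse leading numeric components, e.g. '1.6.0' -> (1, 6, 0)."""
    parts = []
    for piece in v.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def has_minimum(installed: str, required: str) -> bool:
    """True if `installed` is at least `required` (numeric parts only)."""
    a, b = parse_release(installed), parse_release(required)
    n = max(len(a), len(b))
    # Pad with zeros so that '1.6' and '1.6.0' compare as equal.
    return a + (0,) * (n - len(a)) >= b + (0,) * (n - len(b))


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_version("graphein")
    if found is None:
        print("graphein is not installed")
    elif not has_minimum(found, "1.6.0"):
        print(f"graphein {found} predates 1.6.0; consider upgrading")
    else:
        print(f"graphein {found} includes the 1.6.0 release")
```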

@a-r-j a-r-j closed this as completed Mar 18, 2023