disordered region query #2239

Closed
ValWood opened this issue Sep 25, 2024 · 10 comments

Comments

@ValWood
Member

ValWood commented Sep 25, 2024

Pro1 has a "disordered region" feature of 2 amino acids. This seems odd to me. I'm not sure you can call disordered for 2 amino acids.
What is the source of this? (I think they come from InterPro, but I don't see it on the InterPro website.)
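
A quick way to find other cases like this is to list every disordered region below some minimum length. The following is only a minimal sketch: the `Feature` class, the 5-residue cutoff and the coordinates are all hypothetical illustrations, not taken from the real data or from any existing code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Feature:
    """A protein feature with 1-based, inclusive coordinates (hypothetical model)."""
    gene: str
    kind: str
    start: int
    end: int

    @property
    def length(self) -> int:
        return self.end - self.start + 1


# Assumed cutoff for illustration only; not an established threshold.
MIN_DISORDER_LENGTH = 5


def short_disordered(features, min_length=MIN_DISORDER_LENGTH):
    """Return disordered regions shorter than min_length residues."""
    return [
        f for f in features
        if f.kind == "disordered region" and f.length < min_length
    ]


# Made-up coordinates, used only to exercise the filter.
features = [
    Feature("pro1", "disordered region", 10, 11),
    Feature("wee1", "disordered region", 1, 120),
    Feature("wee1", "low complexity region", 30, 32),
]

for f in short_disordered(features):
    print(f.gene, f.start, f.end, f.length)
```

With the made-up data above, only the 2 amino acid pro1 region is reported.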

@kimrutherford
Member

The coiled coils, disordered regions and low complexity regions came from Pfam while it still existed:

That two AA region has type "IUPred" in the data from Pfam. Probably from this tool: https://iupred3.elte.hu/
(Although I tried the protein on the site just now and it didn't predict anything.)

@ValWood

This comment was marked as outdated.

@kimrutherford
Member

Do we have other sources of disordered region?

We don't have any other sources at the moment. I'm still trying and failing to run the InterPro pipeline locally. But I don't know if that has any disorder or complexity output.

This seems a bit flaky (don't we also get them from InterPro?)

They all come from Pfam (3 years ago) not from InterPro.

@ValWood

This comment was marked as outdated.

@kimrutherford
Member

If we can't locate them,

They are definitely from the data I grabbed from Pfam before it disappeared.

I will ask InterPro if they can include them in InterProscan.

InterPro does provide disordered regions in a file (but not low complexity regions). I found an old email titled "[pfam-help #751585] Pfam-N" where you asked InterPro about it. They said:

From the InterPro FTP: extra.xml.gz contains all sequence features, such as transmembrane regions predicted by TMHMM, intrinsically disordered regions predicted by MobiDB-lite, etc. Pfam-N regions are also included in this file. There are more Pfam-N regions than in the first file as it's based on UniProtKB 2024_03.

I think we talked about it and decided it was still a problem because they have out-of-date protein sequences, so the positions will sometimes be wrong. I'm looking at installing InterPro locally.
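
One way to detect the stale-coordinate problem would be to compare a checksum of our current protein sequence with a checksum of the sequence the predictions were made on, and only trust positions when they match. This is a rough sketch under that assumption: the function names are hypothetical, the sequences are made up, and it presumes we can get the source sequence (or its checksum) alongside the features.

```python
import hashlib


def seq_md5(seq: str) -> str:
    """MD5 of a protein sequence, ignoring case and surrounding whitespace."""
    return hashlib.md5(seq.strip().upper().encode("ascii")).hexdigest()


def coordinates_trustworthy(current_seq: str, source_seq: str) -> bool:
    """True if the predictions were made on the same sequence we have now."""
    return seq_md5(current_seq) == seq_md5(source_seq)


# Made-up sequences, for illustration only.
current = "MSTPLKRAAQ"
same_as_current = "mstplkraaq\n"
outdated = "MSTPLKAAQ"

print(coordinates_trustworthy(current, same_as_current))
print(coordinates_trustworthy(current, outdated))
```

Features whose source sequence no longer matches could then be dropped or recomputed instead of being shown at possibly wrong positions.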

@kimrutherford
Member

Pro1 has a "disordered region" feature of 2 amino acids.

InterProScan doesn't find any disordered regions in pro1

I see similar here in wee1

The 2 AA match for wee1 is gone too.

The predictions from InterProScan are quite different from the predictions from Pfam.

This is from Pfam for wee1:

image

Here's what we get from InterProScan, on the same scale:

image

A bit worrying maybe?

The screenshots are from: https://desktop.kmr.nz/gene/SPCC18B5.03

@ValWood
Member Author

ValWood commented Oct 7, 2024

I think it is OK; wee1 is largely disordered outside of the kinase domain:

Screenshot 2024-10-07 at 07 16 46

I guess InterPro have chosen the algorithm they believe performs the best. I wonder if this incorporates AlphaFold data (that would be the sensible way to do it...)

@ValWood
Member Author

ValWood commented Oct 10, 2024

I think we can close this, it's covered by [MobiDB-lite] features.
I was just going to send an announcement about that, but they aren't live yet. Which ticket is that? (That's what I was looking for when I found this.)

It seems from this ticket that the old disordered regions were from various sources that were no longer reproducible:

"That two AA region has type "IUPred" in the data from Pfam. Probably from this tool: https://iupred3.elte.hu/
(Although I tried the protein on the site just now and it didn't predict anything.)"

@ValWood ValWood closed this as completed Oct 10, 2024
@kimrutherford
Member

I guess the new domains panel can go together with the switch

That's what I was thinking. There are a lot of interrelated changes in various places so it's easier just to get everything done and released at once rather than bit by bit.
