Commit
Development of PDBManager Class (WIP) (#272)
* add PDB manager #270
* add download method
* add clustering utilities
* Add dataset splits functionality and add new documentation
* [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
* Resolve merge conflicts with remote
* [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
* Remove unused test
* Address lingering SonarCloud concerns
* [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
* add deposition date parsing
* remove pdb.py
* add chain extraction util
* add chain writing method
* [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
* After fixing merge conflicts, add more filters and add time-based splits
* [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
* Fix up SonarCloud concerns
* Improve verbiage surrounding PDB resolutions
* Simplify code and improve variable names
* [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
* Track names of splits in df_splits
* [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
* Fix column naming during merging of DataFrame splits
* add additional properties
* [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
* refactor clustering to allow file caching and overwriting
* [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
* add description to assert statements
* [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
* Add extra documentation around clustering function, and address small formatting issues
* [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
* add method to write selection to CSV
* [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
* improve from_fasta documentation
* [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
* Enable code reuse for length filters
* [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
* Minor documentation changes to FASTA write-out function
* Add ability to perform most API calls for a subset of splits
* [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
* Update .gitignore
* Fix missing download call, and add more documentation to download functions
* [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
* Fix small bug when merging different splits together
* [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
* Fix bug in length filtering functions, fix print bugs in utils, and add ability to write-out PDB files after selecting a subset of chains to include in them
* [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
* Fix string formatting
* Update PDB write-out logic and documentation
* [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
* Add PDB download workaround for PDBs that can no longer be downloaded
* Make exception more specific
* Add TQDM for data split exporting
* Enable PDBManager root to be set to an arbitrary location
* add initial tests
* update changelog
* add tutorial notebook
* Allow all chains in a complex to be exported together
* add module-level import
* Remove old, unused PDBManager prototype file
* add parsing & checks for unavailable PDB structures
* fix download checker
* actually fix download checker
* add availability filter
* Default to export model 1's chains only in PDBManager, and clean-up notebook and utilities
* [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
* add tutorial nblink
* add tutorial to datasets sections
* mv pdb data to ml API
* rm pyg dataset import
* rm unused code
* fix annotation
* add MMTF download format
* refactor dependency utils
* refactor graphein.utils.utils.import_message
* refactor graphein.protein.utils.is_tool
* update .gitignore
* ignore cif too
* ignore cif too
* ignore foldcomp files
* catch straggling erroneous imports
* ignore mol2
* [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
* update folding utils
* add max batch option
* add foldcomp utils
* [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
* add notebook updates [WIP]
* move manager class into graphein.ml
* remove datasets init
* fix import util refactor I didn't catch
* add PDBmanager to __init__
* [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
* fix oligomeric filtering
* update notebook
* fix dataset init
* fix protein.coord renaming in tensor module
* add try/except to pyg-related datasets
* add try/except to pyg-related datasets
* add mmseqs to CI build
* rollback dssp install to conda
* ignore pdb manager notebook in minimal tests
* fix code smell
* fix metrics
* shorten line lengths
* add minimum scipy version
* remove python 3.7 from CI
* Add Torch 2.0.0 to CI
* add note about multiple split strategies
* add torch cluster install to CI
* update dockerfile to torch 2.0
* switch docker pytorch 1.13 for VMD python version conflict
* switch out torchtyping for jaxtyping
* [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
* update tensor shape syntax for jaxtyping
* remove torch-dependent tests from minimal install testing
* update test ignores
* install dssp from apt, rather than conda in docker
* update typing extensions version

---------

Co-authored-by: Arian Jamasb <arjamasb@gmail.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
1 parent 0b24a20, commit 83295a8
Showing 52 changed files with 387,028 additions and 385,129 deletions.
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| | | `@@ -0,0 +1,3 @@` |
| | 1 | `+{` |
| | 2 | `+    "path": "../../../notebooks/creating_datasets_from_the_pdb.ipynb"` |
| | 3 | `+}` |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| | | `@@ -1,5 +1,10 @@` |
| 1 | | `-from .torch_geometric_dataset import (` |
| 2 | | `-    InMemoryProteinGraphDataset,` |
| 3 | | `-    ProteinGraphDataset,` |
| 4 | | `-    ProteinGraphListDataset,` |
| 5 | | `-)` |
| | 1 | `+from .pdb_data import PDBManager` |
| | 2 | `+` |
| | 3 | `+try:` |
| | 4 | `+    from .torch_geometric_dataset import (` |
| | 5 | `+        InMemoryProteinGraphDataset,` |
| | 6 | `+        ProteinGraphDataset,` |
| | 7 | `+        ProteinGraphListDataset,` |
| | 8 | `+    )` |
| | 9 | `+except (NameError, ImportError):` |
| | 10 | `+    pass` |
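The guarded import in the diff above lets `PDBManager` be imported even when the PyTorch Geometric dataset classes cannot be. The following is a minimal, standalone sketch of that optional-dependency pattern, not the repository's code: the helper `try_import`, the flag names, and the module name `hypothetical_missing_dependency_xyz` are illustrative assumptions.

```python
import importlib
from types import ModuleType
from typing import Optional


def try_import(module_name: str) -> Optional[ModuleType]:
    """Return the imported module, or None if it cannot be imported.

    Hypothetical helper for illustration; it mirrors the
    `except (NameError, ImportError)` guard used in the diff.
    """
    try:
        return importlib.import_module(module_name)
    except (NameError, ImportError):
        # ModuleNotFoundError is a subclass of ImportError, so a missing
        # package ends up here instead of crashing the importing module.
        return None


# `json` ships with the standard library, so this import succeeds.
json_module = try_import("json")

# Assumed-absent module name, standing in for an optional dependency.
missing_module = try_import("hypothetical_missing_dependency_xyz")

# Flags that downstream code can check before using optional features.
HAS_JSON = json_module is not None
HAS_MISSING = missing_module is not None
```

Unlike the bare `pass` in the diff, this variant records whether the import succeeded, so callers can raise an informative error when an optional feature is requested rather than hitting a later `NameError`.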