[WIP] Add retip tool suit wrapper(s) #3

Closed
wants to merge 30 commits into from
30 commits
da5b70f
Add retip tool suit wrapper(s) for deployment into TTS.
smartx-usman Sep 7, 2020
73a5b87
add aplcms
xtracko Sep 17, 2020
890ce18
switch to public image
martenson Sep 17, 2020
5a9decb
add xmsannotator
xtracko Sep 18, 2020
93745d1
adjust tool version so it corresponds to the downstream software
martenson Sep 18, 2020
53bd11c
Changed galaxy xml file to use conda biotransformer package
trachtok Sep 21, 2020
0d0df54
update versioning to iuc standard
martenson Sep 22, 2020
459f828
Changed galaxy xml file to use conda biotransformer package
trachtok Sep 21, 2020
0771801
Use proper versioning semantics. Add basic tests and associated files…
smartx-usman Sep 22, 2020
feeab17
Correction of flake8 errors in Python wrapper
trachtok Sep 22, 2020
c6918a6
lint omport order
martenson Sep 22, 2020
8338173
Merge pull request #7 from RECETOX/biot_up
martenson Sep 22, 2020
53a1a67
Remove the unnecesarry data tables
Sep 23, 2020
d5516ef
Populate help section and adapt wrappers to the package API changes
Sep 23, 2020
9c62378
Merge pull request #6 from RECETOX/xmsannotator
xtracko Sep 23, 2020
3ab532b
Migrate towards HDF5 outputs
Sep 23, 2020
5305d1a
Merge pull request #5 from RECETOX/aplcms
xtracko Sep 24, 2020
f4d7aa1
Change error detection policy
Sep 24, 2020
277cb8e
Fix requerement
Sep 24, 2020
b6bca87
Merge pull request #9 from RECETOX/aplcms
xtracko Sep 24, 2020
f92295d
Merge pull request #10 from RECETOX/xmsannotator
xtracko Sep 24, 2020
ef7d82e
Cleaned xml wrapper and added test.
trachtok Sep 24, 2020
c98db22
Repaired broken xml tags
trachtok Sep 24, 2020
4ce18f4
More cleaning of xml wrapper.
trachtok Sep 25, 2020
9465281
Merge pull request #12 from RECETOX/biot_up
martenson Sep 25, 2020
bcd9f6f
Add retip tool suit wrapper(s) for deployment into TTS.
smartx-usman Sep 7, 2020
d30d338
Use proper versioning semantics. Add basic tests and associated files…
smartx-usman Sep 22, 2020
0532b80
Fix tool version.
smartx-usman Sep 29, 2020
11db27e
Remove lines diff for binary files.
smartx-usman Sep 29, 2020
171af41
Merge remote-tracking branch 'origin/master'
smartx-usman Sep 29, 2020
16 changes: 16 additions & 0 deletions tools/retip/.shed.yml
@@ -0,0 +1,16 @@
categories:
- Metabolomics
owner: "recetox"
remote_repository_url: "https://github.com/RECETOX/galaxytools/tree/master/tools/retip"
homepage_url: "https://github.com/PaoloBnn/Retip"
type: unrestricted
auto_tool_repositories:
  name_template: "{{ tool_id }}"
  description_template: "The tool {{ tool_name }} from the Retip tool suite."
suite:
  name: "suite_retip"
  description: "A suite of Retip (Retention Time Prediction for metabolomics) tools."
  long_description: |
    Retip is an R package for predicting the retention time (RT) of small molecules in
    high-pressure liquid chromatography (HPLC) mass spectrometry analysis. Retention
    time calculation can be useful in identifying unknowns and removing false positive annotations.
37 changes: 37 additions & 0 deletions tools/retip/macros.xml
@@ -0,0 +1,37 @@
<macros>
<token name="@VERSION@">1.0.0</token>
<xml name="requirements">
<requirements>
<container type="docker">registry.gitlab.ics.muni.cz:443/recetox/mass-spectrometry/retip:@VERSION@</container>
</requirements>
</xml>
<xml name="citations">
<citations>
<citation type="doi">10.1021/acs.analchem.9b05765</citation>
</citations>
</xml>
<token name="@HELP@"><![CDATA[
Retip is an R package for predicting the retention time (RT) of small molecules in high-pressure liquid
chromatography (HPLC) mass spectrometry analysis. Retention time calculation can be useful in identifying
unknowns and removing false positive annotations. It uses five different machine learning algorithms to build a
stable, accurate and fast RT prediction model:

- Random Forest: a decision-tree algorithm
- BRNN: Bayesian Regularized Neural Network
- XGBoost: extreme gradient boosting for tree algorithms
- lightGBM: a gradient boosting framework that uses tree-based learning algorithms
- Keras: a high-level neural networks API for TensorFlow

Retip also includes useful biochemical databases like: BMDB, ChEBI, DrugBank, ECMDB, FooDB, HMDB, KNApSAcK,
PlantCyc, SMPDB, T3DB, UNPD, YMDB and STOFF.

**Get started**

To use Retip, prepare a compound retention time library. The input file must be a CSV file that
contains the compound name, InChIKey, SMILES code and experimental retention time for each compound.
Retip uses this file to build the model and then predicts retention times for other biochemical
databases or for an input query list of compounds. The file should contain at least 300 compounds
to build a good retention time prediction model.
]]>
</token>
</macros>
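Review note: the help text above describes the training library only in prose (a CSV with name, InChIKey, SMILES and retention time, ideally 300 or more compounds). A minimal pre-flight check, written here as a Python sketch, could catch malformed libraries before they reach the container. The column names `Name`, `InChIKey`, `SMILES` and `RT` are assumptions; the help text does not state the exact headers Retip expects, and the sample values are placeholders.

```python
import csv
import io

# Assumed header names; the help text only lists which kinds of fields are needed.
REQUIRED_COLUMNS = ("Name", "InChIKey", "SMILES", "RT")
RT_COLUMN = "RT"
MIN_COMPOUNDS = 300  # library size suggested in the help text


def check_library(handle, min_rows=MIN_COMPOUNDS):
    """Return a list of human-readable problems found in a CSV library."""
    reader = csv.DictReader(handle)
    header = reader.fieldnames or []
    problems = [f"missing column: {c}" for c in REQUIRED_COLUMNS if c not in header]
    if problems:
        return problems

    n_rows = 0
    # Line numbers are approximate if quoted fields span several lines.
    for line_no, row in enumerate(reader, start=2):
        n_rows += 1
        for column in REQUIRED_COLUMNS:
            value = (row[column] or "").strip()
            if not value:
                problems.append(f"line {line_no}: empty {column}")
            elif column == RT_COLUMN:
                try:
                    float(value)
                except ValueError:
                    problems.append(f"line {line_no}: RT is not a number")
    if n_rows < min_rows:
        problems.append(f"only {n_rows} compounds; at least {min_rows} suggested")
    return problems


if __name__ == "__main__":
    # Placeholder rows, not real measurements.
    sample = io.StringIO("Name,InChIKey,SMILES,RT\nA,KEY-A,CCO,1.5\n")
    for problem in check_library(sample):
        print(problem)
```

The size check is reported as a problem rather than raised as an error because the help text phrases the 300-compound figure as a suggestion, not a hard limit.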
31 changes: 31 additions & 0 deletions tools/retip/retip_apply.xml
@@ -0,0 +1,31 @@
<tool id="retip_apply" name="Retip prediction" version="@VERSION@">
<description>of retention times for metabolomics</description>
<macros>
<import>macros.xml</import>
</macros>
<expand macro="requirements"/>
<command detect_errors="exit_code"><![CDATA[
/run.sh spell.R '$descr_train' '$model_hdf5' '$input' 'SMILES'
]]>
</command>

<inputs>
<param name="descr_train" label="Select Descriptors.Feather Dataset" type="data" format="feather"
optional="false"/>
<param name="model_hdf5" label="Select Model.hdf5 Dataset" type="data" format="h5" optional="false"/>
<param name="input" label="Select Input Dataset" type="data" format="tabular" optional="false"/>
</inputs>

<outputs>
<data format="tabular" name="output" label="Predicted RT" from_work_dir="SMILES"/>
</outputs>
<help><![CDATA[
.. class:: infomark

This tool is used for **Retention Time Prediction** on a whole database.

@HELP@
]]>
</help>
<expand macro="citations"/>
</tool>
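Review note: all three wrappers set `detect_errors="exit_code"`, so a job is treated as failed when the command exits with a non-zero status, and output on stderr alone does not fail it. The Python sketch below mimics that policy for local testing of a command outside Galaxy. The helper name `run_checked` is hypothetical and this is not how Galaxy implements the check internally.

```python
import subprocess
import sys


def run_checked(argv):
    """Run argv and raise if it exits with a non-zero status.

    Mirrors the exit-code policy: stderr output by itself is not an error.
    """
    result = subprocess.run(argv, capture_output=True, text=True)
    if result.returncode != 0:
        raise RuntimeError(
            f"{argv[0]} exited with status {result.returncode}: {result.stderr.strip()}"
        )
    return result.stdout


if __name__ == "__main__":
    # A command that writes to stderr but exits 0 is still accepted.
    run_checked([sys.executable, "-c", "import sys; sys.stderr.write('warning')"])
    # A command that exits non-zero is reported as a failure.
    try:
        run_checked([sys.executable, "-c", "import sys; sys.exit(3)"])
    except RuntimeError as err:
        print(err)
```

This only helps if the R scripts behind `/run.sh` actually exit non-zero on failure, which is worth confirming in the image.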
29 changes: 29 additions & 0 deletions tools/retip/retip_descriptors.xml
@@ -0,0 +1,29 @@
<tool id="retip_descriptors" name="Retip chemical descriptors" version="@VERSION@">
<description>for retention time prediction</description>
<macros>
<import>macros.xml</import>
</macros>
<expand macro="requirements"/>
<command detect_errors="exit_code"><![CDATA[
/run.sh chemdesc.R '$compounds' 'descriptors.feather'
]]>
</command>

<inputs>
<param name="compounds" label="Select Compounds Dataset" type="data" format="tabular" optional="false"/>
</inputs>

<outputs>
<data format="feather" name="output_file1" label="Descriptors.Feather Dataset"
from_work_dir="descriptors.feather"/>
</outputs>
<help><![CDATA[
.. class:: infomark

This tool **computes chemical descriptors** with CDK, a Java-based open-source project for cheminformatics.

@HELP@
]]>
</help>
<expand macro="citations"/>
</tool>
29 changes: 29 additions & 0 deletions tools/retip/retip_train.xml
@@ -0,0 +1,29 @@
<tool id="retip_train" name="Retip training" version="@VERSION@">
<description>the Keras model to predict retention times</description>
<macros>
<import>macros.xml</import>
</macros>
<expand macro="requirements"/>
<command detect_errors="exit_code"><![CDATA[
/run.sh trainKeras.R '$descr_train' 'model.hdf5'
]]>
</command>

<inputs>
<param name="descr_train" label="Select Descriptors.Feather Dataset" type="data" format="feather"
optional="false"/>
</inputs>

<outputs>
<data format="h5" name="output_file2" label="Model.hdf5 Dataset" from_work_dir="model.hdf5"/>
</outputs>
<help><![CDATA[
.. class:: infomark

This tool uses ALMA mater: Advanced Learning Machine Algorithms to **train models**.

@HELP@
]]>
</help>
<expand macro="citations"/>
</tool>
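Review note: the three wrappers form a chain (descriptors, then training, then prediction), but the order is only implied by the matching input and output formats. The Python sketch below spells out the command lines in the order they would run, using the script names and argument order from the `<command>` blocks in this PR. The function name and the input file names are hypothetical, and reading the trailing `'SMILES'` argument of `spell.R` as the output name is an assumption based on the matching `from_work_dir`.

```python
import shlex


def retip_commands(compounds, query,
                   descriptors="descriptors.feather",
                   model="model.hdf5",
                   output="SMILES"):
    """Build the three command lines in the order the tools would be chained."""
    runner = "/run.sh"  # entry point used by all three wrappers
    return [
        # retip_descriptors: compounds table -> descriptors.feather
        [runner, "chemdesc.R", compounds, descriptors],
        # retip_train: descriptors.feather -> model.hdf5
        [runner, "trainKeras.R", descriptors, model],
        # retip_apply: descriptors + model + query table -> predictions
        [runner, "spell.R", descriptors, model, query, output],
    ]


if __name__ == "__main__":
    for argv in retip_commands("library.tsv", "query.tsv"):
        print(shlex.join(argv))
```

Building argument lists rather than shell strings sidesteps quoting problems with dataset paths; the wrappers get the same effect by single-quoting each Cheetah variable.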