This repository has been archived by the owner on May 6, 2024. It is now read-only.

Multiple changes before the release. #125

Merged 29 commits on Oct 31, 2023
Commits
03c7fba
added node packages to gitignore
jspaezp Oct 23, 2023
5caa5d7
Merge branch 'feature/bruker_data' of github.com:TalusBio/quantms int…
jspaezp Oct 23, 2023
00e05d2
updated decompression
jspaezp Oct 23, 2023
28f7d5d
changed container for dotd2mqc
jspaezp Oct 23, 2023
fcb2054
Merge branch 'dev' into feature/bruker_data
jspaezp Oct 24, 2023
d49a1de
Merge pull request #312 from TalusBio/feature/bruker_data
ypriverol Oct 24, 2023
f62baf1
added retention
jspaezp Oct 24, 2023
2c3a6b5
Merge branch 'dev' into feature/retain_final_speclib
jspaezp Oct 25, 2023
32bbcf3
update dda and bruker data report
daichengxin Oct 27, 2023
5622769
Update mzml_statistics.py
daichengxin Oct 27, 2023
72a5085
Merge pull request #5 from bigbio/dev
daichengxin Oct 27, 2023
4dac3a1
Merge pull request #313 from TalusBio/feature/retain_final_speclib
ypriverol Oct 27, 2023
a892d81
Merge pull request #314 from daichengxin/dev
ypriverol Oct 27, 2023
30708dc
Merge pull request #1 from bigbio/dev
ypriverol Oct 27, 2023
49fd2ff
Update main.nf
ypriverol Oct 27, 2023
6210ef7
Update main.nf
ypriverol Oct 27, 2023
bf520e1
Update modules/local/openms/thirdparty/searchenginemsgf/main.nf
ypriverol Oct 27, 2023
07f115e
Modify Luciphor2 module.
ypriverol Oct 27, 2023
0568535
Merge pull request #316 from ypriverol/dev
ypriverol Oct 27, 2023
8b31b61
Update CHANGELOG.md
daichengxin Oct 30, 2023
0353f8f
improving doc
daichengxin Oct 30, 2023
105602e
Merge branch 'dev' into dev
ypriverol Oct 30, 2023
b51874e
Merge pull request #319 from daichengxin/dev
ypriverol Oct 30, 2023
0b5c3ad
fixed spectrum reference for bruker data
daichengxin Oct 30, 2023
d118c6c
fixed
daichengxin Oct 30, 2023
92a4cad
Delete test.py
daichengxin Oct 30, 2023
52c54fc
del
daichengxin Oct 30, 2023
bd18f48
Merge branch 'dev' into dev
ypriverol Oct 30, 2023
32292a7
Merge pull request #321 from daichengxin/dev
ypriverol Oct 30, 2023
5 changes: 5 additions & 0 deletions .gitignore
@@ -12,4 +12,9 @@ testing*
 /build/
 results*/
 venv/
+node_modules
+conversion_inputs
+debug_dir
+test_out
+
 lint_log.txt
6 changes: 4 additions & 2 deletions CHANGELOG.md
@@ -15,13 +15,14 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0

 ### `Changed`

-- Update for pmultiqc to pmultiqc=0.0.21
-- Update for openms to openms=3.1.0
+- [#314](https://github.com/bigbio/quantms/pull/314) Update for pmultiqc to pmultiqc=0.0.23
+- [#308](https://github.com/bigbio/quantms/pull/308) Update for openms to openms=3.1.0
 - Update for sdrf-pipelines to sdrf-pipelines=0.0.24
 - Update for msstats to msstats=4.2.1

 ### `Fixed`

+- [#316](https://github.com/bigbio/quantms/pull/316) Fixed jar path selection of luciphoradapter and msgf+
 - Fixed bug where modification masses were not calculated correctly in DIA-NN conversion.
 - Fixed multiple bugs Pull Request [#293 BigBio](https://github.com/bigbio/quantms/pull/293), [#279 BigBio](https://github.com/bigbio/quantms/pull/279), [#265 BigBio](https://github.com/bigbio/quantms/pull/265), [#260 BigBio](https://github.com/bigbio/quantms/pull/260), [#257 BigBio](https://github.com/bigbio/quantms/pull/257)

@@ -36,6 +37,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 - lfq_intensity_threshold: Minimum intensity of a feature to be considered in the MBR algorithm (default: 1000)
 - sage_processes: Number of processes to use in SAGE search engine (default: 1)
 - diann_speclib: Path to the spectral library to use in DIA-NN (default: null)
+- convert_dotd: Whether to convert .d files to mzML (default: false)

 ### `Deprecations`

9 changes: 5 additions & 4 deletions README.md
@@ -55,10 +55,11 @@ On release, automated continuous integration tests run the pipeline on a full-si

 ### DIA-LFQ (data-independent label-free quantification)

-1. RAW file conversion to mzML ([`thermorawfileparser`](https://github.com/compomics/ThermoRawFileParser))
-2. DIA-NN analysis [`dia-nn`](https://github.com/vdemichev/DiaNN/)
-3. Generation of output files (msstats)
-4. QC reports generation [`pmultiqc`](https://github.com/bigbio/pmultiqc)
+1. RAW file conversion to mzML when the input is RAW ([`thermorawfileparser`](https://github.com/compomics/ThermoRawFileParser))
+2. [Optional step](https://github.com/bigbio/quantms/blob/dev/modules/local/tdf2mzml/main.nf): conversion of .d to mzML when the input is Bruker data and `convert_dotd` is set to true
+3. DIA-NN analysis [`dia-nn`](https://github.com/vdemichev/DiaNN/)
+4. Generation of output files (msstats)
+5. QC reports generation [`pmultiqc`](https://github.com/bigbio/pmultiqc)

 ### Functionality overview

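The input-dependent conversion steps in the DIA-LFQ list of the README hunk above can be read as a small decision rule. The sketch below is only an illustration in Python (the pipeline itself is written in Nextflow). The function name `conversion_step`, its return labels, and the assumption that inputs are recognized by file extension are hypothetical and are not taken from the quantms code.

```python
from pathlib import Path
from typing import Optional


def conversion_step(input_path: str, convert_dotd: bool = False) -> Optional[str]:
    """Return the conversion tool that would apply to an input, or None.

    Hypothetical helper: detecting the input type by extension is an assumption.
    """
    suffix = Path(input_path).suffix.lower()
    if suffix == ".raw":
        # Step 1: RAW input is converted to mzML.
        return "thermorawfileparser"
    if suffix == ".d" and convert_dotd:
        # Step 2 (optional): Bruker .d input is converted only when convert_dotd is true.
        return "tdf2mzml"
    # Everything else (mzML, or .d with convert_dotd left at false) needs no conversion.
    return None
```

With the documented default (`convert_dotd: false`), a `.d` input yields `None` here, which mirrors the README wording that the .d conversion is optional.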
16 changes: 11 additions & 5 deletions bin/diann_convert.py
@@ -52,7 +52,7 @@ def convert(ctx, folder, exp_design, dia_params, diann_version, charge, missed_c

     :param folder: DiannConvert specifies the folder where the required file resides. The folder contains
         the DiaNN main report, protein matrix, precursor matrix, experimental design file, protein sequence
-        FASTA file, version file of DiaNN and mzml_info TSVs
+        FASTA file, version file of DiaNN and ms_info TSVs
     :type folder: str
     :param dia_params: A list contains DIA parameters
     :type dia_params: list
@@ -252,8 +252,8 @@ def fasta(self) -> os.PathLike:
         return self.find_suffix_file(".fa")

     @property
-    def mzml_info(self) -> os.PathLike:
-        return self.find_suffix_file("mzml_info.tsv")
+    def ms_info(self) -> os.PathLike:
+        return self.find_suffix_file("ms_info.tsv")

     @property
     def validate_diann_version(self) -> str:
@@ -826,7 +826,7 @@ def mztab_PSH(report, folder, database):
     :type report: pandas.core.frame.DataFrame
     :param folder: DiannConvert specifies the folder where the required file resides. The folder contains
         the DiaNN main report, protein matrix, precursor matrix, experimental design file, protein sequence
-        FASTA file, version file of DiaNN and mzml_info TSVs
+        FASTA file, version file of DiaNN and ms_info TSVs
     :type folder: str
     :param database: Path to fasta file
     :type database: str
@@ -837,7 +837,7 @@ def mztab_PSH(report, folder, database):

     def __find_info(dir, n):
         # This line matches n="220101_myfile", folder="." to
-        # "myfolder/220101_myfile_mzml_info.tsv"
+        # "myfolder/220101_myfile_ms_info.tsv"
         files = list(Path(dir).glob(f"*{n}*_info.tsv"))
         # Check that it matches one and only one file
         if not files:
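The glob pattern in this hunk, `*{n}*_info.tsv`, does not depend on the `mzml_` or `ms_` prefix, which is why only the comment had to change. A minimal standalone sketch of the "one and only one file" lookup follows. The function name `find_info_file` and the exact error handling are assumptions, since the hunk cuts off after `if not files:`.

```python
from pathlib import Path
import tempfile


def find_info_file(directory: str, run_name: str) -> Path:
    """Return the single '*<run_name>*_info.tsv' file found in `directory`."""
    files = list(Path(directory).glob(f"*{run_name}*_info.tsv"))
    if not files:
        # Assumed behavior: fail loudly when nothing matches.
        raise ValueError(f"Could not find an info file for {run_name!r} in {directory}")
    if len(files) > 1:
        # Assumed behavior: an ambiguous match is also an error.
        raise ValueError(f"Found {len(files)} info files for {run_name!r} in {directory}")
    return files[0]


# Usage: the renamed file is still picked up by the unchanged pattern.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "220101_myfile_ms_info.tsv").touch()
    found_name = find_info_file(tmp, "220101_myfile").name
```

Note that a run name that is a substring of another run name (for example `run1` and `run10`) would match several files with this pattern, which is the case the second check guards against.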
@@ -860,6 +860,12 @@ def __find_info(dir, n):
         group.sort_values(by="RT.Start", inplace=True)
         target = target[["Retention_Time", "SpectrumID", "Exp_Mass_To_Charge"]]
         target.columns = ["RT.Start", "opt_global_spectrum_reference", "exp_mass_to_charge"]
+        # Standardize spectrum identifier format for bruker data
+        if type(target.loc[0, "opt_global_spectrum_reference"]) != str:
+            target.loc[:, "opt_global_spectrum_reference"] = "scan=" + target.loc[
+                :, "opt_global_spectrum_reference"
+            ].astype(str)
+
         # TODO seconds returned from precursor.getRT()
         target.loc[:, "RT.Start"] = target.apply(lambda x: x["RT.Start"] / 60, axis=1)
         out_mztab_PSH = pd.concat([out_mztab_PSH, pd.merge_asof(group, target, on="RT.Start", direction="nearest")])
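The added lines prefix non-string spectrum IDs with `scan=` before the info table is matched to the report by nearest retention time. The sketch below reproduces that idea in plain Python so it runs without pandas. It is not the code from the PR: `normalize_spectrum_reference` and `nearest_spectrum` are hypothetical names, the `bisect` lookup is a stand-in for `pd.merge_asof(..., direction="nearest")`, and the type check is applied per value rather than only to the first row as in the hunk.

```python
from bisect import bisect_left


def normalize_spectrum_reference(value) -> str:
    """Return a 'scan=<n>' reference; existing string references pass through unchanged."""
    if isinstance(value, str):
        return value
    return f"scan={value}"


def nearest_spectrum(rt_minutes: float, spectra: list) -> tuple:
    """Return the (rt_minutes, reference) entry whose RT is closest to `rt_minutes`.

    `spectra` must be non-empty and sorted by retention time in minutes.
    """
    rts = [rt for rt, _ in spectra]
    i = bisect_left(rts, rt_minutes)
    # Only the neighbors on either side of the insertion point can be nearest.
    candidates = spectra[max(i - 1, 0): i + 1]
    return min(candidates, key=lambda s: abs(s[0] - rt_minutes))


# Info-file rows as (retention time in seconds, spectrum ID); the integer IDs
# are made-up values standing in for Bruker-derived data.
rows = [(60.0, 1), (120.0, 2), (180.0, 3)]
# Convert seconds to minutes, as the hunk does with RT.Start / 60.
spectra = [(rt_s / 60, normalize_spectrum_reference(sid)) for rt_s, sid in rows]
match = nearest_spectrum(2.1, spectra)
```

Checking each value with `isinstance` avoids depending on row label `0` being present, at the cost of a per-row check. Whether that trade-off suits the real info files is a judgment for the maintainers.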