Lacterd/HeuSMA


HeuSMA

A Heuristic Strategy for Metabolomics Analysis based on multiple chromatographic gradients to enhance metabolite coverage in untargeted metabolomics analysis.

Research background

Metabolomics is a vital tool in systems biology, providing deep insights into metabolic processes. It aims to catalog and quantify metabolites in biological systems, aiding in understanding metabolic diversity and complex biochemical networks. Traditional untargeted metabolomics using liquid chromatography-mass spectrometry (LC-MS) achieves less than 5% coverage due to the diverse and complex nature of metabolites. To address this, we propose a heuristic strategy (HeuSMA) that uses multiple chromatographic gradients to enhance coverage. HeuSMA integrates LC-MS data, optimizes peak picking, and increases the acquisition rate of high-quality MS/MS spectra. To enhance usability, we have developed two programs, the Heuristic Peak List Generator (HPLG) and the Heuristic Processor (HP), that simplify the analysis by integrating and automating all steps from the import of raw data to the export of final results.

System requirements

The software was developed with Python 3.10.

Installation

Clone the code from this page, or download the released software (recommended).

Tutorial

HPLG generates the heuristic peak list through three main functions: selecting the LC-MS data and the peak lists, choosing an Excel file containing the RI calibrants list, and configuring the computational parameters. HP supports heuristic peak picking and the assignment of MS/MS spectra to their corresponding peaks. It provides five essential functions: selecting the LC-MS data, specifying the data type, choosing an Excel file containing the RI calibrants list, selecting the heuristic peak list file, and configuring the computational parameters.

Tutorial for HPLG:

File Preparation:

Running HPLG requires mzML data, an Excel file containing the calibrants list, and Excel files containing the peak lists (see Preparation file format). Before building the heuristic peak list, make sure all peak list files and mzML files are stored in the same folder. The peak list file and the mzML file from the same LC gradient must share the same base name (e.g., sample1.xlsx & sample1.mzML).
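The naming rule above can be checked before launching HPLG. The following is a minimal sketch, not part of HeuSMA itself; the function name `find_unpaired_mzml` is hypothetical, and it only assumes that peak lists use the `.xlsx` extension and raw data the `.mzML` extension, as in the example names above.

```python
from pathlib import Path


def find_unpaired_mzml(folder):
    """Return names of mzML files that lack a same-named .xlsx peak list."""
    folder = Path(folder)
    missing = []
    for mzml in sorted(folder.glob("*.mzML")):
        # HPLG expects e.g. sample1.mzML to sit next to sample1.xlsx
        if not mzml.with_suffix(".xlsx").exists():
            missing.append(mzml.name)
    return missing
```

An empty list means every mzML file in the folder has a matching peak list file.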

Selecting Data of Samples:

  1. Click the top “Select” button and choose an mzML file. The corresponding peak list file will be selected automatically.
  2. All selected files will be displayed in the table next to the top “Select” button.

Selecting the Calibrants File:

Click the second “Select” button and choose the calibrants file in *.xlsx format. The name of the selected file will be displayed on the left side.

Setting Parameters:

Enter the MS1 Tolerance and RI Tolerance values directly into their corresponding input boxes.
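To illustrate how two such tolerances typically work together, here is a minimal sketch of a peak comparison. It is not taken from the HPLG source code: the function `peaks_match` is hypothetical, and it assumes both tolerances are absolute values (m/z units for MS1 Tolerance, retention index units for RI Tolerance). The actual units and matching logic in HPLG may differ.

```python
def peaks_match(mz_a, ri_a, mz_b, ri_b, ms1_tol, ri_tol):
    """Return True if two peaks agree within both tolerances.

    Assumption: both tolerances are absolute differences, not ppm or percent.
    """
    return abs(mz_a - mz_b) <= ms1_tol and abs(ri_a - ri_b) <= ri_tol
```

Under this assumption, tightening either tolerance reduces the number of peaks from different gradients that are treated as the same feature.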

Running the Program:

Click the “Run” button to start the establishment of the heuristic peak list.

Output Results:

After the process is completed, an Excel file containing the heuristic peak list and its corresponding pickle file will be exported to the folder where the mzML files are located. These files are intended for subsequent data analysis and processing.
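If you want to inspect the exported pickle file outside the GUI, it can be opened with Python's standard `pickle` module. This is only a sketch: the helper name `load_peak_list` is hypothetical, and the structure of the stored object is not documented here, so the example simply returns whatever was saved. If the file stores custom classes, the HeuSMA code may need to be importable when loading. Only unpickle files you generated yourself, since loading a pickle can execute arbitrary code.

```python
import pickle
from pathlib import Path


def load_peak_list(path):
    """Load the object stored in a pickle file (only use trusted files)."""
    with Path(path).open("rb") as handle:
        return pickle.load(handle)
```

Calling `type(load_peak_list(...))` on the exported file is a quick way to see what kind of object HPLG saved.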

Tutorial for HP:

File Preparation:

Running HP requires mzML data, an Excel file containing the calibrants list (see Preparation file format), and the pickle file of the heuristic peak list generated by HPLG.

Selecting Sample Files:

Click the top “Select” button to choose the mzML files. All selected files will be displayed in the adjacent table. In the “Type” column, click the button to choose the data type for each file; the default type is “Sample”. Group information and the analytical order can be set by typing in the corresponding boxes in the “Group” column.

Selecting the Calibrants List:

Click the middle “Select” button to choose the calibrants list file in *.xlsx format. The name of the selected file will be displayed on the left side.

Selecting the Heuristic Peak List:

Click the last “Select” button to choose the heuristic peak list file in *.pkl format, generated by the HPLG software. The name of the selected file will also be displayed on the left side.

Setting Parameters:

All parameters can be set by typing directly in the corresponding boxes.

Running the Program:

Click the “Run” button to start the whole program.

Output Results:

The alignment list and peak list files will be exported to the folder containing the mzML files. A *.csv file and a *.mgf file will be exported as well.
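The exported *.mgf file can be read by most MS/MS tools. For a quick look in Python, the sketch below parses the general Mascot Generic Format layout (`BEGIN IONS` / `END IONS` blocks, `KEY=VALUE` parameter lines, and whitespace-separated m/z and intensity pairs). The function `read_mgf` is hypothetical, and which parameter keys HP actually writes is not specified here, so parameters are kept as plain strings.

```python
def read_mgf(lines):
    """Parse MGF text lines into a list of {"params", "peaks"} dicts."""
    spectra = []
    current = None
    for raw in lines:
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        if line == "BEGIN IONS":
            current = {"params": {}, "peaks": []}
        elif line == "END IONS":
            if current is not None:
                spectra.append(current)
            current = None
        elif current is not None:
            if "=" in line:
                # parameter line such as TITLE=... or PEPMASS=...
                key, value = line.split("=", 1)
                current["params"][key] = value
            else:
                parts = line.split()
                if len(parts) >= 2:
                    # peak line: m/z followed by intensity
                    current["peaks"].append((float(parts[0]), float(parts[1])))
    return spectra
```

The function accepts any iterable of lines, so an open file handle can be passed directly.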

Preparation file format

Excel file of calibrants list for HPLG and HP:

The first row of the Excel file must contain the column titles “C” and “m/z”. The “C” column gives the carbon number of the fatty acid chain of each calibrant, and the “m/z” column gives the corresponding m/z value.

C m/z
3 132.0655
4 146.0812
... ...

Excel file of Peak list for HPLG:

The first row of the Excel file must contain the column titles: the m/z column must be titled “m/z” and the retention time column must be titled “RT”.

m/z RT
m/z - 1 RT - 1
m/z - 2 RT - 2
... ...
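The header requirements for both input files can be verified with a small check before running the software. This is a sketch rather than part of HeuSMA: `missing_columns` and the two constants are hypothetical names, and the header row is assumed to have been read already (for example, as the first row returned by any Excel reader).

```python
# Required column titles, as described in the two format sections above
CALIBRANT_COLUMNS = ("C", "m/z")
PEAK_LIST_COLUMNS = ("m/z", "RT")


def missing_columns(header, required):
    """Return the required column titles that are absent from the header row."""
    # Ignore empty cells and surrounding whitespace in the header cells
    present = {str(cell).strip() for cell in header if cell is not None}
    return [name for name in required if name not in present]
```

For example, a peak list whose first row is `["mz", "RT"]` would be reported as missing the “m/z” column.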
