Assessment of lemon juice adulteration by UHPLC-QqQ-MS/MS with interactive and interpretable machine learning

yuanboFaith/Lemon_Juice_Classification2

Assessment of lemon juice quality and adulteration by ultra-high performance liquid chromatography/triple quadrupole mass spectrometry with interactive and interpretable machine learning

See the original article published in the Journal of Food and Drug Analysis.

Abstract

A total of 81 lemon juice samples were analyzed using an optimized UHPLC-QqQ-MS/MS method and colorimetric assays. The concentrations of 3 organic acids (ascorbic acid, malic acid and citric acid), 3 saccharides (glucose, fructose and sucrose) and 6 phenolic acids (trans-p-coumaric acid, 3-hydroxybenzoic acid, 4-hydroxybenzoic acid, 3,4-dihydroxybenzoic acid, caffeic acid) were quantified. Total polyphenol content, antioxidant activity and ferric reducing antioxidant power were also measured. To classify authentic lemon juices, adulterated lemon juices and commercially sourced lemonade beverages from the acquired metabolic profile, machine learning models including linear discriminant analysis, Gaussian naïve Bayes, lasso-regularized logistic regression, random forest (RF) and support vector machine were developed using a training (70%), cross-validation and testing (30%) workflow. The prediction accuracy on the testing set was 73–86% across the models. Individual conditional expectation analysis (how predicted probabilities change as the magnitude of a feature changes) was applied for model interpretation; in particular, it revealed a close association between the RF probability predictions and nuanced characteristics of the density distribution of the metabolic features. Using the established models, an open-source online dashboard was constructed for convenient classification and interactive visualization in practice.
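As a rough illustration of the 70%/30% split and the individual conditional expectation (ICE) idea described in the abstract, the sketch below uses base R only. It is not the published pipeline: the data are simulated, the feature names `citric_acid` and `sucrose` and the label `authentic` are hypothetical stand-ins, and a plain (unregularized) logistic regression is swapped in for the models used in the article.

```r
set.seed(1)

# Simulated stand-in data (NOT the study's measurements).
n <- 100
dat <- data.frame(
  citric_acid = rnorm(n, mean = 50, sd = 10),  # hypothetical feature
  sucrose     = rnorm(n, mean = 5,  sd = 2)    # hypothetical feature
)
# Hypothetical label: 1 = authentic, 0 = adulterated.
dat$authentic <- rbinom(
  n, 1, plogis(0.15 * (dat$citric_acid - 50) - 0.4 * (dat$sucrose - 5))
)

# 70% training / 30% testing split.
train_idx <- sample(seq_len(n), size = round(0.7 * n))
train <- dat[train_idx, ]
test  <- dat[-train_idx, ]

# Plain logistic regression as a simple stand-in classifier.
fit <- glm(authentic ~ citric_acid + sucrose, data = train, family = binomial)

# Accuracy on the held-out testing set.
pred <- as.integer(predict(fit, newdata = test, type = "response") > 0.5)
accuracy <- mean(pred == test$authentic)

# ICE: for each test sample, sweep one feature over a grid while holding
# that sample's other feature values fixed, and record the predicted
# probability at each grid point.
grid <- seq(min(dat$citric_acid), max(dat$citric_acid), length.out = 20)
ice <- sapply(seq_len(nrow(test)), function(i) {
  newdata <- test[rep(i, length(grid)), ]
  newdata$citric_acid <- grid
  predict(fit, newdata = newdata, type = "response")
})
# `ice` has one row per grid value and one column per test sample.

# One curve per sample; drawn only in an interactive session.
if (interactive()) {
  matplot(grid, ice, type = "l", lty = 1, col = "grey40",
          xlab = "citric_acid (simulated)", ylab = "Predicted P(authentic)")
}
```

Each column of `ice` is one sample's curve. Averaging across columns with `rowMeans(ice)` would give the corresponding partial dependence curve.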

Script Reference

The R script in this documentation covers the data wrangling, visualization, machine learning modeling, and Shiny app construction developed for the original publication. See here for the script and its associated output.

The R code was developed with reference to R for Data Science (2e), the official tidyverse documentation, and DataBrewer.co. The modules break down as follows:

  • Data visualization with ggplot2 (a tutorial on the fundamentals, and a data visualization gallery).

  • Data wrangling with the following packages and tools: tidyr: transform (e.g., pivot) the dataset into a tidy structure; dplyr: the basic tools for working with data frames; stringr: work with strings; regular expressions: search for and match string patterns; purrr: functional programming (e.g., iterating functions across the elements of columns); and tibble: work with data frames in the modern tibble structure.
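To make the role of each wrangling package concrete, here is a minimal sketch on a toy table. The sample IDs, group labels, and concentration values are hypothetical and are not taken from the study's dataset. It assumes the dplyr, tidyr, stringr, purrr, and tibble packages are installed.

```r
library(dplyr)
library(tidyr)
library(stringr)
library(purrr)
library(tibble)

# tibble: a small wide table with hypothetical values, one row per sample.
juice <- tibble(
  sample_id     = c("S1", "S2", "S3", "S4"),
  group         = c("authentic", "authentic", "adulterated", "adulterated"),
  `citric acid` = c(48.2, 51.0, 20.5, 18.9),
  sucrose       = c(4.1, 3.8, 9.7, 10.4)
)

# tidyr: pivot to a tidy long format with one row per sample-compound pair.
# stringr + regular expression: replace whitespace in compound names.
long <- juice %>%
  pivot_longer(
    cols      = -c(sample_id, group),
    names_to  = "compound",
    values_to = "concentration"
  ) %>%
  mutate(compound = str_replace_all(compound, "\\s+", "_"))

# dplyr: mean concentration for each group and compound.
group_means <- long %>%
  group_by(group, compound) %>%
  summarise(mean_concentration = mean(concentration), .groups = "drop")

# purrr: iterate a function over the numeric columns of the wide table.
col_means <- map_dbl(select(juice, where(is.numeric)), mean)

print(group_means)
```

The long format is the shape ggplot2 typically expects when faceting or coloring by compound, which is why pivoting usually comes before plotting.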
