Additional adduct forms for crest #2

tobigithub · 2020-02-18T19:03:55Z

Is your feature request related to a problem? Please describe.
The current crest version (Version 2.8.1, Fri 20. Dec 13:44:46 CET 2019) only allows for the protonated [M+H]+ and deprotonated [M-H]- adduct forms. While these are the most important adducts in electrospray mass spectrometry (ESI-MS), they are only two out of 300 different possible ions that are routinely observed. In the past, many of these other adducts were simply ignored or considered less important. With the advent of accurate mass spectrometry, many experiments and software solutions now report all additional species.

Describe the solution you'd like
A crest version that allows additional adduct ions to be selected on the command line, in the parameter file, or in the input file.

Describe alternatives you've considered
None.

If possible state how you can assist in providing data or code to implement the feature
I can provide a list of the most common adducts for ESI-MS, based on thousands of spectra.

Additional context
Maybe starting with the most common ones in the list that do not require additional complex coding would be good, basically replacing {H} with {Na} or similar. That means complicated species with multiple ions or water losses could be ignored in the beginning, while those such as [M+Na]+, [M+Li]+, [M.Cl]- or [M+NH4]+ could be easily added. The list contains 300 adducts; I only added the top 30.

| #  | Adduct name  | Count | Percent  |
|----|--------------|-------|----------|
| 1  | [M+H]+       | 98521 | 62.55381 |
| 2  | [M+2H]2+     | 18025 | 11.44459 |
| 3  | [M+H-H2O]+   | 13822 | 8.77598  |
| 4  | [M-H]-       | 9847  | 6.25214  |
| 5  | [M+Na]+      | 8679  | 5.51055  |
| 6  | [M+H-NH3]+   | 1882  | 1.19494  |
| 7  | [M+NH4]+     | 1161  | 0.73715  |
| 8  | [M-H-H2O]-   | 545   | 0.34604  |
| 9  | [M-H+2Na]+   | 519   | 0.32953  |
| 10 | [M-H+H2O]-   | 386   | 0.24508  |
| 11 | [M+NH4-H2O]+ | 362   | 0.22984  |
| 12 | [M+H+H2O]+   | 306   | 0.19429  |
| 13 | [M+H+Na]2+   | 288   | 0.18286  |
| 14 | [M+H+K]2+    | 276   | 0.17524  |
| 15 | [M-2H]2-     | 220   | 0.13968  |
| 16 | [M+2Na]2+    | 217   | 0.13778  |
| 17 | [M+2H-NH3]2+ | 216   | 0.13714  |
| 18 | [M+K]+       | 215   | 0.13651  |
| 19 | [M+H-2H2O]+  | 186   | 0.11810  |
| 20 | [M+3H]3+     | 105   | 0.06667  |
| 21 | [M+2H-H2O]2+ | 102   | 0.06476  |
| 22 | [M]+.        | 93    | 0.05905  |
| 23 | [M+2Na-H]+   | 81    | 0.05143  |
| 24 | [M-H+2K]+    | 80    | 0.05079  |
| 25 | [M+H-CO]+    | 73    | 0.04635  |
| 26 | [M+H-CO2]+   | 68    | 0.04318  |
| 27 | [M+H-CH2O2]+ | 60    | 0.03810  |
| 28 | [M-H-NH3]-   | 59    | 0.03746  |
| 29 | [M.Cl]-      | 56    | 0.03556  |
| 30 | [M+Li]+      | 49    | 0.03111  |
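
The suggestion of "replacing {H} with {Na}" covers only adducts that add a single atom with a single positive charge. As a rough illustration, the Python sketch below picks such entries out of a few rows copied from the table. The regular expression and the helper name `simple_cation_adducts` are illustrative assumptions, not part of crest.

```python
import re

# A few rows copied from the table above: (adduct name, count).
ROWS = [
    ("[M+H]+", 98521),
    ("[M+2H]2+", 18025),
    ("[M+H-H2O]+", 13822),
    ("[M+Na]+", 8679),
    ("[M+NH4]+", 1161),
    ("[M+K]+", 215),
    ("[M.Cl]-", 56),
    ("[M+Li]+", 49),
]

# Matches exactly one added element symbol and a single positive charge,
# e.g. [M+Na]+ but not [M+2H]2+, [M+NH4]+ or [M+H-H2O]+.
SIMPLE_CATION = re.compile(r"^\[M\+([A-Z][a-z]?)\]\+$")


def simple_cation_adducts(rows):
    """Return (adduct, element, count) for single-atom, singly charged cation adducts."""
    selected = []
    for name, count in rows:
        match = SIMPLE_CATION.match(name)
        if match:
            selected.append((name, match.group(1), count))
    return selected


for name, element, count in simple_cation_adducts(ROWS):
    print(f"{name:10s} element={element:2s} count={count}")
```

Multi-atom ions such as NH4+, multiply charged species, and neutral losses are deliberately excluded by the pattern, which mirrors the "ignore complicated species in the beginning" idea above.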

awvwgk · 2020-02-18T21:48:04Z

Thanks, this looks interesting. Everything based on cationic species (H, alkali metal ions) should be easily doable with crest or is already possible (15 of the 30 species in the above list). This might require performing multiple protonation and/or deprotonation steps and maybe removing an electron by hand (22), but it is doable.

Most of the leftover species are results of fragmentations or dissociations, which are, to my knowledge, removed explicitly from the protonation/deprotonation procedure right now. So they might already get generated in the MTD step, but are sorted out due to fragmentation.

Generating one ensemble of ESI-MS protomers is somewhat difficult, as we would be dealing with a grand-canonical ensemble, and I fail to see an easy way to allow particle exchange in such a search procedure. I guess generating it in pieces and putting everything together in the end would be the way to go.

pprcht · 2020-02-19T10:50:46Z

Currently, adding other ions is possible if they consist of only a single atom, e.g., alkali or alkaline-earth metal ions. To do that, there is an additional command-line flag that has to be used together with the '-protonate' command. This flag is called '-swel' (short for 'switch element') and requires the corresponding ion, including its charge, as a second argument. That is, something like '-swel na+' or '-swel mg2+' will be read and parsed.
There is one example of this in the main publication (https://pubs.rsc.org/en/content/articlehtml/2020/cp/c9cp06869d).

Adding more complex ions consisting of several atoms, e.g., something like NH4+, is currently not possible in an automated way.
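
For scripting several single-atom ions in a row, the flag combination described above can be assembled programmatically. The Python sketch below only builds the argument list and does not run crest. The file name `input.xyz`, the helper name `crest_protonate_command`, and the ion pattern (inferred solely from the two examples 'na+' and 'mg2+') are assumptions for illustration; what crest actually accepts may differ.

```python
import re
import shlex

# Assumed shape of the ion argument, inferred only from 'na+' and 'mg2+':
# lowercase element symbol, optional charge digit, trailing '+'.
ION_PATTERN = re.compile(r"^[a-z]{1,2}[0-9]?\+$")


def crest_protonate_command(xyz_file, ion=None):
    """Build the argument list for a '-protonate' run, optionally with '-swel <ion>'."""
    cmd = ["crest", xyz_file, "-protonate"]
    if ion is not None:
        if not ION_PATTERN.match(ion):
            raise ValueError(f"expected a single-atom cation such as 'na+', got {ion!r}")
        cmd += ["-swel", ion]
    return cmd


# 'input.xyz' is a hypothetical file name.
for ion in (None, "na+", "mg2+"):
    print(shlex.join(crest_protonate_command("input.xyz", ion)))
```

The returned list can be passed to `subprocess.run` once crest is on the PATH. Multi-atom ions such as NH4+ are rejected up front, consistent with the limitation noted above.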

tobigithub · 2020-02-20T02:18:09Z

@pprcht @awvwgk
great, thank you, that works fine.

When I use it on the alanylglycine example from crest/xtb, I get the sodiated adduct as expected.

>cat crest_best.xyz
  20
        -33.86165696
 C         -2.1627314617       -0.1093305049       -0.3179184038
 C         -3.0386798794        0.1551215621        0.9094452568
 H         -2.5609217993       -0.2489766103        1.8005935960
 H         -3.1615103731        1.2257149903        1.0437787263
 H         -4.0170366784       -0.3067699909        0.7916623873
 N         -1.8297221481       -1.5125490956       -0.5338860442
 H         -1.4019637414       -1.8949461549        0.3054021711
 H         -2.6723252934       -2.0471788886       -0.7152017160
 H         -2.6862009673        0.2430998080       -1.2167434450
 C         -0.8715535551        0.7118857591       -0.2176704482
 O         -0.8068736164        1.7904607650        0.3323591016
 N          0.1704837182        0.1380148853       -0.8597669072
 H          0.0313935031       -0.8169399754       -1.1652573537
 C          1.5097913329        0.6300911148       -0.7040998402
 C          2.3317937909       -0.2766511735        0.1917997439
 O          1.9237529964       -1.2763293930        0.7161968811
 O          3.5851947016        0.1635610032        0.3244267958
 H          4.0943032987       -0.4202945400        0.9094608424
 H          1.4441575213        1.6202042580       -0.2389209338
 H          2.0163379624        0.7297480352       -1.6694825308



>crest crest_best.xyz -protonate -T 16  -ewin 10000 -iter 1000 -swel na+

===================================================
============= ordered structure list [Na]+ ========
===================================================
 written to file <protonated.xyz>

 structure    ΔE(kcal/mol)   Etot(Eh)
    1            0.00        -33.773504
    2            0.63        -33.772494
    3            1.72        -33.770764
    4           10.16        -33.757305


>crest crest_best.xyz -protonate -T 16  -ewin 10000 -iter 1000

===================================================
============= ordered structure list [H]+ =========
===================================================
 written to file <protonated.xyz>

 structure    ΔE(kcal/mol)   Etot(Eh)
    1            0.00        -33.960179
    2            0.83        -33.958853
    3            3.06        -33.955296
    4           22.96        -33.923597
    5           27.49        -33.916373
    6           55.14        -33.872305
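
As a cross-check on the two structure lists above, the ΔE column can be recomputed from the Etot column by subtracting the lowest energy and converting Hartree to kcal/mol. The Python sketch below does this for the values printed above. The conversion factor of 627.5095 kcal/mol per Hartree is the commonly used value and is assumed here; crest may use slightly different digits internally, and the printed Etot values are themselves rounded to six decimals.

```python
HARTREE_TO_KCAL = 627.5095  # kcal/mol per Hartree (assumed conversion factor)


def relative_energies(etot_hartree):
    """Return energies relative to the lowest one, in kcal/mol."""
    e_min = min(etot_hartree)
    return [(e - e_min) * HARTREE_TO_KCAL for e in etot_hartree]


# Etot(Eh) values copied from the [Na]+ and [H]+ lists above.
sodiated = [-33.773504, -33.772494, -33.770764, -33.757305]
protonated = [-33.960179, -33.958853, -33.955296, -33.923597, -33.916373, -33.872305]

for label, energies in (("[Na]+", sodiated), ("[H]+", protonated)):
    print(label)
    for index, delta in enumerate(relative_energies(energies), start=1):
        print(f"  {index}  {delta:8.3f} kcal/mol")
```

Small deviations in the last printed digit relative to the tables are expected because the input energies are rounded.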

Classic tools also generate additional protomers or adducts in this case, which could have even lower energy. I have to check which molecules are thrown out during the three cycles. Maybe that is where rule-based systems and AIMD (similar to this paper) have to be married, or at least considered for some cases, basically by feeding the rule-based output into crest.

Source: https://cactus.nci.nih.gov/tautomerizer/
Input: "CC@HC(\NCC(O)=O)=[O+]/[Na]"

In addition, spectroscopic methods or mass spectrometry can be used to confirm or investigate some of these compounds (similar to this paper), where tautomers of different dipeptides were investigated with tandem mass spectrometry. The referenced paper also shows possible ring formations.

Overall I think crest is a "killer tool" which really will improve productivity, because who has time to test hundreds of different possibilities, especially for larger molecules. Plus CREGEN gives us ensemble outputs, really love it!

awvwgk transferred this issue from grimme-lab/xtb Apr 23, 2020

pprcht closed this as completed Sep 21, 2020

yanfeiguan mentioned this issue Nov 11, 2022

QCG: Conda CREST fails #154

Closed

Msafy2021 mentioned this issue Feb 14, 2023

Error when Running crest for some metal complexes #171

Closed

priyam1720 mentioned this issue Mar 3, 2023

getting error while doing a conformational analysis of 97 atom molecule in crest with gfn2-xtb #180

Closed

matteo-maria-tommasini mentioned this issue Mar 13, 2023

file qcg_tmp/tmp_MTD/crest_conformers_0.xyz gets corrupt and fills all available disk space #178

Closed

Bitumelourd mentioned this issue Apr 27, 2023

Error when running CREST on TS search #195

Closed

AndreySchadel mentioned this issue Aug 21, 2024

Problems with msreact #332

Closed

moabe84 mentioned this issue Aug 22, 2024

Intel MKL ERROR: Parameter 6 was incorrect on entry to DLASWP #285

Open

This was referenced Sep 13, 2024

Initial geometry optimization failed #313

Open

Crest Error: Failed Initial geometry optimization #341

Open

adamhorvath99 mentioned this issue Oct 24, 2024

Unexpected Fortran runtime error associated with legacy algos #364

Open
