SS-179: Add Wasabi To Global-Chem #310

Open
Sulstice opened this issue Jun 10, 2024 · 6 comments

Comments

@Sulstice
Collaborator

Sulstice commented Jun 10, 2024

Add Wasabi to Global-Chem with the following Papers:

Read these papers:
1.) Kang, J.-H. et al. Wasabia japonica is a potential functional food to prevent colitis via inhibiting the NF-κB signaling pathway. Food Funct. 8, 2865–2874. https://doi.org/10.1039/C7FO00576H (2017).
2.) Miles, C. & Chadwick, C. Growing wasabi in the Pacific Northwest. Farming the Northwest; Washington State University; USA; PNW0605; WSU: 2008; pp. 1–12.
3.) Kojima, M., Uchida, M. & Akahori, Y. Studies on the volatile components of Wasabia japonica, Brassica juncea and Cocholearia armoracia by gas chromatography-mass spectrometry. I. Determination of low mass volatile components. Yakugaku Zasshi 93, 453–459. https://doi.org/10.1248/yakushi1947.93.4_453 (1973).
4.) Etoh, H. et al. ω-methylsulfinylalkyl isothiocyanates in wasabi, Wasabia japonica Matsum. Agric. Biol. Chem. 54, 1587–1589. https://doi.org/10.1080/00021369.1990.10870168 (1990).
5.) Kumagai, H. et al. Analysis of volatile components in essential oil of upland Wasabi and their inhibitory effects on platelet aggregation. Biosci. Biotechnol. Biochem. 58, 2131–2135. https://doi.org/10.1271/bbb.58.2131 (1994).
6.) Hosoya, T., Yun, Y. S. & Kunugi, A. Five novel flavonoids from Wasabia japonica. Tetrahedron 61, 7037–7044. https://doi.org/10.1016/j.tet.2005.04.061 (2005).
7.) Kurata, T. et al. Isolation and identification of components from Wasabi (Wasabia japonica Matsumura) flowers and investigation of their antioxidant and anti-inflammatory activities. Food Sci. Technol. Res. 25, 449–457. https://doi.org/10.3136/fstr.25.449 (2019).
8.) Hosoya, T., Yun, Y. S. & Kunugi, A. Antioxidant phenylpropanoid glycosides from the leaves of Wasabia japonica. Phytochem. 69, 827–832. https://doi.org/10.1016/j.phytochem.2007.08.021 (2008).
9.) Yoshida, S., Hosoya, T., Inui, S., Masuda, H. & Kumazawa, S. Component analysis of Wasabi leaves and an evaluation of their anti-inflammatory activity. Food Sci. Technol. Res. 21, 247–253. https://doi.org/10.3136/fstr.21.247 (2015).
10.) Szewczyk, K. et al. Flavonoid and phenolic acids content and in vitro study of the potential anti-aging properties of Eutrema japonicum (Miq.) Koidz cultivated in Wasabi Farm Poland. Int. J. Mol. Sci. 22, 6219.
11.) Lohning, A. et al. 6-(methylsulfinyl) hexyl isothiocyanate (6-MITC) from Wasabia japonica alleviates inflammatory bowel disease (IBD) by potential inhibition of glycogen synthase kinase 3 beta (GSK-3β). Eur. J. Med. Chem. 216, 113250. https://doi.org/10.1016/j.ejmech.2021.113250 (2021).
12.) Shimamura, Y., Iio, M., Urahira, T. & Masuda, S. Inhibitory effects of Japanese horseradish (Wasabia japonica) on the formation and genotoxicity of a potent carcinogen, acrylamide. J. Sci. Food Agric. 97, 2419–2425.
13.) Morimitsu, Y. et al. Antiplatelet and anticancer isothiocyanates in Japanese domestic horseradish, wasabi. BioFactors 13, 271–276.
14.) Fuke, Y., Haga, Y., Ono, H., Nomura, T. & Ryoyama, K. Anti-carcinogenic activity of 6-methylsulfinylhexyl isothiocyanate-, an active anti-proliferative principal of wasabi (Eutrema wasabi Maxim.). Cytotechnology 25, 197–203.
15.) Dos Santos Szewczyk, K. et al. Chemical composition of extracts from leaves, stems and roots of wasabi (Eutrema japonicum) and their anti-cancer, anti-inflammatory and anti-microbial activities. Sci. Rep. 13, 9142 (2023).

@ANUGAMAGE
Collaborator

ANUGAMAGE commented Jun 24, 2024

Hi Sul, I am going through these articles, which are really interesting. I am also building a chemical list with the help of Nishan. I have attached the Google Docs link to the wasabi chemical list here.

Link : https://docs.google.com/document/d/12jj4sPYLMemvLK8qHrORu2FpoodLBsbdTRmuCUXCw9Q/edit?usp=sharing

After building this list I will add a node to Global-Chem with the help of Buden.

@ANUGAMAGE
Collaborator

ANUGAMAGE commented Jul 13, 2024

@Sulstice Boss, I have assigned my juniors to this project. I will let them handle this work.

@kalana20751 @YAPAAS @sakeermr @Primali99
1.) Please go through the research articles above related to Wasabi, and find more if you like. Then create a chemical list with an isomeric SMILES for each compound, as in the examples below (expand the list). Use PubChem as the data source.

'2-propenyl glucosinolate' : 'C=CC/C(=N/OS(=O)(=O)O)/S[C@H]1[C@@H]([C@H]([C@@H]([C@H](O1)CO)O)O)O'
'1-methylethyl glucosinolate' : 'CC(C)C(=NOS(=O)(=O)O)S[C@H]1[C@@H]([C@H]([C@@H]([C@H](O1)CO)O)O)O'

2.) Go through the link below and complete the chart of wasabi chemicals and their functions as far as you can.
Link : https://docs.google.com/document/d/12jj4sPYLMemvLK8qHrORu2FpoodLBsbdTRmuCUXCw9Q/edit?usp=sharing

Please give me an update on this project next Sunday.

@kalana20751
Collaborator

kalana20751 commented Jul 23, 2024

'6-O-caffeoylsucrose' :  r'C1=CC(=C(C=C1/C=C/C(=O)OCC2C(C(C(C(O2)OC3(C(C(C(O3)CO)O)O)CO)O)O)O)O)O '
'6-O-feruloylsucrose' : r' COC1=C(C=CC(=C1)/C=C/C(=O)OC[C@@H]2[C@H]([C@@H]([C@H]([C@H](O2)O[C@@]3([C@H]([C@@H]([C@H](O3)CO)O)O)CO)O)O)O)O '
'Ferulic Acid '              :     r' COC1=C(C=CC(=C1)/C=C/C(=O)O)O '
' 6-methylsulfinyl-hexyl glucosinolate' : ' CS(=O)CCCCCCC(=NOS(=O)(=O)O)S[C@H]1[C@@H]([C@H]([C@@H]([C@H](O1)CO)O)O)O '
' isovitexin 4’-O-glucoside ' : ' C1=CC(=CC=C1C2=CC(=O)C3=C(O2)C=C(C(=C3O)[C@H]4[C@@H]([C@H]([C@@H]([C@H](O4)CO)O)O)O)O)O[C@H]5[C@@H]([C@H]([C@@H]([C@H](O5)CO)O)O)O '
' luteolin 3',7' -diglucoside' : ' C1=CC(=C(C=C1C2=CC(=O)C3=C(C=C(C=C3O2)O[C@H]4[C@@H]([C@H]([C@@H]([C@H](O4)CO)O)O)O)O)O[C@H]5[C@@H]([C@H]([C@@H]([C@H](O5)CO)O)O)O)O '
' kaempferol 3-O-rutinoside ' : ' C[C@H]1[C@@H]([C@H]([C@H]([C@@H](O1)OC[C@@H]2[C@H]([C@@H]([C@H]([C@@H](O2)OC3=C(OC4=CC(=CC(=C4C3=O)O)O)C5=CC=C(C=C5)O)O)O)O)O)O)O'
'allyl isothiocyanate'         : ' C=CCN=C=S'
'6-methylsulfinylhexyl isothiocyanate' : ' CS(=O)CCCCCCN=C=S '
'caffeic acid'                      : r' C1=CC(=C(C=C1/C=C/C(=O)O)O)O '
'1,2'-di-O-trans-sinapoyl gentiobiose' : ' 
'Apigenin 8-C-glucoside' : 
' C[Si](C (C)OCC1C(C(C(C(O1)C2=C(C=C(C3=C2OC(=CC3=O)C4=CC=C(C=C4)O[Si](C)(C)C)O[Si](C)(C)C)O[Si](C)(C)C)O[Si](C)(C)C)O[Si](C)(C)C)O[Si](C)(C)C '
'p-Coumaric acid'         : r' C1=CC(=CC=C1/C=C/C(=O)O)O '
' n-Butyl isothiocyanate' : ' CCCCN=C=S'
' 3-Butenyl isothiocyanate' : ' C=CCCN=C=S'
'4-Pentenyl isothiocyanate' : ' C=CCCCN=C=S'
' 5-Hexenyl isothiocyanate' : ' C=CCCCCN=C=S'
'beta-Phenylethylisothiocyanate' : ' C1=CC=C(C=C1)CCN=C=S'
'5-Hexenyl isothiocyanate' : ' C=CCCCCN=C=S'
'7-Methylthioheptyl isothiocyanate' : ' CSCCCCCCCN=C=S' 
'5-Methylsulfinylpentyl isothiocyanate' : ' CS(=O)CCCCCN=C=S'
'7-Methylsulfinylheptyl isothiocyanate' : ' CS(=O)CCCCCCCN=C=S'
'Palmitic acid'                       : ' CCCCCCCCCCCCCCCC(=O)O '
'Linolenic acid'                     : r' CC/C=C\C/C=C\C/C=C\CCCCCCCC(=O)O'
'Oleic acid'                           : r' CCCCCCCC/C=C\CCCCCCCC(=O)O'
'Sinapinic acid'                   : r' COC1=CC(=CC(=C1O)OC)/C=C/C(=O)O '
'3,4-dimethoxy-trans-cinnamic acid' : r' COC1=C(C=C(C=C1)/C=C/C(=O)O)OC'
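
A list like the one above can be sanity-checked before it goes to review. The following is a minimal sketch, not part of Global-Chem: the function names (`tidy_entries`, `brackets_balanced`, `review`) are hypothetical, and the bracket check is only a cheap syntax test, not a chemical validation. It strips stray spaces, then flags repeated names, empty SMILES, and unbalanced parentheses or brackets.

```python
from collections import Counter


def tidy_entries(pairs):
    """Strip stray whitespace from (name, smiles) pairs."""
    return [(name.strip(), smiles.strip()) for name, smiles in pairs]


def brackets_balanced(smiles):
    """Cheap syntax check: every '(' and '[' must be closed in order."""
    closers = {')': '(', ']': '['}
    stack = []
    for ch in smiles:
        if ch in '([':
            stack.append(ch)
        elif ch in closers:
            if not stack or stack.pop() != closers[ch]:
                return False
    return not stack


def review(pairs):
    """Return cleaned entries plus problem lists for a human to check."""
    cleaned = tidy_entries(pairs)
    counts = Counter(name.lower() for name, _ in cleaned)
    return {
        'entries': cleaned,
        'duplicates': sorted(n for n, c in counts.items() if c > 1),
        'missing': [n for n, s in cleaned if not s],
        'unbalanced': [n for n, s in cleaned
                       if s and not brackets_balanced(s)],
    }


# A few entries copied from the list, kept with their original spacing.
sample = [
    ('Ferulic Acid ', r' COC1=C(C=CC(=C1)/C=C/C(=O)O)O '),
    (' 5-Hexenyl isothiocyanate', ' C=CCCCCN=C=S'),
    ('5-Hexenyl isothiocyanate', ' C=CCCCCN=C=S'),
    ("1,2'-di-O-trans-sinapoyl gentiobiose", ' '),
]
report = review(sample)
```

On this sample the report lists 5-hexenyl isothiocyanate as a duplicate and the gentiobiose entry as missing its SMILES. A string that passes these checks can still be chemically wrong, so a cheminformatics toolkit or a drawing tool remains the real test.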

@ANUGAMAGE
Collaborator

ANUGAMAGE commented Jul 30, 2024

@kalana20751 @YAPAAS @sakeermr @Primali99 This is really good. But you guys need to learn how to build SMILES for molecules whose SMILES are unavailable. For that you can use this module. Install it in your Colab space or use a Jupyter notebook and give it a try.

Link : https://github.com/Kohulan/Smiles-TO-iUpac-Translator

And use chemical drawing software to confirm your SMILES.

@Sulstice
Collaborator Author

Problems with your chemical list:

Strings that contain a / character need the r prefix. Why? Clean up your strings so they don't look bad; as written, they are harder for anyone to review.

'6-O-feruloylsucrose' : ' COC1=C(C=CC(=C1)/C=C/C(=O)OC[C@@H]2[C@H]([C@@H]([C@H]([C@H](O2)O[C@@]3([C@H]([C@@H]([C@H](O3)CO)O)O)CO)O)O)O)O '

'6-O-feruloylsucrose' : r'COC1=C(C=CC(=C1)/C=C/C(=O)OC[C@@H]2[C@H]([C@@H]([C@H]([C@H](O2)O[C@@]3([C@H]([C@@H]([C@H](O3)CO)O)O)CO)O)O)O)O '
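
As background for the prefix question, and assuming the list is destined for a Python dictionary (the quoting style suggests so), here is a minimal sketch of how Python treats the two slash characters that SMILES uses for double-bond geometry. The forward slash is an ordinary character, while the backslash starts an escape sequence in a normal string literal. The `clean` helper and the `'C\n'` fragment are illustrative only.

```python
# Forward slashes are ordinary characters: the r prefix changes nothing.
with_prefix = r'COC1=C(C=CC(=C1)/C=C/C(=O)O)O'
without_prefix = 'COC1=C(C=CC(=C1)/C=C/C(=O)O)O'
same_forward = with_prefix == without_prefix  # True

# Backslashes are escape characters in a normal literal. Without the
# r prefix, a backslash must be doubled to survive intact.
oleic_raw = r'CCCCCCCC/C=C\CCCCCCCC(=O)O'
oleic_escaped = 'CCCCCCCC/C=C\\CCCCCCCC(=O)O'
same_backward = oleic_raw == oleic_escaped  # True

# A sequence such as \n silently becomes a newline without the prefix.
# 'C\n' is a made-up fragment used only to show the effect.
plain_length = len('C\n')   # 2: 'C' plus a newline character
raw_length = len(r'C\n')    # 3: 'C', a backslash, and 'n'


def clean(smiles):
    """Drop the leading and trailing spaces seen in the pasted strings."""
    return smiles.strip()


ferulic = clean(r' COC1=C(C=CC(=C1)/C=C/C(=O)O)O ')
```

Using the r prefix on every SMILES literal is a simple way to stay safe, since it costs nothing when only forward slashes are present.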

1.) Are any of these already in Global-Chem? How many isomeric SMILES are recorded? Was there a decision made?

You should be reading the paper first:

image

Explain this figure to me:

1.) What even is SELFIES and why was it used?
2.) What is an encoder and decoder? Why are they used in SMILES?
3.) The models are trained on Google’s Tensor Processing Units, what are TPUs and how are they different from CPU/GPUs?

Answer all these before running code. It's important you understand theory rather than just running code. Anyone can do that easily in 1 minute.

@kalana20751
Collaborator

kalana20751 commented Jul 31, 2024

1. SELFIES is a molecular notation system for representing chemical structures, introduced in 2020. SELFIES has advantages over SMILES: it is a robust notation, which makes it well suited to ML. A SMILES string can describe an invalid molecule, and encoding and decoding SMILES requires complex parsing rules, whereas a SELFIES string always gives a valid molecule and is easy to use in generative models. Here is a reference: https://www.sciencedirect.com/science/article/pii/S2666389922002069
2. An encoder converts information from one format to another, for example a SMILES string to a SELFIES string. A decoder reverses that process by converting the information back to its original format, for example a SELFIES string back to a SMILES string. SMILES should be encoded to SELFIES before it is used in an AI model, to reduce errors, and the output then has to be decoded to IUPAC names. That is why we use an encoder and a decoder.
3. TPU stands for tensor processing unit. TPUs are processors built for machine learning workloads. CPUs process instructions sequentially (one after another), and GPUs, which were designed for graphics, process in parallel, while TPUs perform tensor operations with high efficiency.
Google introduced TPUs and used them for language translation and Google image search; they are now used in DeepMind projects such as AlphaFold.
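
The encode and decode round trip described in point 2 can be illustrated without any machine learning library. The sketch below is a toy and makes several assumptions: it is neither the SELFIES algorithm nor the translator's neural network, and the regular expression and function names are hypothetical. It only shows the general idea of turning a SMILES string into integer ids that a model could consume, then mapping the ids back to the original string.

```python
import re

# Bracket atoms first, then two-letter halogens, then any single character.
TOKEN = re.compile(r'\[[^\]]+\]|Br|Cl|.')


def tokenize(smiles):
    """Split a SMILES string into tokens."""
    return TOKEN.findall(smiles)


def build_vocab(smiles_list):
    """Assign an integer id to every distinct token, in sorted order."""
    tokens = sorted({t for s in smiles_list for t in tokenize(s)})
    return {tok: i for i, tok in enumerate(tokens)}


def encode(smiles, vocab):
    """SMILES string -> list of integer ids."""
    return [vocab[t] for t in tokenize(smiles)]


def decode(ids, vocab):
    """List of integer ids -> SMILES string."""
    inverse = {i: tok for tok, i in vocab.items()}
    return ''.join(inverse[i] for i in ids)


# Two isothiocyanates from the list earlier in this thread.
allyl_itc = 'C=CCN=C=S'
msh_itc = 'CS(=O)CCCCCCN=C=S'

vocab = build_vocab([allyl_itc, msh_itc])
ids = encode(allyl_itc, vocab)
round_trip = decode(ids, vocab)  # equals allyl_itc
```

The round trip is lossless here because the ids came from a real molecule. An arbitrary sequence of ids, such as one produced by a generative model, can decode to a string that is not valid SMILES, which is the weakness the answer above attributes to SMILES and the motivation it gives for SELFIES.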
