Tracking manual validation #216

Open
kjappelbaum opened this issue Nov 2, 2022 · 6 comments

Comments

@kjappelbaum
Owner

Ran the current master on the RSM MOFs.

@kjappelbaum
Owner Author

kjappelbaum commented Nov 2, 2022

has_undercoordinated_alkali_alkaline

Flagged 26 compounds: 'RSM1229', 'RSM2019', 'RSM2351', 'RSM2185', 'RSM3014', 'RSM1327', 'RSM1917', 'RSM2843', 'RSM2336', 'RSM2231', 'RSM3969', 'RSM1041', 'RSM1484', 'RSM1596', 'RSM2841', 'RSM1847', 'RSM0535', 'RSM1293', 'RSM1342', 'RSM1712', 'RSM3328', 'RSM0498', 'RSM1227', 'RSM2109', 'RSM2257', 'RSM1085'

I manually opened all of them. I would want a tool to flag all of these cases for me; all seem suspicious. Some are very off, e.g. RSM2843.

@kjappelbaum
Owner Author

kjappelbaum commented Nov 2, 2022

has_lone_molecule

Flags many (145) compounds: 'RSM3042', 'RSM1706', 'RSM0117', 'RSM3853', 'RSM0910', 'RSM2350', 'RSM3812', 'RSM0879', 'RSM1100', 'RSM1403', 'RSM1116', 'RSM0357', 'RSM2014', 'RSM3605', 'RSM3367', 'RSM3275', 'RSM1272', 'RSM2777', 'RSM4027', 'RSM3127', 'RSM1264', 'RSM0025', 'RSM2624', 'RSM3035', 'RSM1459', 'RSM0423', 'RSM4163', 'RSM1623', 'RSM1273', 'RSM1434', 'RSM1967', 'RSM2473', 'RSM1926', 'RSM2445', 'RSM1117', 'RSM4587', 'RSM0878', 'RSM2994', 'RSM2644', 'RSM0291', 'RSM1300', 'RSM2469', 'RSM0287', 'RSM2570', 'RSM2618', 'RSM2136', 'RSM3232', 'RSM4133', 'RSM2622', 'RSM2710', 'RSM1887', 'RSM0916', 'RSM0691', 'RSM4216', 'RSM0652', 'RSM3438', 'RSM0078', 'RSM1901', 'RSM2443', 'RSM1280', 'RSM2951', 'RSM0038', 'RSM3614', 'RSM3028', 'RSM2639', 'RSM3842', 'RSM2438', 'RSM3415', 'RSM1326', 'RSM0576', 'RSM3161', 'RSM0937', 'RSM2459', 'RSM4237', 'RSM0133', 'RSM2323', 'RSM2636', 'RSM0871', 'RSM0172', 'RSM4424', 'RSM3230', 'RSM0277', 'RSM2749', 'RSM1535', 'RSM2460', 'RSM1427', 'RSM3528', 'RSM4255', 'RSM0312', 'RSM1201', 'RSM0152', 'RSM1305', 'RSM3768', 'RSM2657', 'RSM0390', 'RSM0685', 'RSM3047', 'RSM1996', 'RSM4444', 'RSM1492', 'RSM1191', 'RSM0104', 'RSM3753', 'RSM1781', 'RSM2007', 'RSM3568', 'RSM4254', 'RSM4346', 'RSM3362', 'RSM2932', 'RSM0149', 'RSM1867', 'RSM2532', 'RSM3909', 'RSM0558', 'RSM3860', 'RSM4167', 'RSM2699', 'RSM0990', 'RSM2515', 'RSM3451', 'RSM0338', 'RSM4551', 'RSM0142', 'RSM0545', 'RSM3871', 'RSM2626', 'RSM1724', 'RSM4199', 'RSM4530', 'RSM0123', 'RSM0363', 'RSM1460', 'RSM1071', 'RSM1420', 'RSM2324', 'RSM2631', 'RSM1363', 'RSM0456', 'RSM0250', 'RSM3305', 'RSM0211', 'RSM0092', 'RSM0753', 'RSM0929'

Manual inspection reveals many Ag-N compounds that look "strange" in the ball-and-stick view in VESTA. This is probably often due to charge issues, e.g. RSM0929 or RSM0092; for the latter, it would also have been easy to spot from the original file in the CSD, as it contains perchlorate. DUWRIY is also missing a charge. RSM0456 is wrong as well: the CSD entry has a charged carboxy group in the linker.
Others certainly have floating molecules, e.g. RSM0149. RSM1706 also looks odd, and RSM3853 looks weird after optimization (probably due to charge issues).

Overall, there might be some false positives, but none of the structures I opened looked trivially correct to me at first glance.
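
For reference, lists like the ones above can be collected from per-structure check results with a few lines of Python. This is only a minimal sketch: it assumes the results are already available as a mapping from structure ID to a dict of boolean check outcomes (how that mapping is produced is not shown here), and `group_flagged` is a hypothetical helper name. The structure IDs and check names in the toy input are taken from this thread; the boolean values are illustrative.

```python
from collections import defaultdict


def group_flagged(results: dict[str, dict[str, bool]]) -> dict[str, list[str]]:
    """Group structure IDs by the checks that flagged them.

    `results` maps structure ID -> {check name: True if the check flagged it}.
    """
    flagged = defaultdict(list)
    for structure_id, checks in results.items():
        for check_name, is_flagged in checks.items():
            if is_flagged:
                flagged[check_name].append(structure_id)
    # Sort the IDs so the per-check lists are stable and easy to diff.
    return {name: sorted(ids) for name, ids in flagged.items()}


# Toy input (illustrative values only, not the real check output).
example = {
    "RSM2843": {"has_undercoordinated_alkali_alkaline": True, "has_lone_molecule": False},
    "RSM0149": {"has_undercoordinated_alkali_alkaline": False, "has_lone_molecule": True},
    "RSM1706": {"has_undercoordinated_alkali_alkaline": False, "has_lone_molecule": True},
}

for check, ids in group_flagged(example).items():
    print(f"{check}: flagged {len(ids)} compounds {ids}")
```

Keeping one sorted list per check makes it easy to go through the structures check by check during manual validation.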

@ElMouba

ElMouba commented Nov 3, 2022

The RSM folder contains maybe 3400 structures, so flagging around 10% of them is still good, right?
That would mean about 90% of the structures are still in "good shape". I need to check which of those structures we have used so far, because we haven't generated the isotherms for all RSM structures yet.

@kjappelbaum
Owner Author

kjappelbaum commented Nov 3, 2022

Well, I have only manually gone through two types of checks so far. Looking at all flags, there are 447 in the 3152 structures I ran this on. And "good" and "bad" are in the eye of the beholder :)
I'm using this issue to go over the remaining checks and to see whether we can expect many false positives there.
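
To put the numbers in this thread side by side, here is a small sketch of the arithmetic. The counts are the ones quoted above (26, 145, and 447 out of 3152); it assumes that the 447 counts flagged structures rather than individual flags, and `flag_rate` is just a helper name for this example.

```python
# Counts quoted in this thread.
n_structures = 3152  # structures the checks were run on
flag_counts = {
    "has_undercoordinated_alkali_alkaline": 26,
    "has_lone_molecule": 145,
    "all checks combined": 447,  # assumed to count structures, not flags
}


def flag_rate(n_flagged: int, n_total: int) -> float:
    """Return the flagged fraction as a percentage."""
    return 100.0 * n_flagged / n_total


rates = {name: flag_rate(n, n_structures) for name, n in flag_counts.items()}
for name, rate in rates.items():
    print(f"{name}: {rate:.1f}%")
```

Under that assumption, the two checks inspected so far account for about 0.8% and 4.6% of the structures, and all checks together for about 14.2%.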

@ElMouba

ElMouba commented Nov 3, 2022

Ohhh, I see. Let me help with the rest then and see what we get.

@kjappelbaum
Owner Author

kjappelbaum commented Nov 14, 2022

Manually identified charged MOFs

  • RSM3317, like basically all Ag-N compounds (which typically have suspiciously long Ag-N distances after the DFT optimization). We should now be able to find those reliably with oximachine + moffragmentor, as we do not even have to count linker charges. We could also have identified this from the chemical name, which contains "tetrakis(hexafluorophosphate)".
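
The name-based idea could look roughly like the sketch below. It is not part of any existing tool: `counterions_in_name` and the keyword list are assumptions made for illustration. Only "perchlorate" and "hexafluorophosphate" come from this thread; the other keywords are common anions added as examples. A hit is only a hint that the framework may be charged, since some of these anions can also be coordinated to the metal.

```python
import re

# Anion keywords to look for in a chemical name.
# "perchlorate" and "hexafluorophosphate" appear in this thread;
# the remaining entries are illustrative assumptions.
COUNTERION_KEYWORDS = (
    "perchlorate",
    "hexafluorophosphate",
    "tetrafluoroborate",
    "nitrate",
    "triflate",
)
_PATTERN = re.compile("|".join(COUNTERION_KEYWORDS), flags=re.IGNORECASE)


def counterions_in_name(chemical_name: str) -> list[str]:
    """Return the sorted, lower-cased counterion keywords found in a chemical name."""
    return sorted({match.group(0).lower() for match in _PATTERN.finditer(chemical_name)})


# Hypothetical name fragment, not the actual CSD name of any entry.
print(counterions_in_name("... tetrakis(hexafluorophosphate)"))
```

Structures whose names produce a non-empty list could then be queued for a closer look at the charge balance.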


2 participants