-
Notifications
You must be signed in to change notification settings - Fork 0
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix setup and add new schema #1
base: develop
Are you sure you want to change the base?
Conversation
Thanks a lot @JosePizarro3 - really appreciate your continued help with this! The restructuring looks fine to me. The new schema (atoms state thing) will be much easier for us to work with I suspect and in general makes a lot of sense. In terms of contributing to nomad-simulations, we are absolutely keen. The magres format spec document was already some effort to defining a standard schema for computed magres data for the community. I would be very happy to see it adapted for nomad generally. I count >13 periodic electronic structure codes that can calculate magnetic shielding tensors (one of the key properties) so having a general schema that other parsers can work with makes sense to me. I've manually added @Sathya-S3 now and I think he's in a better position to review the specific code changes having tested out the initial version. |
Great, thanks @jkshenton !
Indeed, the magres format is a very easy case where, whatever you decided for the properties should be incorporated directly, while for the symmetry open questions, we should work a bit more on it. |
Added nomad-lab[infrastructure] extra dependencies Added workflow for NMR CASTEP or QE + magres entry
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for your work on presenting the magres parser in such a polished state, Jose. My main comments are related to understanding the workflow links between CASTEP and magres files, which I have highlighted with the relevant code. I'm happy to approve the pull request after we have had a chance to chat with you about the workflow linking and how we can help with further magres parser development (Quantum Espresso workflows etc.).
I have integrated the magres parser as guided by the README instructions with a NOMAD Oasis instance and have verified that the file parsing works flawlessly. The test was performed with an ethanol CASTEP-magres workflow files archive.
"Documentation" = "https://nomad-coe.github.io/nomad-parser-magres/" | ||
"Homepage" = "https://github.com/CCP-NC/nomad-parser-magres" | ||
"Bug Tracker" = "https://github.com/CCP-NC/nomad-parser-magres/issues" | ||
"Documentation" = "https://CCP-NC.github.io/nomad-parser-magres/" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The documentation page "https://CCP-NC.github.io/nomad-parser-magres/" does not exist at the time of review. I assume this will be auto-rectified when the changes are merged into "master branch". Will run a quick verification after merge.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we solved this last time we talk, the docu page has to be deployed in the settings of the repo. Check this: https://www.mkdocs.org/user-guide/deploying-your-docs/#github-pages
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jkshenton for your attention
self, nmr_first_principles_archive: 'EntryArchive' | ||
): | ||
""" | ||
Automatically parses the NMR Magres workflow. Here, `self.archive` is the | ||
NMR magres archive in which we will link the original NMR first principles (CASTEP | ||
or QuantumESPRESSO) entry. | ||
|
||
Args: | ||
nmr_first_principles_archive (EntryArchive): the NMR (first principles) CASTEP or QuantumESPRESSO archive. | ||
""" | ||
workflow = NMRMagRes(method=NMRMagResMethod(), results=NMRMagResResults()) | ||
workflow.name = 'NMR Magres' | ||
|
||
# ! Fix this once CASTEP and QuantumESPRESSO use the new `nomad-simulations` schema under 'data' | ||
# Method | ||
# method_nmr = extract_section(nmr_first_principles_archive, ['run', 'method']) | ||
# workflow.method.nmr_method_ref = method_nmr | ||
|
||
# Inputs and Outputs | ||
# ! Fix this to extract `input_structure` from `nmr_first_principles_archive` once | ||
# ! CASTEP and QuantumESPRESSO use the new `nomad-simulations` schema under 'data' | ||
input_structure = extract_section(self.archive, ['data', 'model_system']) | ||
nmr_magres_calculation = extract_section(self.archive, ['data', 'outputs']) | ||
if input_structure: | ||
workflow.m_add_sub_section( | ||
NMRMagRes.inputs, Link(name='Input structure', section=input_structure) | ||
) | ||
if nmr_magres_calculation: | ||
workflow.m_add_sub_section( | ||
NMRMagRes.outputs, | ||
Link(name='Output NMR calculation', section=nmr_magres_calculation), | ||
) | ||
|
||
# NMR (first principles) task | ||
# ! Fix this once CASTEP and QuantumESPRESSO use the new `nomad-simulations` schema under 'data' | ||
program_name = nmr_first_principles_archive.run[-1].program.name | ||
if nmr_first_principles_archive.workflow2: | ||
task = TaskReference(task=nmr_first_principles_archive.workflow2) | ||
task.name = f'NMR FirstPrinciples {program_name}' | ||
if input_structure: | ||
task.inputs = [Link(name='Input structure', section=input_structure)] | ||
if nmr_magres_calculation: | ||
task.outputs = [ | ||
Link( | ||
name='Output NMR calculation', | ||
section=nmr_magres_calculation, | ||
) | ||
] | ||
workflow.m_add_sub_section(NMRMagRes.tasks, task) | ||
|
||
self.archive.workflow2 = workflow |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
At the moment, the workflow separates the .castep (NMR first principles calculation) file and .magres file resulting from the calculation as two individual entries, with both having display names " CASTEP NMR SinglePoint Simulation". Ideally, since the magres file is a direct output of the CASTEP calculation, it might be beneficial to include it directly to the outputs of the CASTEP first principles calculation in the workflow.
Will the "fix" mentioned in the code comments correspond to fixing how the workflow is processed currently? Or is it related only to the adoption of the new 'nomad-simulations' schema?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, after we talked: we agreed that the separation is somewhat an artifact, but we can keep it just in case someone uploads simply a magres file. In any case, I have to fix this workflow part; it is not working right now but I know how to handle it.
if outputs is not None: | ||
simulation.outputs.append(outputs) | ||
|
||
# Try to resolve the `entry_id` and `mainfile` of other entries in the upload to connect the magres entry with the CASTEP or QuantumESPRESSO entry |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the metadata section of file upload, there is a files section that lists all files included in the upload. Is this the only intended result? There is mention of linking the NMR workflow directly in the magres entry, could you please clarify how it is done? As per my previous comments, we see two separate file entries and workflows for the .castep and .magres files.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These lines are more to connect the castep with the magres entries
# | ||
from nomad.metainfo import SubSection, Quantity, Reference | ||
from nomad.datamodel.metainfo.simulation.method import Method | ||
from nomad.metainfo import Quantity, Reference, SubSection |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This NMRMagRes class is defined to generate an extra EntryArchive if both NMR first principles and NMR magres SinglePoint EntryArchives are present. Could you please clarify how this works in linking the workflows as mentioned in the parser.py script?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The idea's that the magres file creates an entry which connects with the CASTEP or QE one. In this entry, you will find all the details of the original DFT calculation + the output nmr properties.
The creation of such a workflow in the magres entry is just an organizational tool. This is actually what workflows should be about, just organizing in one way without cluttering entries which could be reuse for something else (in this case, the CASTEP or QE ones).
In any case, this can be changed later. We could decide to integrate the workflow better, hence just adding a few lines in the CASTEP and QE parsers to handle the possibility of a magres output file.
@Sathya-S3 I will do a final clean up next week and push again. I think I can integrate your comments. |
@jkshenton Sathya (sorry, I cannot tag him as he is not part of this repo),
I've been working on some changes in this repo. Maybe you can drop me a review so I can merge and add this new plugin into NOMAD? You can also add it then as part of the NOMAD OASIS and use it in your CCP-NC database.
I still need to:
tests/test_parser.py
Changes 1 - new structure
I've adopted the new structure for plugins from NOMAD. This means a lot of changes, but the most important are the ones in
pyproject.toml
and the new folder structure undersrc
. Some extra stuff:nomad
.Changes 2 - New schema
nomad-simulations
I've changed the parsing to the new schema,
nomad-simulations
. Some points here:Outputs
from thenomad-simulations
schema to extend it with NMR specific properties, and how I usedPhysicalProperty
from this package to define such NMR properties.nomad
ornomad-simulations
repo, as some other users might be using these anyways.nomad-simulations
repo. I will be happy if you want to contribute there and update these definitions in there. Other users might need them and it connects nicely with the taxonomy effort I comment on above.