CatalyticLAM

Machine-Learning-Based Interatomic Potentials for Catalysis: a Universal Catalytic Large Atomic Model

  0. Pre-trained Model
  1. Overview
  2. Installation
  3. Quick Usage
  4. License
  5. Acknowledgements
  6. Citation

0. Pre-trained Model

The pre-trained models and their configuration files are available on Google Drive:

| Model | Training strategy | Download | Val. force MAE (meV/Å) on metal systems | Val. energy MAE (meV/atom) on metal systems |
|---|---|---|---|---|
| GemNet-OC | Fine-tuned from GemNet-OC-S2EFS-OC20+OC22, 5 epochs | best_checkpoint_GemnetOC.pt, config | 34.5 | 4.05 |
| EquiformerV2 | Fine-tuned from eq2_121M_e4_f100_oc22_s2ef.pt, 2 epochs | checkpoint_eqV2.pt, config | 26.0 | 32.5 |
| DPA2 | Fine-tuned from DPA2_medium_28_10M_beta4.pt, 2000000 steps | model.ckpt-2000000.pt, config | 154 | 484 |

1. Overview

1.1 Generation

This section is responsible for generating structures, including bulk and slab structures for VASP calculations, and generating initial adsorption structures. We preset a series of commonly used small adsorbates and place them on slab models with 1, 2, or 4 molecules.

1.2 VASP Workflow

This section manages VASP tasks and workflows and collects data for dpdata. It can run high-throughput optimization and molecular dynamics (MD) jobs on pre-generated structures, check the convergence of the SCF steps in optimization and MD runs, and convert the results in bulk to LMDB or NPY format for training.
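One common way to check SCF convergence can be sketched as follows. It assumes the usual OSZICAR layout, in which each ionic step is a run of electronic-step lines (prefixed `DAV:`, `RMM:`, or `CG :`) closed by a line containing `F=`, and it treats an ionic step that used all NELM electronic steps as unconverged. The function name and the sample text are illustrative, and flow.py may implement its check differently.

```python
import re

# Electronic-step lines in OSZICAR start with the algorithm tag, e.g. "DAV:" or "RMM:".
_ELECTRONIC = re.compile(r"^\s*(DAV|RMM|CG)\s*:")


def unconverged_ionic_steps(oszicar_text, nelm=60):
    """Return 1-based indices of ionic steps whose SCF loop reached the NELM limit.

    nelm defaults to 60, the VASP default for NELM; pass the value from your INCAR.
    """
    bad = []
    n_electronic = 0  # electronic steps seen in the current ionic step
    ionic = 0         # ionic steps completed so far
    for line in oszicar_text.splitlines():
        if _ELECTRONIC.match(line):
            n_electronic += 1
        elif "F=" in line:
            ionic += 1
            if n_electronic >= nelm:
                bad.append(ionic)
            n_electronic = 0
    return bad


# Made-up two-step example, checked against an artificially low limit of 2.
sample = """\
DAV:   1     0.10E+02    0.10E+02   -0.5E+02   100   0.3E+02
DAV:   2    -0.20E+01   -0.12E+02   -0.1E+02   120   0.8E+01
   1 F= -.20000000E+01 E0= -.20000000E+01  d E =-.2E+01
DAV:   1    -0.21E+01   -0.10E+00   -0.1E-01   100   0.2E+00
   2 F= -.21000000E+01 E0= -.21000000E+01  d E =-.1E+00
"""
print(unconverged_ionic_steps(sample, nelm=2))
```

Frames flagged this way would typically be dropped before conversion to a training set, since their energies and forces are not fully converged.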

1.3 Post Workflow

This section is responsible for model-accelerated structure optimization, transition state search, and catalytic reaction network construction. The optimization and transition state search use a local fine-tuning method: a loop of Labeling, Fine-tuning, and Inference that accelerates the optimization and MD process. The section can also construct reaction networks automatically to generate candidate intermediates and transition state structures.
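The Labeling, Fine-tuning, and Inference loop can be pictured with the minimal sketch below. It is not the code in flowopt.py: the three callables `label`, `finetune`, and `infer` are hypothetical stand-ins supplied by the caller, and the defaults simply mirror the `--num_iterations` and `--fmax` values used in the Quick Usage example. The toy one-dimensional "structure" exists only to make the sketch runnable.

```python
import math


def max_force(forces):
    """Largest force magnitude over all atoms."""
    return max(math.sqrt(fx * fx + fy * fy + fz * fz) for fx, fy, fz in forces)


def local_finetune_loop(structure, model, label, finetune, infer,
                        num_iterations=3, fmax=0.1):
    """Alternate reference labeling, model fine-tuning, and model-driven relaxation.

    label(structure)         -> (energy, forces) from the reference method (e.g. DFT)
    finetune(model, dataset) -> model updated on all labeled frames so far
    infer(model, structure)  -> structure relaxed with the current model
    Returns (structure, model, iterations_used).
    """
    dataset = []
    for iteration in range(num_iterations):
        energy, forces = label(structure)            # Labeling
        dataset.append((structure, energy, forces))
        if max_force(forces) < fmax:                 # reference forces small enough
            return structure, model, iteration
        model = finetune(model, dataset)             # Fine-tuning
        structure = infer(model, structure)          # Inference
    return structure, model, num_iterations


# Toy stand-ins: the "structure" is a single coordinate x with E = (x - 1)^2.
def label(x):
    return (x - 1.0) ** 2, [(-2.0 * (x - 1.0), 0.0, 0.0)]


def finetune(model, dataset):
    x, _, forces = dataset[-1]
    return {"x_min": x + forces[0][0] / 2.0}  # harmonic estimate of the minimum


def infer(model, x):
    return model["x_min"]


final_x, _, used = local_finetune_loop(3.0, {}, label, finetune, infer)
print(final_x, used)
```

The point of the design is that the expensive reference calculation is called once per iteration, while the many relaxation steps in between run on the fine-tuned model.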

1.4 Pretrained CLAM for Post Workflow

This section contains the pretrained CLAM model for the post workflow, including its training, fine-tuning, and checkpoint files.

1.5 Scripts

This section contains some useful scripts to generate cluster structures and convert the format of files.

1.6 Initial Structures

This section contains the initial structure files and POSCAR files for VASP optimization and MD calculations to generate datasets.

2. Installation

2.1 Prerequisites

Ensure that your system has the following software installed:

  • Python 3 (version > 3.10)
  • ASE (version > 3.22)
  • Pymatgen (version > 2023.3.23)
  • DeePMD-kit (version > 3.0.0a1)
  • dpdata (version > 0.2.18)
  • fairchem
  • VASP (version > 5.4.4)
  • VASPKIT
  • SLURM (for job scheduling)
  • tqdm (for progress bars)
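A quick way to see what is already installed is sketched below, using only the Python standard library. The distribution names (for example `fairchem-core` and `deepmd-kit`) and the executable names (`sbatch` for SLURM, `vaspkit` for VASPKIT) are assumptions and may need adjusting for your environment. VASP is left out because its executable name varies between installations.

```python
import shutil
from importlib import metadata

# PyPI distribution names are assumptions; adjust them to match your environment.
PYTHON_PACKAGES = ["ase", "pymatgen", "deepmd-kit", "dpdata", "fairchem-core", "tqdm"]
# Executable names are assumptions too (sbatch for SLURM, vaspkit for VASPKIT).
EXECUTABLES = ["sbatch", "vaspkit"]


def installed_versions(names):
    """Map each distribution name to its installed version, or None if missing."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


def found_executables(names):
    """Map each executable name to its full path on PATH, or None if missing."""
    return {name: shutil.which(name) for name in names}


for name, version in installed_versions(PYTHON_PACKAGES).items():
    print(f"{name:15s} {version or 'NOT INSTALLED'}")
for name, path in found_executables(EXECUTABLES).items():
    print(f"{name:15s} {path or 'NOT FOUND'}")
```

The script only reports versions; compare them by eye against the minimums listed above.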

2.2 Installation

  1. Clone the repository:
git clone https://github.com/lalaheihaihei/catalyticLAM.git
  2. Enter the folder:
cd catalyticLAM

3. Quick Usage

3.1 Generate Structures

Navigate to the generation directory and run the appropriate script to generate the desired structures:

  • get-bulk.py: Generates bulk structures.
  • get-slab.py: Generates slab structures.
  • element_list.json: Metal and alloy elements for bulk generation.
  • material.json: Information for database generation.
  • molecule.json: Molecular structures database.
cd generation
python get-bulk.py --api-key Your-Api-Key --bulktype metal --elementNumber 1 --task search --ificsd
python get-bulk.py --plot --api-key Your-Api-Key --min-lw 10.0 --task generate
python get-slab.py --plot --api-key Your-Api-Key --molecule-type CO --up-down UUD --element Au --type type1
python get-slab.py --plot --api-key Your-Api-Key --molecule-type all --up-down UUUUDDDD --element Pd --type type3

For detailed usage, see README.md.

3.2 Run VASP Workflow

Navigate to the vaspworkflow directory, edit the input file as needed, and run flow.py:

  • flow.py: Main workflow script for managing VASP calculations and data processing.
  • input: Input parameter file.
  • POSCAR: Directory containing various POSCAR files and their corresponding VASP calculation results.
  • structure_db: Stores the structure database.
  • utils: Contains configuration files and scripts required for VASP calculations.
cd vaspworkflow
nohup python flow.py POSCAR opt &
python flow.py POSCAR optcheck
nohup python flow.py POSCAR md &
python flow.py POSCAR mdcheck
python flow.py POSCAR dpdata
python flow.py POSCAR plot

For detailed usage, see README.md.

3.3 Run Structure Optimization and Transition State Search

Navigate to the postworkflow directory, prepare input files, and run the relevant scripts:

  • flowopt.py: Workflow script for structure optimization.
  • flowts.py: Workflow script for transition state search.
  • POSCAR: Initial structure file.
  • utils: Contains configuration files and scripts for optimization and transition state search.
cd postworkflow
cd optdp    # or: cd optoc
nohup python ./flowopt.py --num_iterations 3 --steps_per_iteration 200 --fixed_atoms 0 --iffinal true --fmax 0.1 &
cd ../tsdp  # or: cd ../tsoc
nohup python ./flowts.py POSCARis POSCARfs ./frozen_model.pth OUTCARis OUTCARfs &

For detailed usage, see README.md.

3.4 Run Reaction Network Generation

Navigate to the postworkflow/RNET directory, prepare input files, and run the relevant scripts:

  • RNet.py: Generates the reaction network diagram.
  • MakeSlab.py: Constructs all possible structures for the adsorption of intermediates on metal surfaces.
  • plot_all.py: Plots the energy changes and the MAE of the energy differences.
cd postworkflow/RNET
python RNet.py 1 2 --layout spring
python MakeSlab.py --element Pt --max-index 1

For detailed usage, see README.md.
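To give a feel for what enumerating a reaction network involves, here is a deliberately simplified sketch. It is not RNet.py: species are identified by composition only (isomers and adsorption sites are ignored), only two hypothetical elementary step types are considered (removing one H or one O), and the function names are illustrative.

```python
from collections import deque


def formula(c, h, o):
    """Render element counts as a formula string, e.g. (1, 3, 1) -> 'CH3O'."""
    parts = []
    for symbol, count in (("C", c), ("H", h), ("O", o)):
        if count:
            parts.append(symbol if count == 1 else f"{symbol}{count}")
    return "".join(parts)


def build_network(n_c, n_h, n_o):
    """Breadth-first enumeration of intermediates reachable from CxHyOz.

    Returns (species, edges): species are (C, H, O) count tuples in discovery
    order, edges are (reactant, product) pairs, one per elementary step.
    """
    start = (n_c, n_h, n_o)
    seen = {start}
    species = [start]
    edges = []
    queue = deque([start])
    while queue:
        c, h, o = queue.popleft()
        products = []
        if h > 0:
            products.append((c, h - 1, o))  # lose one H
        if o > 0:
            products.append((c, h, o - 1))  # lose one O
        for product in products:
            edges.append(((c, h, o), product))
            if product not in seen:  # the same species can be reached by several paths
                seen.add(product)
                species.append(product)
                queue.append(product)
    return species, edges


species, edges = build_network(1, 2, 1)
for reactant, product in edges:
    print(f"{formula(*reactant)} -> {formula(*product)}")
```

Each species in such a network corresponds to one or more adsorption structures to build, and each edge corresponds to a transition state to search for.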

3.5 Construct Pretrained CLAM for Post Workflow

Navigate to the train directory, edit the input files, and run the training or fine-tuning jobs. Details of CLAM are in README.md.

dp --pt train input.json > out
dp --pt train --finetune model.ckpt.10000000.pt --model-branch <head> finetune.json > out
python main.py --mode train --config-yml finetune1.yml --print-every 1000 >> out
python main.py --mode train --config-yml finetune1.yml --checkpoint gnoc_oc22_oc20_all_s2ef.pt --print-every 1000 >> out

At present, the only supported values for <head> are oc22, qm, and metal.

For detailed usage, see README.md.

For more information, refer to the official DeePMD-kit and fairchem websites.

3.6 Other Scripts

Navigate to the scripts directory and run the appropriate script to generate the cluster structures or convert file formats.

  • cif2pos.py: Convert CIF file to POSCAR.
  • get-cluster.py: Generate the structures of metal clusters in xyz format.
  • json2cif.py: Convert JSON file to CIF file.
  • xyz2pos.py: Convert XYZ file to POSCAR.
  • sim_model.py: For deleting the unnecessary keys in checkpoint files (oc22).
  • cal_nframes.py: Calculate the number of frames in a dataset with dp (deepmd-kit) format.
  • make_test.py: Make a dataset test with lmdb format.

For more details, see README.md.
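As an illustration of what an XYZ-to-POSCAR conversion involves, the sketch below writes a non-periodic structure into a cubic box. It is not the code in xyz2pos.py: the function name, the 20 Å default box, and the output formatting are assumptions made for this example.

```python
def xyz_to_poscar(xyz_text, box=20.0):
    """Convert XYZ text to a POSCAR string, using a cubic box of edge `box` (in Å)."""
    lines = xyz_text.strip().splitlines()
    n_atoms = int(lines[0])
    comment = lines[1].strip() or "converted from xyz"
    atoms = []
    for line in lines[2:2 + n_atoms]:
        symbol, x, y, z = line.split()[:4]
        atoms.append((symbol, float(x), float(y), float(z)))

    # POSCAR lists atoms grouped by element, in the order of the symbols line.
    symbols = list(dict.fromkeys(atom[0] for atom in atoms))
    out = [
        comment,
        "1.0",                                  # universal scaling factor
        f"{box:.6f} 0.000000 0.000000",         # lattice vectors
        f"0.000000 {box:.6f} 0.000000",
        f"0.000000 0.000000 {box:.6f}",
        " ".join(symbols),
        " ".join(str(sum(a[0] == s for a in atoms)) for s in symbols),
        "Cartesian",
    ]
    for s in symbols:
        out += [f"{x:.6f} {y:.6f} {z:.6f}" for sym, x, y, z in atoms if sym == s]
    return "\n".join(out) + "\n"


water_xyz = """3
water
H  0.757 0.586 0.0
O  0.000 0.000 0.0
H -0.757 0.586 0.0
"""
print(xyz_to_poscar(water_xyz))
```

The regrouping step matters because XYZ files may list elements in any order, whereas POSCAR gives one count per element and expects the coordinates in that same order.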

3.7 Initial Structures

The structure_db directory contains compressed archives of the initial structures:

  • 2D.tgz: All 6351 POSCAR files of 2D materials for VASP calculation.
  • 2D-raw.tgz: The initial json file containing the information of 2D materials and the corresponding cif files.
  • bulk.tgz: The POSCAR files of metals and alloys for VASP calculation.
  • cluster.tgz: The POSCAR files of clusters for VASP calculation.
  • cluster-raw.tgz: The initial xyz files of clusters.
  • molecule.tgz: All POSCAR files of molecules for VASP calculation.
  • molecule-raw.tgz: The initial xyz files of molecules.
  • slab.tgz: The POSCAR files of slabs for VASP calculation.
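The archives can be inspected without unpacking them. The sketch below counts the regular files in a .tgz archive with the standard-library `tarfile` module; the helper name is illustrative, and the demo runs on a throwaway archive so that it works without the repository data.

```python
import io
import os
import tarfile
import tempfile


def count_files(archive_path):
    """Count regular files inside a .tgz archive without extracting it."""
    with tarfile.open(archive_path, "r:gz") as tar:
        return sum(1 for member in tar if member.isfile())


# Self-contained demo on a small temporary archive with two placeholder files.
with tempfile.TemporaryDirectory() as tmp:
    demo = os.path.join(tmp, "demo.tgz")
    with tarfile.open(demo, "w:gz") as tar:
        for name in ("a/POSCAR", "b/POSCAR"):
            data = b"placeholder\n"
            info = tarfile.TarInfo(name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    print(count_files(demo))
```

Run from structure_db, `count_files("2D.tgz")` should report 6351 if the archive holds exactly one file per 2D material, which is an assumption about its layout.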

4. License

This project is licensed under the GNU General Public License v3.0. See the LICENSE file for details.

5. Acknowledgements

6. Citation

Please cite the work below if this repository is helpful.

Wu Z, Zhou L, Hou P, Liu Y, Guo T, Liu J-C. Catalytic Large Atomic Model (CLAM): A Machine-Learning-Based Interatomic Potential Universal Model. ChemRxiv. 2024; doi:10.26434/chemrxiv-2024-2xzct 
