Here is an overview of the DeePMD-kit algorithm. Given a specific centric atom, we can obtain the matrix describing its local environment, named the environment matrix. In the vanilla algorithm, the environment matrix is fed into the `embedding net`, and the resulting descriptor into the `fitting net`, to get the predicted energy. Both networks are built relying on the atom type, so a separate `embedding net` is needed for every pair of atom types and a separate `fitting net` for every atom type. With atom type embedding, the type is instead encoded as a trainable vector appended to the network input, so the networks no longer depend on the atom type, resulting in one `embedding net` and one `fitting net` in total, which largely reduces training complexity.
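To make the idea concrete, here is a minimal NumPy sketch (illustrative only, not DeePMD-kit code; all sizes are example values) of how a trainable per-type vector lets one shared network replace the per-type networks:

```python
import numpy as np

ntypes, nchanl = 4, 8  # 4 atom types, 8 type-embedding channels (example values)

# Vanilla scheme: one embedding net per (centric, neighbor) type pair,
# plus one fitting net per centric atom type.
vanilla_nets = ntypes * ntypes + ntypes

# Type-embedding scheme: each type maps to a trainable vector that is
# appended to the network input, so a single embedding net and a single
# fitting net serve all types.
type_embedding = np.random.rand(ntypes, nchanl)  # trainable lookup table
shared_nets = 1 + 1

atom_type = 2
features = np.random.rand(16)  # stand-in for per-atom environment features
net_input = np.concatenate([features, type_embedding[atom_type]])

print(vanilla_nets, shared_nets, net_input.shape)
```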
In the following chart, you can find the meaning of symbols used to clarify the atom-type embedding algorithm.

| Symbol   | Meaning                                          |
| -------- | ------------------------------------------------ |
| `ntypes` | number of atom types in the system               |
| `natoms` | number of atoms in the system                    |
| `nchanl` | number of channels of the type embedding vector  |
So, we can formulate the training process as follows.

Vanilla DeePMD-kit algorithm:

$$E = \sum_i E_i = \sum_i \mathcal{F}^{t_i}\big(\mathcal{D}^{t_i,\,t_j}(\mathcal{R}_i)\big)$$

where $\mathcal{R}_i$ is the environment matrix of atom $i$, $\mathcal{D}$ the embedding net, $\mathcal{F}$ the fitting net, and the superscripts $t_i$, $t_j$ (the types of the centric atom and its neighbors) indicate that a separate network is trained for each type combination.

DeePMD-kit applying atom type embedding:

$$E = \sum_i E_i = \sum_i \mathcal{F}\big(\big[\mathcal{D}\big(\mathcal{R}_i, A(t_j)\big), A(t_i)\big]\big)$$

or

$$E = \sum_i E_i = \sum_i \mathcal{F}\big(\big[\mathcal{D}\big(\mathcal{R}_i, A(t_i), A(t_j)\big), A(t_i)\big]\big)$$

where $A$ is the type embedding net that maps an atom type to a vector of length $\text{nchanl}$, and $[\cdot,\cdot]$ denotes concatenation.
The difference between the two variants above is whether the information of the centric atom is used when generating the descriptor. Users can choose between them by modifying the `type_one_side` hyper-parameter in the input JSON file.
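For example, in a `se_e2_a` descriptor section the hyper-parameter appears as follows (the surrounding values are illustrative):

```json
"descriptor": {
    "type": "se_e2_a",
    "rcut": 6.0,
    "neuron": [25, 50, 100],
    "type_one_side": true
}
```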
A detailed introduction can be found at `se_e2_a_tebd`. For a fast start-up, you can simply add a `type_embedding` section in the input JSON file as displayed in the following, and the algorithm will adopt the atom type embedding algorithm automatically. An example of `type_embedding` is:
"type_embedding":{
"neuron": [2, 4, 8],
"resnet_dt": false,
"seed": 1
}
Atom-type embedding can be applied to various `embedding net` and `fitting net` architectures; as a result, we build a class `TypeEmbedNet` to support this free combination. In the following, we will go through the execution process of the code to explain our code modifications.
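As a sketch of the role this class plays (a simplified stand-in, not the actual DeePMD-kit implementation), a type embedding net is a small network that maps a one-hot type encoding to a vector:

```python
import numpy as np

class TypeEmbedNet:
    """Toy stand-in: maps each atom type to an nchanl-dim vector
    via a single layer applied to the one-hot type encoding."""

    def __init__(self, ntypes, neuron, seed=1):
        rng = np.random.default_rng(seed)
        # single weight matrix; the real class stacks the layers in `neuron`
        self.w = rng.standard_normal((ntypes, neuron[-1]))

    def build(self, ntypes):
        # one-hot [ntypes, ntypes] @ w [ntypes, nchanl] -> [ntypes, nchanl]
        one_hot = np.eye(ntypes)
        return np.tanh(one_hot @ self.w)

net = TypeEmbedNet(ntypes=3, neuron=[2, 4, 8])
ebd = net.build(3)
print(ebd.shape)  # one embedding row per atom type
```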
In `trainer.py`, the parameters are parsed from the input JSON file. If a `type_embedding` section is detected, a `TypeEmbedNet` will be built, which will later be passed into the `model`. The `model` will be built in the function `_build_network`.
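The control flow can be sketched as follows (hypothetical names mirroring the description above, not the actual trainer code):

```python
import json

# minimal input file containing a type_embedding section
jdata = json.loads("""
{
  "model": {
    "type_embedding": {"neuron": [2, 4, 8], "resnet_dt": false, "seed": 1}
  }
}
""")

type_embedding_param = jdata["model"].get("type_embedding")
if type_embedding_param is not None:
    # here a TypeEmbedNet would be constructed and handed to the model,
    # which later builds its graph in the _build_network step
    typeebd = {"neuron": type_embedding_param["neuron"]}
else:
    typeebd = None

print(typeebd)
```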
When building the operation graph of the `model` in `model.build`, if a `TypeEmbedNet` is detected, the operation graphs of `type embed net`, `embedding net` and `fitting net` will be built in order. The building process of `type embed net` can be found in `TypeEmbedNet.build`, which outputs the type embedding vector of each atom type (of [$\text{ntypes} \times \text{nchanl}$] dimensions). We then save the type embedding vectors into `input_dict`, so that they can be fetched later in `embedding net` and `fitting net`.
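A minimal sketch of that hand-off (illustrative only, assuming a plain dict stands in for `input_dict`):

```python
import numpy as np

ntypes, nchanl = 3, 8

# stand-in for the output of TypeEmbedNet.build:
# one row of nchanl channels per atom type -> [ntypes, nchanl]
type_embedding = np.random.default_rng(1).standard_normal((ntypes, nchanl))

input_dict = {}
input_dict["type_embedding"] = type_embedding  # fetched later by both nets

print(input_dict["type_embedding"].shape)
```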
In `embedding net`, we take the local environment matrix as input and obtain the descriptor as output. The call stack is:

`build` -> `_pass_filter` -> `_filter` -> `_filter_lower`
- `_pass_filter`: It will first detect whether an atom type embedding exists; if so, it will apply the atom type embedding algorithm and will not divide the input by type.
- `_filter`: It will call the `_filter_lower` function to obtain the result of the matrix multiplication of the embedding matrix $\mathcal{G}$ and the environment matrix $\mathcal{R}$.
- `_filter_lower`: The main function handling the input modification. If type embedding exists, it will call the `_concat_type_embedding` function to concatenate the first column of the input with the type embedding vectors, controlled by `type_one_side` (if set to `True`, then we only use the vector of the neighbor atom). The modified input will be put into the embedding net to get the embedding matrix $\mathcal{G}$.
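A NumPy sketch of this concatenation step (hypothetical shapes; the real code operates on flattened TensorFlow tensors):

```python
import numpy as np

npairs = 5   # number of (centric atom, neighbor) pairs, flattened
nchanl = 4   # type-embedding channels

s_rij = np.random.rand(npairs, 1)          # first column of the environment matrix
nei_ebd = np.random.rand(npairs, nchanl)   # type embedding of the neighbor atoms
cen_ebd = np.random.rand(npairs, nchanl)   # type embedding of the centric atoms

def concat_type_embedding(s_rij, nei_ebd, cen_ebd, type_one_side):
    # type_one_side=True: only the neighbor's type vector is appended
    if type_one_side:
        return np.concatenate([s_rij, nei_ebd], axis=1)
    # otherwise the centric atom's type vector is appended as well
    return np.concatenate([s_rij, nei_ebd, cen_ebd], axis=1)

one_side = concat_type_embedding(s_rij, nei_ebd, cen_ebd, True)
two_side = concat_type_embedding(s_rij, nei_ebd, cen_ebd, False)
print(one_side.shape, two_side.shape)
```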
In `fitting net`, the descriptor vector is taken as input, whose dimension is [natoms, descriptor size]. If atom type embedding is used, the fitting net additionally needs `atype_embed` (of dim [natoms, nchanl]), in which each row is the type embedding vector of the specific centric atom. The input is sorted by the type of the centric atom, and we also know the number of atoms of each particular type (stored in `natoms[2+i]`), thus we can construct the type vector of the centric atoms. In the build phase of the fitting net, it will check whether type embedding exists in `input_dict` and fetch it. After that, the `embed_atom_type` function is called to look up the embedding vector for the type vector of the centric atoms to obtain `atype_embed`, which is concatenated with the input ([input, atype_embed]). The modified input goes through the `fitting net` to get the predicted energy.
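The lookup-and-concatenation described above can be sketched like this (illustrative NumPy, with a simplified per-type count layout standing in for `natoms[2+i]`):

```python
import numpy as np

nchanl = 4
# stand-in for the [ntypes, nchanl] type embedding matrix from the type embed net
type_embedding = np.arange(3 * nchanl, dtype=float).reshape(3, nchanl)

# atoms are sorted by centric-atom type; counts per type as stored in natoms[2+i]
counts = [2, 1, 3]                       # 2 atoms of type 0, 1 of type 1, 3 of type 2
atype = np.repeat(np.arange(3), counts)  # per-atom type vector: [0, 0, 1, 2, 2, 2]

# lookup equivalent to embed_atom_type: one embedding row per centric atom
atype_embed = type_embedding[atype]      # [natoms, nchanl]

descriptor = np.random.rand(len(atype), 10)                # [natoms, descriptor size]
fitting_input = np.concatenate([descriptor, atype_embed], axis=1)

print(fitting_input.shape)
```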
:::{note}
You can't apply the compression method while using atom-type embedding.
:::