See https://sites.google.com/site/deepernn/home/blog/amistakeinwangchoberthasamouthanditmustspeakbertasamarkovrandomfieldlanguagemodel for a description of a mistake in the paper: BERT seems to be a non-equilibrium language model, not a Markov random field (MRF) language model.
See https://arxiv.org/abs/1902.04094 for the paper itself.
@article{wang2019bert,
title={{BERT} has a Mouth, and It Must Speak: {BERT} as a {M}arkov Random Field Language Model},
author={Wang, Alex and Cho, Kyunghyun},
journal={arXiv preprint arXiv:1902.04094},
year={2019}
}