A Unified Framework for Data-Driven Atomistic Modeling
Author(s)
Lei, Xiangyun
Abstract
With advances in computing, atomistic modeling has become an important tool in many fields of science over recent decades. This work focuses on two important and popular classes of atomistic modeling methods: density functional theory (DFT) and molecular dynamics force fields (FF). They are considered together because both follow a similar abstract formalism: each consists of a model space that describes the local chemical environment, and a model that connects the model space to the local property of interest. Hence, they also face similar difficulties in their development. Neither has an established strategy for systematically improving the quality of its model space, and both suffer from models that are not arbitrarily accurate. These are the challenges that this work attempts to address.
In this dissertation, a new framework for describing local chemical environments, called the multipole (MP) descriptor family, is proposed and examined. It possesses many desirable properties: it is mathematically complete, systematically improvable, physically meaningful, and applicable to both electronic and atomic environments. As examples, three specific cases of the MP descriptor family, the Heaviside step multipole (HSMP), Legendre polynomial multipole (LPMP), and Gaussian multipole (GMP) descriptors, are formulated and used to build the model spaces for DFT and FF models. In addition, tools and protocols are developed for training models that connect the MP descriptors to energies for DFT and FFs. This work focuses on building data-driven machine learning (ML) models, because ML has been shown to be a promising alternative to analytical models. The new tools include a near-uniform sampling algorithm and software packages for training and understanding the models. An interactive visualization tool called Electrolens is also developed to help scientists explore and gain intuition about the model spaces and models of both DFT and FF.
Finally, based on the applications of the MP descriptors in electronic and atomic systems, a framework called chemical environment modeling theory (CEMT) is proposed. CEMT offers a new perspective that generalizes and unifies the concepts of FF (atomistic and coarse-grained) and DFT, and shows that they are two special cases of a continuum of methods. A data-driven approach to accelerating model training under the CEMT framework is also proposed. Together, the MP descriptor family and this new framework could help guide the development of the next generation of data-driven atomistic modeling methods.
Date
2021-08-06
Resource Type
Text
Resource Subtype
Dissertation