Machine Learning and Data Science

We develop and apply machine learning and, more generally, data science approaches both in the context of large-scale atomistic simulations and for imaging and characterization of nanoscale systems.

Machine learning includes automated learning methods such as genetic algorithms and neural networks. By combining machine learning with methods from data science such as pattern recognition, computer vision, and multi-objective optimization, we can carry out and analyze efficient atomistic simulations, as well as develop atomistic models from multimodal imaging data.

Genetic algorithms, in conjunction with other optimization approaches, are used to develop efficient potential energy surface models, or force fields (FFs), for use in molecular dynamics (MD) simulations. Inputs for the FF model can include both empirical and first-principles information. Successful FFs have been developed for novel 2D materials such as stanene, and a highly efficient and accurate FF for water has recently been developed, allowing mesoscopic mechanisms (e.g., ice-grain formation) to be elucidated.
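
As a rough illustration of how a genetic algorithm can fit force-field parameters, the sketch below recovers the two parameters of a Lennard-Jones pair potential from reference dimer energies. It is a minimal toy under stated assumptions, not the procedure used for the stanene or water FFs: the Lennard-Jones form, the function names (lj_energy, fitness, fit_lj_with_ga), the parameter bounds, and the synthetic reference data are all hypothetical choices made for the example.

```python
import numpy as np

def lj_energy(r, epsilon, sigma):
    """Lennard-Jones pair energy; a stand-in for a real FF functional form."""
    sr6 = (sigma / r) ** 6
    return 4.0 * epsilon * (sr6 ** 2 - sr6)

def fitness(params, r, e_ref):
    """Mean squared error against the reference energies (lower is better)."""
    epsilon, sigma = params
    return np.mean((lj_energy(r, epsilon, sigma) - e_ref) ** 2)

def fit_lj_with_ga(r, e_ref, bounds, pop_size=60, generations=200, seed=0):
    """Toy GA: elitist selection, blend crossover, annealed Gaussian mutation."""
    rng = np.random.default_rng(seed)
    lo, hi = np.array(bounds, dtype=float).T
    pop = rng.uniform(lo, hi, size=(pop_size, 2))
    n_elite = pop_size // 4
    n_child = pop_size - n_elite
    for gen in range(generations):
        scores = np.array([fitness(p, r, e_ref) for p in pop])
        # Selection: keep the best quarter of the population unchanged.
        elite = pop[np.argsort(scores)[:n_elite]]
        # Crossover: each child is a random blend of two elite parents.
        parents_a = elite[rng.integers(n_elite, size=n_child)]
        parents_b = elite[rng.integers(n_elite, size=n_child)]
        w = rng.uniform(size=(n_child, 1))
        children = w * parents_a + (1.0 - w) * parents_b
        # Mutation: Gaussian noise whose width shrinks as the run proceeds.
        scale = 0.1 * (hi - lo) * (1.0 - gen / generations)
        children += rng.normal(scale=scale, size=children.shape)
        pop = np.vstack([elite, np.clip(children, lo, hi)])
    scores = np.array([fitness(p, r, e_ref) for p in pop])
    return pop[np.argmin(scores)]

# Synthetic "reference" data generated with epsilon = 1, sigma = 1 (assumed).
r = np.linspace(0.9, 2.5, 30)
e_ref = lj_energy(r, 1.0, 1.0)
best = fit_lj_with_ga(r, e_ref, bounds=[(0.1, 2.0), (0.5, 2.0)])
print("fitted epsilon, sigma:", best)
```

In a realistic setting the reference energies would come from experiment or first-principles calculations, and the fitness would typically combine several properties rather than a single energy curve.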

We have developed a framework that connects data science and machine learning techniques, multimodal imaging, and first-principles and atomistic modeling. It integrates multiple experimental characterization results and first-principles calculations through a common data-science-based platform. This platform involves multi-objective genetic algorithms (GA), the scale-invariant feature transform, compressive sensing, and dictionary learning.
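
A multi-objective GA has to rank candidates against several objectives at once, for example the mismatch between a trial structure and each experimental modality. One common ingredient is Pareto dominance: a candidate is kept if no other candidate is at least as good on every objective and strictly better on one. The sketch below shows that filter in isolation; the function name pareto_front and the example cost values are hypothetical, and the platform described above may use a different selection scheme.

```python
import numpy as np

def pareto_front(costs):
    """Return indices of non-dominated rows of `costs` (all objectives minimized)."""
    costs = np.asarray(costs, dtype=float)
    keep = []
    for i, c in enumerate(costs):
        # Row j dominates c if it is <= c on every objective and < c on at least one.
        no_worse = np.all(costs <= c, axis=1)
        strictly_better = np.any(costs < c, axis=1)
        if not np.any(no_worse & strictly_better):
            keep.append(i)
    return keep

# Each row: (misfit to modality A, misfit to modality B) for one candidate model.
candidates = [[1.0, 5.0], [2.0, 2.0], [5.0, 1.0], [3.0, 3.0], [2.0, 2.5]]
print(pareto_front(candidates))  # → [0, 1, 2]
```

The last two candidates are dropped because the second candidate matches or beats them on both objectives; the surviving three represent different trade-offs between the two misfits.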

The approach makes it possible, for example, to infer 3-D atomistic structures that are not explicitly evident in the experimental data.