Abstract: X-ray absorption spectroscopy (XAS) is a robust characterization technique for probing local environments, oxidation states, and electronic states. Conventional interpretation of XAS data requires a set of experimental fingerprints, but the scarcity of such data hampers efficient and effective analysis. Computational spectroscopy is a widely adopted alternative to experimental reference spectra. With the maturing of spectroscopy methodologies and growing computing power, computational packages can now deliver spectra comparable to experimental ones at reasonable cost. This talk will introduce XASdb, the world’s largest collection of computed XAS spectra to date, which contains more than 540,000 K-edge and 140,000 L-edge XANES spectra for over 40,000 unique materials.
This large-scale database also lays the foundation for data-hungry machine learning applications, two of which will be covered in this talk. The first is the development and implementation of the Ensemble-Learned Spectra IdEntification (ELSIE) algorithm, which leverages ensemble learning techniques to identify similar XANES spectra in XASdb. This spectral matching algorithm allows any user to compare multiple X-ray absorption spectra and to find matches in XASdb for an uploaded spectrum. The second applies machine learning models to predict the chemical environment from XAS spectra. Random forest models stand out in this multi-label classification task and exhibit reasonable prediction accuracy. These findings indicate that combining XASdb with these machine learning techniques will provide an invaluable resource for the materials research community, greatly improving the efficiency with which experimental XAS spectra can be analyzed.
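
To make the idea of spectral matching concrete, the following is a minimal sketch of ranking reference spectra by similarity to a query spectrum. It is not the ELSIE algorithm: the abstract does not specify its preprocessing or metrics, so the choice of linear interpolation onto a shared grid, the simple average of cosine and Pearson similarity, the function names, and the synthetic spectra are all assumptions made for illustration.

```python
import numpy as np

def resample(energy, mu, grid):
    """Linearly interpolate a spectrum onto a shared energy grid."""
    return np.interp(grid, energy, mu)

def cosine_similarity(a, b):
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def pearson_similarity(a, b):
    return float(np.corrcoef(a, b)[0, 1])

def rank_matches(query, references, grid):
    """Rank references by the mean of two similarity scores, best first.

    Averaging two metrics is an illustrative stand-in for an ensemble;
    it is an assumption, not the method described in the talk.
    """
    q = resample(query[0], query[1], grid)
    scores = {}
    for name, (energy, mu) in references.items():
        r = resample(energy, mu, grid)
        scores[name] = 0.5 * (cosine_similarity(q, r) + pearson_similarity(q, r))
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

def toy_spectrum(center):
    """Synthetic XANES-like curve: an edge step plus one Gaussian peak."""
    energy = np.linspace(-5.0, 55.0, 300)  # relative energy, arbitrary units
    step = 1.0 / (1.0 + np.exp(-(energy - 10.0)))
    peak = 0.8 * np.exp(-0.5 * ((energy - center) / 2.0) ** 2)
    return energy, step + peak

grid = np.linspace(0.0, 50.0, 200)
references = {"ref_a": toy_spectrum(15.0), "ref_b": toy_spectrum(30.0)}
query = toy_spectrum(16.0)  # peak position close to ref_a

ranking = rank_matches(query, references, grid)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

In this toy case the query's peak lies nearest that of `ref_a`, so `ref_a` ranks first. A real workflow would also need energy alignment and normalization of experimental spectra before comparison.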