Invited Lecture
STATISTICAL METHODS AND MACHINE LEARNING OF DIFFRACTION DATA FOR MICROSTRUCTURE ANALYSIS


Alexander Eggeman 1, Ben Martineau 2, Paul Midgley 2
1 School of Materials, University of Manchester, Manchester, UK
2 Department of Materials Science and Metallurgy, University of Cambridge, Cambridge, UK

Data science approaches have been successfully applied to a variety of electron microscopy measurements, resulting in highly detailed analysis of the composition, electronic structure, crystallography and atomic structure of complex materials systems. The goal in all of these approaches is to take a large set of measurements from a sample and then to utilise the redundancy in the data to recover a model, or a set of significant factors, that can be associated with the physically distinct components in the material. In the case of diffraction data, the goal is to identify the unique diffraction pattern that is associated with each component or region within a microstructure and that can be used to characterise that component.

Statistical decomposition is one widely used approach, using matrix factorisation to isolate the signals that describe the majority of the structured part of the data as efficiently as possible. This has proven valuable in the analysis of scanning diffraction data, especially in the fairly common situation where an individual diffraction pattern contains information about one or more overlapping phases in the microstructure being studied. The value of these approaches for analysing both simulated and experimental data will be addressed.
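As an illustration of the kind of matrix factorisation described above, the following is a minimal sketch only. It assumes non-negative matrix factorisation (NMF) with multiplicative updates as the decomposition, which this abstract does not specify. The function name `nmf`, the synthetic peak positions and the array sizes are all hypothetical choices made for the example.

```python
import numpy as np

def nmf(X, n_components, n_iter=500, seed=0, eps=1e-12):
    """Factorise a non-negative matrix X (n_patterns x n_pixels) as W @ H.

    Rows of H play the role of component patterns; rows of W hold the
    weight of each component in each measured pattern.
    """
    rng = np.random.default_rng(seed)
    n_patterns, n_pixels = X.shape
    W = rng.random((n_patterns, n_components))
    H = rng.random((n_components, n_pixels))
    for _ in range(n_iter):
        # Multiplicative updates keep W and H non-negative.
        H *= (W.T @ X) / (W.T @ W @ H + eps)
        W *= (X @ H.T) / (W @ H @ H.T + eps)
    return W, H

# Synthetic stand-in for scanning diffraction data (assumed, noise-free):
# two component "patterns", each flattened to 64 pixels with three peaks.
n_pixels = 64
pattern_a = np.zeros(n_pixels)
pattern_a[[5, 20, 41]] = [1.0, 0.6, 0.3]
pattern_b = np.zeros(n_pixels)
pattern_b[[12, 33, 50]] = [0.8, 1.0, 0.4]
true_H = np.vstack([pattern_a, pattern_b])

# Each of 200 probe positions sees an overlap of the two components.
rng = np.random.default_rng(1)
fractions = rng.random(200)
true_W = np.column_stack([fractions, 1.0 - fractions])
X = true_W @ true_H

W, H = nmf(X, n_components=2)
rel_error = np.linalg.norm(X - W @ H) / np.linalg.norm(X)
print(f"relative reconstruction error: {rel_error:.2e}")
```

Here each row of `X` stands for one flattened diffraction pattern. Experimental data would also contain noise and background, which this sketch ignores.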

Further to this, the decomposition stage can also be thought of as a means of lowering the dimensionality of the data-analysis problem. This allows new approaches such as C-means or ‘fuzzy’ clustering to be applied. In high-dimensional data the ‘distance’ between two points (measurements) is difficult to interpret, whereas in low dimensions it becomes a more understandable quantity. Using such distance metrics allows like and unlike patterns to be distinguished via computational means alone. The distribution, shape and interaction of these clusters of measurements offer not only a way to improve the separation of overlapping signals but also the tantalising possibility of a machine-learnable method for identifying different microstructural features.
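The clustering step can be sketched as follows. This is a minimal illustration of the standard fuzzy C-means update rules with a Euclidean distance, applied to synthetic two-dimensional points that stand in for low-dimensional decomposition loadings. The function name `fuzzy_c_means`, the fuzzifier value `m = 2` and the test data are assumptions made for the example, not details taken from the work described here.

```python
import numpy as np

def fuzzy_c_means(points, n_clusters, m=2.0, n_iter=100, seed=0, eps=1e-9):
    """Standard fuzzy C-means with Euclidean distance.

    Returns the cluster centres and a membership matrix U whose rows
    sum to 1 (one row per point, one column per cluster).
    """
    rng = np.random.default_rng(seed)
    n_points = points.shape[0]
    # Random initial memberships, normalised per point.
    U = rng.random((n_points, n_clusters))
    U /= U.sum(axis=1, keepdims=True)
    for _ in range(n_iter):
        Um = U ** m
        # Centres are membership-weighted means of the points.
        centres = (Um.T @ points) / Um.sum(axis=0)[:, None]
        # Distance from every point to every centre.
        dist = np.linalg.norm(points[:, None, :] - centres[None, :, :], axis=2) + eps
        # Membership falls off with distance; m controls the fuzziness.
        inv = dist ** (-2.0 / (m - 1.0))
        U = inv / inv.sum(axis=1, keepdims=True)
    return centres, U

# Synthetic low-dimensional points (assumed): two well-separated groups
# plus one point midway between them, mimicking an overlapping signal.
rng = np.random.default_rng(2)
group_a = rng.normal(loc=(0.0, 0.0), scale=0.3, size=(50, 2))
group_b = rng.normal(loc=(5.0, 5.0), scale=0.3, size=(50, 2))
midpoint = np.array([[2.5, 2.5]])
points = np.vstack([group_a, group_b, midpoint])

centres, U = fuzzy_c_means(points, n_clusters=2)
labels = U.argmax(axis=1)
print("memberships of the midway point:", U[-1])
```

Unlike a hard clustering, the membership matrix `U` assigns the midway point a share in both clusters, which is the property that makes fuzzy clustering attractive when measurements contain overlapping signals.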

Alexander Eggeman
University of Manchester