I have spent some time thinking about fundamental questions in crafting inputs to atomistic ML models, questions that are closely tied to the models' architectures. More specifically, in terms of descriptors: what ingredients are central to their mathematical representations? How can we make them equivariant with respect to the physical symmetries of structures in 3D Euclidean space? How can we ensure that two structures unrelated by symmetry are mapped to different descriptors? How can we incorporate long-range (Coulomb) interactions into machine learning models while keeping a local description of atomic environments? Another line of my research has been identifying the similarities and differences between models that rely on these descriptors as inputs and models that work directly with input structures (such as graph neural networks).
Related publications
2024
Expanding density-correlation machine learning representations for anisotropic coarse-grained particles
Arthur Lin, Kevin K Huguenin-Dumittan, Yong-Cheol Cho, Jigyasa Nigam, and Rose K Cersonsky