5 months ago

EGraFFBench: Evaluation of Equivariant Graph Neural Network Force Fields for Atomistic Simulations

Vaibhav Bihani; Utkarsh Pratiush; Sajid Mannan; Tao Du; Zhimin Chen; Santiago Miret; Matthieu Micoulaut; Morten M Smedskjaer; Sayan Ranu; N M Anoop Krishnan

Abstract

Equivariant graph neural network force fields (EGraFFs) have shown great promise in modelling complex interactions in atomic systems by exploiting the graphs' inherent symmetries. Recent works have led to a surge in the development of novel architectures that incorporate equivariance-based inductive biases alongside architectural innovations like graph transformers and message passing to model atomic interactions. However, a thorough evaluation of deploying these EGraFFs for the downstream task of real-world atomistic simulations is lacking. To this end, here we perform a systematic benchmarking of 6 EGraFF algorithms (NequIP, Allegro, BOTNet, MACE, Equiformer, TorchMDNet), with the aim of understanding their capabilities and limitations for realistic atomistic simulations. In addition to our thorough evaluation and analysis on eight existing datasets based on the benchmarking literature, we release two new benchmark datasets and propose four new metrics and three challenging tasks. The new datasets and tasks evaluate the performance of EGraFFs on out-of-distribution data, in terms of different crystal structures, temperatures, and new molecules. Interestingly, evaluation of the EGraFF models based on dynamic simulations reveals that having a lower error on energy or force does not guarantee stable or reliable simulation or faithful replication of the atomic structures. Moreover, we find that no model clearly outperforms other models on all datasets and tasks. Importantly, we show that the performance of all the models on out-of-distribution datasets is unreliable, pointing to the need for the development of a foundation model for force fields that can be used in real-world simulations. In summary, this work establishes a rigorous framework for evaluating machine learning force fields in the context of atomic simulations and points to open research challenges within this domain.

Benchmarks

Benchmark	Methodology	Metrics
formation-energy-on-3bpa	BOTNet	MAE: 5
formation-energy-on-3bpa	Allegro	MAE: 4.13
formation-energy-on-3bpa	NequIP	MAE: 3.15
formation-energy-on-3bpa	MACE	MAE: 4
formation-energy-on-acetylacetone	Allegro	MAE: 0.92
formation-energy-on-acetylacetone	NequIP	MAE: 1.38
formation-energy-on-acetylacetone	MACE	MAE: 2
formation-energy-on-acetylacetone	BOTNet	MAE: 2
formation-energy-on-aspirin	BOTNet	MAE: 12.63
formation-energy-on-aspirin	Allegro	MAE: 14.36
formation-energy-on-aspirin	MACE	MAE: 13.79
formation-energy-on-aspirin	NequIP	MAE: 9.27
formation-energy-on-ethanol	BOTNet	MAE: 203.83
formation-energy-on-ethanol	MACE	MAE: 209.96
formation-energy-on-ethanol	Allegro	MAE: 6.94
formation-energy-on-ethanol	NequIP	MAE: 4.99
formation-energy-on-gete	MACE	MAE: 2670
formation-energy-on-gete	Allegro	MAE: 1009.4
formation-energy-on-gete	BOTNet	MAE: 3034
formation-energy-on-gete	NequIP	MAE: 1780.951
formation-energy-on-lips	MACE	MAE: 30
formation-energy-on-lips	NequIP	MAE: 165.43
formation-energy-on-lips	BOTNet	MAE: 28
formation-energy-on-lips	Allegro	MAE: 31.75
formation-energy-on-lips20	NequIP	MAE: 26.8
formation-energy-on-lips20	Allegro	MAE: 33.17
formation-energy-on-lips20	BOTNet	MAE: 24.59
formation-energy-on-lips20	MACE	MAE: 14.05
formation-energy-on-naphthalene	MACE	MAE: 161.74
formation-energy-on-naphthalene	NequIP	MAE: 2.66
formation-energy-on-naphthalene	Allegro	MAE: 5.82
formation-energy-on-naphthalene	BOTNet	MAE: 182.55
formation-energy-on-salicylic-acid	NequIP	MAE: 6.29
formation-energy-on-salicylic-acid	MACE	MAE: 165.29
formation-energy-on-salicylic-acid	BOTNet	MAE: 153.06
formation-energy-on-salicylic-acid	Allegro	MAE: 8.59
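
The table above can be summarized by picking the lowest-MAE model for each dataset. The following is a minimal sketch, not part of the paper's code: the names `ROWS`, `best_per_benchmark`, `best`, and `wins` are illustrative, the `formation-energy-on-` prefix is dropped from the dataset names for brevity, and the values are copied from the table as listed. The table does not state units, so values are compared only within a dataset. Only the four models listed in the table are covered.

```python
from collections import Counter

# (dataset, model, MAE) rows copied from the benchmark table above.
# Units are not given in the table, so MAE values are only compared
# between models on the same dataset.
ROWS = """
3bpa BOTNet 5
3bpa Allegro 4.13
3bpa NequIP 3.15
3bpa MACE 4
acetylacetone Allegro 0.92
acetylacetone NequIP 1.38
acetylacetone MACE 2
acetylacetone BOTNet 2
aspirin BOTNet 12.63
aspirin Allegro 14.36
aspirin MACE 13.79
aspirin NequIP 9.27
ethanol BOTNet 203.83
ethanol MACE 209.96
ethanol Allegro 6.94
ethanol NequIP 4.99
gete MACE 2670
gete Allegro 1009.4
gete BOTNet 3034
gete NequIP 1780.951
lips MACE 30
lips NequIP 165.43
lips BOTNet 28
lips Allegro 31.75
lips20 NequIP 26.8
lips20 Allegro 33.17
lips20 BOTNet 24.59
lips20 MACE 14.05
naphthalene MACE 161.74
naphthalene NequIP 2.66
naphthalene Allegro 5.82
naphthalene BOTNet 182.55
salicylic-acid NequIP 6.29
salicylic-acid MACE 165.29
salicylic-acid BOTNet 153.06
salicylic-acid Allegro 8.59
"""


def best_per_benchmark(text):
    """Return {dataset: (model, mae)} for the lowest-MAE model on each dataset."""
    best = {}
    for line in text.strip().splitlines():
        dataset, model, mae = line.split()
        mae = float(mae)
        # Keep the first model seen with the strictly lowest MAE.
        if dataset not in best or mae < best[dataset][1]:
            best[dataset] = (model, mae)
    return best


best = best_per_benchmark(ROWS)
# Count how many datasets each model "wins" on this single metric.
wins = Counter(model for model, _ in best.values())

for dataset, (model, mae) in best.items():
    print(f"{dataset:15s} {model:8s} {mae}")
print(dict(wins))
```

On these rows, NequIP has the lowest MAE on five of the nine datasets, Allegro on two, and BOTNet and MACE on one each. This covers a single energy-error metric only, but it is in line with the abstract's statement that no model clearly outperforms the others on all datasets.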
