Molecular dynamics (MD) simulations are a fundamental tool for studying processes in computational chemistry and biology at the atomic level. Molecular dynamics works by propagating atomic positions over time via Newton's equations of motion [1], which requires atomic forces at every timestep. To ensure good accuracy, these forces are often computed with quantum chemical methods, in so-called ab initio molecular dynamics [2]. However, the high computational cost of these quantum chemical methods severely limits their applicability, especially for use cases in biology and materials science. In recent years, graph neural networks (GNNs) have shown particular promise in approximating and replacing these expensive methods, thereby accelerating MD simulations by multiple orders of magnitude. A properly trained GNN can thus unlock MD simulations at unprecedented scale and accuracy.
In this blog post we will do just that: a short tutorial on how to train a state-of-the-art GNN (GemNet) [6] and run MD simulations with it. We then discuss why running actual MD is critical for GNN evaluation and why a regular, static test set is not enough. This final section is based on our recent MLST paper [3].
Train Your Own GemNet Potential
Training a GNN for MD is not much different from training a GNN for any other prediction task. The main difference is that we need to predict both the energy as a scalar regression target, and the forces. The forces are usually predicted as the negative gradient of the energy w.r.t. the positions, i.e. F = -dE/dx. Some models such as GemNet-dT predict forces directly to circumvent the overhead of doing two backpropagation passes. We do not recommend direct force predictions for MD simulations, since they can lead to instabilities during the simulation.
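To make the gradient-based force prediction concrete, here is a minimal sketch (not GemNet's actual internals; the model signature is a placeholder) of how forces can be obtained from a scalar energy prediction with PyTorch autograd:
import torch

def energy_and_forces(model, Z, R):
    # R: atomic positions of shape (num_atoms, 3), Z: atomic numbers
    R = R.clone().requires_grad_(True)  # positions need gradients
    E = model(Z, R)  # hypothetical energy model returning a scalar per structure
    # Forces as the negative gradient of the energy w.r.t. the positions, F = -dE/dR
    F = -torch.autograd.grad(E.sum(), R, create_graph=True)[0]
    return E, F
The create_graph=True flag keeps the computational graph so that a force loss can still be backpropagated during training.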
Since GNNs for MD are so similar to regular GNNs, training them is supported by many different libraries, such as PyG, DGL, TF-GNN, or the OC20 repository. In this post we will focus on the reference implementation of GemNet based on PyTorch and PyG. You can find it in the gemnet_pytorch GitHub repository, which provides a basic training notebook. This section is aligned with that notebook, so feel free to run along using the notebook!
We start by importing all required modules
import numpy as np
import yaml
import string
import ast
import random
import os  # needed below for the os.path.exists checkpoint check
import time
from datetime import datetime
from gemnet.model.gemnet import GemNet
from gemnet.training.trainer import Trainer
from gemnet.training.metrics import Metrics, BestMetrics
from gemnet.training.data_container import DataContainer
from gemnet.training.data_provider import DataProvider
import torch
from torch.utils.tensorboard import SummaryWriter
Then we load the configuration and set all values
with open('config.yaml', 'r') as c:
    config = yaml.safe_load(c)

# For strings that yaml doesn't parse (e.g. None)
for key, val in config.items():
    if type(val) is str:
        try:
            config[key] = ast.literal_eval(val)
        except (ValueError, SyntaxError):
            pass
num_spherical = config["num_spherical"]
num_radial = config["num_radial"]
...
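The ast.literal_eval pass converts values that YAML leaves as plain strings into Python objects, for example:
import ast

ast.literal_eval("None")       # -> None
ast.literal_eval("1e-3")       # -> 0.001 (scientific notation that PyYAML may keep as a string)
ast.literal_eval("[64, 128]")  # -> [64, 128]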
The notebook also sets up some logging and creates directories, but we won’t go into that boilerplate here. Next, we initialize the model with
model = GemNet(
    num_spherical=num_spherical,
    num_radial=num_radial,
    ...
)

# push to GPU if available
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model.to(device)
and load the datasets:
data_container = DataContainer(
    dataset, cutoff=cutoff, int_cutoff=int_cutoff, triplets_only=triplets_only
)
Here, dataset is the path to a NumPy .npz file containing the following five arrays:
N is an array containing the number of atoms in each sample
R is a concatenated sum(N)x3 array with the coordinates of each atom
Z is a concatenated sum(N) array with the atomic number of each atom
E contains the groundtruth energy of each sample
F is a concatenated sum(N)x3 array containing the groundtruth force acting on each atom
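To make this format concrete, here is a hypothetical toy example that writes such a file (the values are random placeholders; only the key names and shapes follow the list above):
import numpy as np

# Two toy structures with 3 and 2 atoms, respectively
N = np.array([3, 2])                 # number of atoms per sample
R = np.random.rand(N.sum(), 3)       # concatenated positions, shape (sum(N), 3)
Z = np.array([8, 1, 1, 6, 8])        # concatenated atomic numbers, shape (sum(N),)
E = np.random.rand(len(N))           # groundtruth energy per sample
F = np.random.rand(N.sum(), 3)       # groundtruth force per atom, shape (sum(N), 3)

np.savez_compressed("toy_dataset.npz", N=N, R=R, Z=Z, E=E, F=F)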
The dataset is loaded into a DataContainer that holds all the data and allows getting an arbitrary subset of samples. The DataProvider randomly samples batches from the container and converts these batches to a PyTorch (and TensorFlow) compatible form.
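The DataProvider is then built from the container. As a minimal sketch (the exact arguments are defined in the training notebook):
data_provider = DataProvider(
    data_container, num_train, num_val, batch_size, ...
)
# val_data_provider is set up analogously for the validation data
# (see the training notebook for the exact arguments).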
For easy data handling we obtain a generator by calling the .get_dataset function:
# Initialize datasets
train = {}
validation = {}
train["dataset_iter"] = data_provider.get_dataset("train")
validation["dataset_iter"] = val_data_provider.get_dataset("val")
We next initialize the Trainer, which handles the loss, optimizer, learning rate decay and warmup, gradient clipping (for stability during training), and exponential moving averages of the model weights (EMA)
trainer = Trainer(
    model,
    learning_rate=learning_rate,
    decay_steps=decay_steps,
    ...
)
We also initialize some metrics for tracking the training run
# Initialize metrics
train["metrics"] = Metrics("train", trainer.tracked_metrics)
validation["metrics"] = Metrics("val", trainer.tracked_metrics)
# Save/load best recorded loss (only the best model is saved)
metrics_best = BestMetrics(best_dir, validation["metrics"])
and load the latest checkpoint from disk, if it already exists
if os.path.exists(log_path_model):
    # Load model, trainer, and metrics
    model_checkpoint = torch.load(log_path_model)
    model.load_state_dict(model_checkpoint["model"])
    train_checkpoint = torch.load(log_path_training)
    trainer.load_state_dict(train_checkpoint["trainer"])
    metrics_best.restore()
else:
    # Initialize metrics (model and trainer are already initialized)
    metrics_best.inititalize()
Now we are set up to finally train the model. We removed some boilerplate in the following training loop. Note that the best model’s checkpoint is saved at best_path_model, while the latest (non-optimal) model weights and the additional trainer variables are saved at log_path_model and log_path_training. “Best” here refers to the model weights that achieved the best loss on the validation set.
for step in range(step_init + 1, num_steps + 1):
    # Perform training step
    trainer.train_on_batch(train["dataset_iter"], train["metrics"])

    # Save checkpoint
    if step % save_interval == 0:
        torch.save({"model": model.state_dict()}, log_path_model)
        torch.save(
            {"trainer": trainer.state_dict(), "step": step}, log_path_training
        )

    # Check performance on the validation set
    if step % evaluation_interval == 0:
        # Backup the regular variables and load averaged (EMA) variables
        trainer.save_variable_backups()
        trainer.load_averaged_variables()

        # Compute averages
        for i in range(int(np.ceil(num_val / batch_size))):
            trainer.test_on_batch(validation["dataset_iter"], validation["metrics"])

        # Update and save best result
        if validation["metrics"].loss < metrics_best.loss:
            metrics_best.update(step, validation["metrics"])
            torch.save(model.state_dict(), best_path_model)

        # decay learning rate on plateau
        trainer.decay_maybe(validation["metrics"].loss)

        # Restore real variables
        trainer.restore_variable_backups()

        # early stopping
        if step - metrics_best.step > patience * evaluation_interval:
            break
After some time we can get and print our results on the validation set:
result = {key + "_best": val for key, val in metrics_best.items()}
for key, val in metrics_best.items():
    print(f"{key}: {val}")
Run MD Simulations with GemNet
Now that we have our model, we can get to work and run MD simulations with it! The gemnet_pytorch GitHub repository offers an ASE calculator [4] (GNNCalculator) to evaluate energies and forces with GemNet potentials. It furthermore provides the machinery required to perform molecular dynamics simulations (MDSimulator), either in an NVE (microcanonical) or an NVT (canonical) ensemble. Conveniently, the repository also contains pretrained models and an example notebook (ase_example.ipynb) for propagating molecules over time, which we will walk through next.
First, we import all necessary modules
from gemnet.model.gemnet import GemNet
from gemnet.model.utils import read_json
import ase.io
from ase_calculator import Molecule, MDSimulator, GNNCalculator
from ase.optimize import BFGS
# for visualization
from ase.io.trajectory import Trajectory
import nglview
Next, we define all model settings and load a pre-trained GemNet model. This model can be either your own potential or the model stored in the gemnet_pytorch repository.
model_name = "GemNet-Q" # or "GemNet-T"
pretrained_models_path = "./pretrained"
weights_file = f"{pretrained_models_path}/{model_name}/model.pth
model_kwargs = read_json(f"{pretrained_models_path}/{model_name}/model_kwargs.json")
model_kwargs["scale_file"] = f"{pretrained_models_path}/" + model_kwargs["scale_file"]
# Load the weights
model = GemNet(**model_kwargs)
model.load_weights(weights_file)
After loading the weights we next have to define an initial molecular geometry. To this end, we load a molecular configuration from a .xyz file using ASE. This file could e.g. stem from a database such as QM7x [5].
# Define the molecule
mol = ase.io.read("test_mol.xyz")
We can directly use the molecular geometry as an initial configuration for the MD simulation. However, this configuration might not be in equilibrium, so it might be good to first optimize it using the GemNet potential. To do this, we define an instance of the Molecule class with atomic positions R and atomic numbers Z as inputs. The Molecule class also takes GemNet-specific inputs like the cutoff, the int_cutoff and the triplets_only parameters. For more details on these latter parameters, see the documentation on Github.
# Optimize the geometry
R = mol.get_positions()
Z = mol.get_atomic_numbers()
cutoff = model_kwargs["cutoff"]
int_cutoff = model_kwargs["int_cutoff"]
triplets_only = model_kwargs["triplets_only"]
molecule = Molecule(
    R, Z, cutoff=cutoff, int_cutoff=int_cutoff, triplets_only=triplets_only
)
Next, we define the settings for the geometry optimization and use the BFGS optimizer provided by ASE.
# Settings for optimization
traj_path = "./opt.traj"
logfile = "-" # “-” refers to standard output.
max_steps = 200 # Maximum number of simulation steps.
fmax = 0.05 # Convergence criterion
# Run the optimization
mol.calc = GNNCalculator(molecule, model=model, atoms=mol)
dyn = BFGS(mol, trajectory=traj_path)
dyn.run(fmax=fmax, steps=max_steps)
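As a quick sanity check, we can compute the remaining maximum force norm per atom, which is the quantity the fmax criterion refers to (a small sketch using standard ASE calls):
import numpy as np

# Maximum per-atom force norm after the optimization (should be below fmax)
forces = mol.get_forces()
fmax_final = np.sqrt((forces ** 2).sum(axis=1)).max()
print(f"Maximum force norm: {fmax_final:.4f} (ASE units, typically eV/Å)")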
The last image of the dumped trajectory corresponds to a local minimum geometry. We can again use ASE to load this configuration. The index “-1” indicates that the last image of the trajectory is loaded. For that configuration, we define a new molecule instance.
# Read the last image of the trajectory
mol_opt = ase.io.read(f"{traj_path}@-1")
R = mol_opt.get_positions()
Z = mol_opt.get_atomic_numbers()
molecule = Molecule(
    R, Z, cutoff=cutoff, int_cutoff=int_cutoff, triplets_only=triplets_only
)
Then we define the MD settings and start the simulation. We use the MDSimulator class from the gemnet_pytorch repository, which is a wrapper of the ase.md module. Note that we can either perform simulations in the NVE ensemble (setting the keyword dynamics="verlet") or in the NVT ensemble using a Langevin thermostat (setting the keyword dynamics="langevin"). It should be fairly straightforward to extend the GNNCalculator to other ASE thermostats as well.
traj_path = "./md_sim.traj"
logfile = "-" # “-” refers to standard output.
dynamics = "langevin" # Name of the MD integrator. Implemented: 'langevin' or 'verlet'.
max_steps = 10 # Maximum number of simulation steps.
time = 0.5 # Integration time step for Newton's law in femtoseconds.
interval = 2 # Write only every <interval> time step to trajectory file.
temperature = 1500 # The temperature in Kelvin.
langevin_friction = 0.002 # Friction coefficient (only used when dynamics is langevin)
simulation = MDSimulator(
    molecule, model,
    dynamics=dynamics, max_steps=max_steps, time=time, temperature=temperature,
    langevin_friction=langevin_friction, interval=interval, traj_path=traj_path,
    logfile=logfile,
)
simulation.run()
We can visualize the overall MD trajectory by running the following commands:
traj = Trajectory(traj_path)
nglview.show_asetraj(traj)
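A quick way to monitor the simulation is to print the instantaneous temperature of each stored frame with standard ASE calls (assuming velocities/momenta were written to the trajectory):
# Instantaneous temperature per stored frame
for i, atoms in enumerate(traj):
    print(f"Frame {i}: {atoms.get_temperature():.1f} K")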
Note that a restart of the MD simulation is also possible by setting the keyword vel to the velocities of the last image from the MD trajectory.
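As a hypothetical restart sketch (assuming the vel keyword works as described; check the repository documentation for the exact interface), the velocities can be read from the last image of the previous run:
# Read the last image of the previous MD run and reuse its velocities
last = ase.io.read(f"{traj_path}@-1")
vel = last.get_velocities()

# Rebuild the molecule from the last image; the velocities are then passed
# via the vel keyword mentioned above when setting up the next simulation.
molecule = Molecule(
    last.get_positions(), last.get_atomic_numbers(),
    cutoff=cutoff, int_cutoff=int_cutoff, triplets_only=triplets_only
)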
Testing GNNs for Molecular Dynamics
So why should you bother with running MD simulations when you just want to develop a GNN? After all, most of the current literature mainly tests GNNs on static benchmarks. To see the difference between a static test set and a dynamic MD simulation, let us look at our latest MLST paper [3]. In this paper, we explore the robustness of highly accurate GNNs in long and hot MD simulations of organic molecules. And we find that good results on the static reference dataset are often not enough.
As a first step, we trained GemNet [6] on varying subsets of the QM7x [5] database and evaluated force predictions on a test set. As expected, we obtained very low mean absolute errors (MAE) of around 5 meV/Å, outperforming previous models such as SpookyNet by ~3x. Interestingly, this was possible with ~13x less training data (320k molecules) and we did not see much improvement when increasing the data to 3.2M (see Fig. 1). However, the MAE on a test set is not the same as testing robustness in MD simulations.
So in the next step we directly ran MD simulations, stress-testing the models with extreme simulation conditions, i.e. long simulation times and high temperatures. Despite good results on the original test set, GemNet models trained on small training sets (up to 32k molecules) performed poorly and showed unphysical distortions even at low temperatures (see Fig. 2). In contrast, GemNet models trained on larger training sets remained very accurate and stable, despite the long and hot MD simulations. This was even the case for unseen molecules much larger than those in the training set. The main errors we observed for these molecules were systematic and most likely explainable by unlearned long-range interactions in these out-of-distribution (OOD) molecules. It should be straightforward to learn these interactions or add them directly to the model. Overall, our results demonstrate that GNNs should be benchmarked in real MD simulations, since a low test set error per se does not guarantee stable dynamics. At the same time, we can obtain very accurate and stable potentials with large training sets. Replacing quantum mechanical methods in MD simulations indeed seems to be within reach!
Now it’s your turn! Train your own models and use the resulting potentials in MD simulations. We also encourage you to read our MLST paper [3] to see our studies in full detail.
About the Authors
Sina Stocker
Sina Stocker is a member of the Theory Department at the Fritz Haber Institute (FHI) in Berlin led by Karsten Reuter. She has a background in computational chemistry and did her PhD on chemical and molecular machine learning in the group of Johannes Margraf (@TU Munich and FHI). As a PostDoc, her current focus is on the development of efficient data-driven approaches for the design of new and improved catalysts.
Johannes Gasteiger
Johannes Gasteiger is a research scientist at Google Research. He investigated how to leverage geometry and structure in GNNs for his PhD thesis in Stephan Günnemann's group at TU Munich. His current research is focused on the safety, interpretability, and groundedness of advanced ML models.
References
[1] V. L. Deringer et al. Nature 589 59 (2021).
[2] D. Frenkel and B. Smit, Understanding Molecular Simulation: From Algorithms to Applications (Academic Press, 1996).
[3] S. Stocker, J. Gasteiger, F. Becker, S. Günnemann and J. T. Margraf. How robust are modern graph neural network potentials in long and hot molecular dynamics simulations? Mach. Learn.: Sci. Technol. 3 045010 (2022), DOI: https://doi.org/10.1088/2632-2153/ac9955.
[4] A. H. Larsen et al. J. Phys.: Condens. Matter 29 273002 (2017).
[5] J. Hoja et al. Sci. Data 8 43 (2021).
[6] J. Gasteiger, F. Becker and S. Günnemann, in NeurIPS (2021).