Can deep learning models trained to predict protein structure also help predict the effects of mutations on protein-protein interactions?

Protein-protein interactions (PPIs) are a vital component of the cellular language, mediating communication within and between cells. There are two key aspects to these interactions: (1) the way two proteins interact, commonly determined by solving the crystal structure of the protein complex, and (2) the strength of the interaction, typically reported as the binding free energy (∆G) or, equivalently, as the binding affinity expressed through the dissociation constant (Kd). In antibody drug discovery, a primary optimization goal is to enhance affinity toward desired targets (affinity maturation) while reducing affinity toward undesired targets. For example, a broad-spectrum neutralizing antibody drug should bind strongly to prevalent variants of COVID-19 virus proteins to prevent immune escape, yet it should not be polyreactive, meaning it should avoid binding to unintended proteins.
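To make the link between the two measures of binding strength concrete, here is a minimal sketch that converts a dissociation constant into a binding free energy using the standard relation ∆G = RT ln(Kd / 1 M), and computes the change upon mutation (∆∆G). The function names, the default temperature of 298.15 K, and the sign convention (positive ∆∆G means weaker binding) are assumptions chosen for illustration, not details taken from the preprint.

```python
import math

# Gas constant in kcal/(mol*K).
R_KCAL = 1.987204e-3


def binding_free_energy(kd_molar, temperature_k=298.15):
    """Return the binding free energy in kcal/mol for a Kd given in mol/L.

    Uses dG = RT * ln(Kd / 1 M); tighter binding (smaller Kd) gives a more
    negative value. The default temperature is an illustrative assumption.
    """
    if kd_molar <= 0:
        raise ValueError("Kd must be positive")
    return R_KCAL * temperature_k * math.log(kd_molar)


def mutation_ddg(kd_wild_type, kd_mutant, temperature_k=298.15):
    """Return ddG = dG(mutant) - dG(wild type) in kcal/mol.

    With this (assumed) sign convention, a positive value means the
    mutation weakens binding.
    """
    return (binding_free_energy(kd_mutant, temperature_k)
            - binding_free_energy(kd_wild_type, temperature_k))


if __name__ == "__main__":
    # Hypothetical example: a 1 nM binder that a mutation weakens to 10 nM.
    print(round(binding_free_energy(1e-9), 2))
    print(round(mutation_ddg(1e-9, 1e-8), 2))
```

A tenfold loss in affinity corresponds to RT ln(10), a bit under 1.4 kcal/mol at room temperature, which is a handy rule of thumb when reading ∆∆G values.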

Computationally, AlphaFold3 has set a new benchmark in predicting the structures of protein complexes, addressing the first aspect of PPIs by showing how two proteins interact structurally. However, AlphaFold3 produces only static structures and does not directly address the second aspect, the strength of interaction. From a theoretical biophysics viewpoint, the strength of interaction relates to the rates at which one protein dynamically associates with and dissociates from another. A stronger interaction means a higher rate of association or a lower rate of dissociation. It also implies that, if we have, say, a billion copies of such a protein complex in a tube, a higher proportion will be in the bound (associated) state at equilibrium. (If you are interested in learning more about statistical physics, you can look up ‘ergodicity’, which means that the time average of a physical quantity equals its ensemble average for a system in equilibrium.) Therefore, if a model could produce an ensemble of protein structures that mimics the real proportion of bound and unbound states, the strength of interaction, or binding affinity, could be inferred. This connection between structural ensembles and binding affinity leads us to hypothesize that the ranking score produced by AlphaFold3, an indicator of confidence in its structural predictions, may be sensitive to mutations and correlate with the proportion of bound states in the ensemble and, therefore, with binding affinity.
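The reasoning above can be written out for the simplest case. The relations below assume a two-state, 1:1 binding model between proteins A and B at equilibrium, which is a textbook simplification and not something specific to AlphaFold3 or to our benchmark. Here [B] is the free concentration of the partner, k_on and k_off are the association and dissociation rate constants, and c° is the 1 M standard concentration.

```latex
\begin{align}
K_d &= \frac{k_\text{off}}{k_\text{on}} = \frac{[A][B]}{[AB]} \\
f_\text{bound} &= \frac{[AB]}{[A] + [AB]} = \frac{[B]}{[B] + K_d} \\
\Delta G &= RT \ln \frac{K_d}{c^\circ}
\end{align}
```

Under these assumptions, a faster association or slower dissociation lowers K_d, which raises the bound fraction at a given [B] and makes ∆G more negative. This is the sense in which an ensemble with the right proportion of bound states would carry the binding affinity.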

In our recent preprint, we tested this hypothesis by benchmarking AlphaFold3 against SKEMPI, a commonly used binding energy dataset. We demonstrate that AlphaFold3 learns unique information that synergizes with force field, profile-based, and other deep learning methods in predicting the mutational effects on protein-protein interactions. A simple ensemble that combines AlphaFold3’s ranking score with each baseline method boosts performance across all baselines.

The ensemble score is computed by summing the ranked scores of the two models with equal weights. Notably, the previous state-of-the-art method, SSIPe, which is itself a combination of models, also gets a performance boost. This suggests that AlphaFold3 brings a new perspective, with information that other methods do not provide, thereby improving the estimation of mutation effects on protein-protein interactions.
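As an illustration of this kind of equally weighted rank ensemble, here is a minimal pure-Python sketch. It reflects one reading of the description above: each model's scores are converted to ranks across the same set of mutations, and the two ranks are averaged (equivalent to an equally weighted sum for ranking purposes). The function names and the tie handling (tied values share their mean rank) are assumptions, and both score lists are assumed to be oriented the same way, for example with higher values meaning weaker binding, so one of them may need a sign flip first.

```python
def average_ranks(values):
    """Rank values in ascending order starting at 1.

    Tied values all receive the mean of the ranks they span.
    """
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        # Extend j to cover the run of values tied with position i.
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def rank_ensemble(scores_a, scores_b):
    """Combine two score lists by averaging their ranks with equal weights."""
    if len(scores_a) != len(scores_b):
        raise ValueError("both models must score the same mutations")
    ranks_a = average_ranks(scores_a)
    ranks_b = average_ranks(scores_b)
    return [0.5 * (ra + rb) for ra, rb in zip(ranks_a, ranks_b)]


if __name__ == "__main__":
    # Hypothetical scores from two models on four mutations.
    model_a = [0.2, 1.5, 0.7, 3.0]
    model_b = [10.0, 30.0, 40.0, 20.0]
    print(rank_ensemble(model_a, model_b))
```

Working with ranks rather than raw scores sidesteps the fact that the two models report values on unrelated scales, which matters when one score is a structural confidence and the other is an energy.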

If AlphaFold3 is really learning complementary information, its predictions should be only weakly correlated with those from other methods. To test that, we computed the pairwise correlation among all methods, as shown on the left of Fig. 2. AlphaFold3 exhibits very weak correlations with other models, showing only a slight correlation with DSMBind. In contrast, other models, such as FlexddG and SSIPe, correlate with many other methods. This pattern indicates that AlphaFold3 learns unique features that are orthogonal to those of other methods. As shown on the right side of Fig. 2, protein language models, AlphaFold2, and strain do not provide additional information beyond what AlphaFold3 provides. Conversely, structure-based deep learning, as well as force field and profile-based methods, enhance the predictions made by AlphaFold3.
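A pairwise comparison like this can be sketched in a few lines. The example below uses the Pearson correlation, which is an assumption on our part for illustration (a rank correlation would follow the same pattern), and the method names in the usage example are placeholders rather than real benchmark outputs.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Return the Pearson correlation coefficient between two score lists."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("inputs must have the same length, at least 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


def pairwise_correlations(predictions):
    """Correlate every pair of methods.

    `predictions` maps a method name to its scores over the same
    mutations, in the same order. Returns {(name_a, name_b): r}.
    """
    names = sorted(predictions)
    return {
        (a, b): pearson(predictions[a], predictions[b])
        for a, b in combinations(names, 2)
    }


if __name__ == "__main__":
    # Placeholder scores for three hypothetical methods on four mutations.
    toy = {
        "method_a": [1.0, 2.0, 3.0, 4.0],
        "method_b": [2.0, 4.0, 6.0, 8.0],
        "method_c": [4.0, 3.0, 2.0, 1.0],
    }
    for pair, r in pairwise_correlations(toy).items():
        print(pair, round(r, 3))
```

A method whose row in such a matrix is close to zero against everything else is a natural candidate for ensembling, since it is less likely to repeat the errors the other methods already make.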

We think that AlphaFold3 captures a more global effect of mutations by learning a smoother energy landscape. However, it lacks the detailed atomic modeling provided by force field methods, which have a more rugged energy landscape. Integrating both approaches could be a promising future direction.

Our study highlights the unique value of learning from structure prediction when predicting the strength of binding interactions, going beyond the more limited binding energy data. This connection, and the future integration of these heterogeneous data sources, could help us understand the full picture of protein-protein interactions. If you're interested in learning more, check out our preprint on bioRxiv!