Mon, Dec 18, 4:00pm

SE(3)-Stochastic Flow Matching for Protein Backbone Generation

Abstract: The computational design of novel protein structures has the potential to greatly impact numerous scientific disciplines. Toward this goal, we introduce FoldFlow, a series of novel generative models of increasing modeling power based on the flow-matching paradigm over 3D rigid motions (i.e., the group SE(3)), enabling accurate modeling of protein backbones. We first introduce FoldFlow-Base, a simulation-free approach to learning deterministic continuous-time dynamics and matching invariant target distributions on SE(3). We next accelerate training by incorporating Riemannian optimal transport to create FoldFlow-OT, leading to the construction of simpler and more stable flows. Finally, we design FoldFlow-SFM, coupling both Riemannian OT and simulation-free training to learn stochastic continuous-time dynamics over SE(3). Our family of FoldFlow generative models offers several key advantages over previous approaches to the generative modeling of proteins: they are more stable and faster to train than diffusion-based approaches, and they can map any invariant source distribution to any invariant target distribution over SE(3). Empirically, we validate our FoldFlow models on protein backbone generation of up to 300 amino acids, leading to high-quality, designable, diverse, and novel samples.
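
To make the "simulation-free" idea concrete, here is a minimal sketch of how a training point on SE(3) might be produced without integrating any dynamics: a source frame and a target frame are joined by a geodesic on SO(3) and a straight line in R^3, and the point at time t is read off in closed form. This is a generic illustration of flow matching on rigid motions, not the FoldFlow implementation. The function names (`so3_exp`, `so3_log`, `se3_interpolate`) are hypothetical, and the choice of interpolant is an assumption.

```python
import math

def matmul(a, b):
    """Product of two 3x3 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def transpose(a):
    """Transpose of a 3x3 matrix (the inverse, for a rotation)."""
    return [[a[j][i] for j in range(3)] for i in range(3)]

def so3_exp(w):
    """Rodrigues' formula: axis-angle vector w -> rotation matrix."""
    theta = math.sqrt(sum(x * x for x in w))
    K = [[0.0, -w[2], w[1]], [w[2], 0.0, -w[0]], [-w[1], w[0], 0.0]]
    if theta < 1e-8:
        a, b = 1.0, 0.5  # small-angle limits of the two coefficients
    else:
        a = math.sin(theta) / theta
        b = (1.0 - math.cos(theta)) / theta ** 2
    K2 = matmul(K, K)
    return [[(1.0 if i == j else 0.0) + a * K[i][j] + b * K2[i][j]
             for j in range(3)] for i in range(3)]

def so3_log(R):
    """Rotation matrix -> axis-angle vector (assumes angle < pi)."""
    c = max(-1.0, min(1.0, (R[0][0] + R[1][1] + R[2][2] - 1.0) / 2.0))
    theta = math.acos(c)
    v = [R[2][1] - R[1][2], R[0][2] - R[2][0], R[1][0] - R[0][1]]
    s = 0.5 if theta < 1e-8 else theta / (2.0 * math.sin(theta))
    return [s * x for x in v]

def se3_interpolate(R0, x0, R1, x1, t):
    """Point at time t on the path from frame (R0, x0) to frame (R1, x1).

    Rotation: geodesic R_t = R0 * exp(t * log(R0^T R1)).
    Translation: straight line x_t = (1 - t) * x0 + t * x1.
    """
    w = so3_log(matmul(transpose(R0), R1))
    Rt = matmul(R0, so3_exp([t * x for x in w]))
    xt = [(1.0 - t) * a + t * b for a, b in zip(x0, x1)]
    return Rt, xt

# Example: halfway between the identity frame and a frame rotated
# 90 degrees about z and shifted 2 units along x.
R0 = so3_exp([0.0, 0.0, 0.0])
R1 = so3_exp([0.0, 0.0, math.pi / 2])
Rt, xt = se3_interpolate(R0, [0.0, 0.0, 0.0], R1, [2.0, 0.0, 0.0], 0.5)
```

In a flow-matching setup of this kind, a network would be regressed onto the velocity of such paths at randomly sampled times, so no ODE or SDE needs to be simulated during training. How the source and target frames are paired (for example, via optimal transport) is a separate design choice that this sketch does not cover.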

Previous Talks