MSA Transformer can fill in masked amino acids in multiple sequence alignments (MSAs) using the surrounding context.
This study suggests that this ability allows MSA Transformer to encode coevolution between functionally or structurally coupled amino acids within and across protein chains. The authors introduce a method, DiffPALM, that exploits these properties of MSA Transformer to generate paired alignments for paralogs (genes that arise from a duplication event and whose proteins have overlapping or redundant functions).
Feeding these paired alignments into AlphaFold-Multimer substantially improves structure prediction for some complexes, they say.
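To make the pairing problem concrete, here is a minimal sketch of choosing a one-to-one pairing between the paralogs of two protein families within a single species. It is not DiffPALM: the summary above does not describe how the method searches over pairings, so this example swaps in a brute-force search over permutations, which is only practical for a handful of sequences. The `cost` matrix is hypothetical and stands in for a score such as a masked-prediction loss, where a lower value means the two sequences fit together better.

```python
from itertools import permutations


def best_pairing(cost):
    """Return (assignment, total) minimizing the summed pairing cost.

    cost[i][j] is a hypothetical score for pairing paralog i of family A
    with paralog j of family B (lower is better). assignment[i] is the
    index in family B chosen as the partner of paralog i in family A.
    """
    n = len(cost)
    best_perm, best_total = None, float("inf")
    # Exhaustive search: try every one-to-one matching (n! candidates).
    for perm in permutations(range(n)):
        total = sum(cost[i][perm[i]] for i in range(n))
        if total < best_total:
            best_perm, best_total = perm, total
    return list(best_perm), best_total


# Made-up scores for three paralogs per family in one species.
cost = [
    [0.2, 1.5, 1.1],
    [1.4, 0.3, 1.2],
    [1.0, 1.3, 0.1],
]

pairing, total = best_pairing(cost)
print(pairing, round(total, 3))
```

In this toy matrix the lowest-cost matching pairs each paralog in family A with the same-index paralog in family B. With real data the scores would come from a model rather than being written by hand, and an exhaustive search would need to be replaced by something that scales.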
Pairing interacting protein sequences using masked language modeling