Convolutions are competitive with transformers for protein sequence pretraining

Protein language models typically use the transformer architecture, which is powerful but scales poorly with sequence length: the cost of self-attention grows quadratically in the number of residues. This study asks whether convolutional neural networks (CNNs), whose cost grows linearly with sequence length (an idea similar in spirit to Hyena and HyenaDNA), can serve as an effective alternative for protein sequence pretraining. The results suggest that CNNs are competitive with transformers across downstream applications.
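
To make the scaling argument concrete, here is a minimal PyTorch sketch of a dilated 1D CNN encoder for masked protein-language-model pretraining, in the spirit of the model studied here. The vocabulary size, widths, dilation schedule, and layer count are illustrative assumptions rather than the paper's actual configuration; the point is that every layer's cost grows linearly with sequence length, unlike self-attention.

```python
# Minimal sketch (not the paper's exact model): a dilated 1D CNN encoder for
# masked protein-language-model pretraining. Vocabulary size, widths, and layer
# counts below are illustrative assumptions, not values from the paper.
import torch
import torch.nn as nn

VOCAB_SIZE = 25   # ~20 amino acids plus special tokens (assumed)
D_MODEL = 128     # embedding width (assumed)

class DilatedConvBlock(nn.Module):
    """Residual block: LayerNorm -> dilated Conv1d -> GELU -> pointwise Conv1d."""
    def __init__(self, d_model: int, dilation: int, kernel_size: int = 5):
        super().__init__()
        padding = dilation * (kernel_size - 1) // 2  # keep sequence length fixed
        self.norm = nn.LayerNorm(d_model)
        self.conv = nn.Conv1d(d_model, d_model, kernel_size,
                              padding=padding, dilation=dilation)
        self.proj = nn.Conv1d(d_model, d_model, kernel_size=1)
        self.act = nn.GELU()

    def forward(self, x):                  # x: (batch, length, d_model)
        h = self.norm(x).transpose(1, 2)   # Conv1d expects (batch, channels, length)
        h = self.proj(self.act(self.conv(h))).transpose(1, 2)
        return x + h                       # residual connection

class ConvProteinLM(nn.Module):
    """Stack of dilated conv blocks; cost is O(length), not O(length^2)."""
    def __init__(self, n_blocks: int = 6):
        super().__init__()
        self.embed = nn.Embedding(VOCAB_SIZE, D_MODEL)
        # Exponentially increasing dilation widens the receptive field cheaply.
        self.blocks = nn.ModuleList(
            DilatedConvBlock(D_MODEL, dilation=2 ** i) for i in range(n_blocks)
        )
        self.lm_head = nn.Linear(D_MODEL, VOCAB_SIZE)  # predict masked residues

    def forward(self, tokens):             # tokens: (batch, length) int64
        x = self.embed(tokens)
        for block in self.blocks:
            x = block(x)
        return self.lm_head(x)             # (batch, length, vocab) logits

if __name__ == "__main__":
    model = ConvProteinLM()
    tokens = torch.randint(0, VOCAB_SIZE, (2, 512))   # two sequences of 512 residues
    logits = model(tokens)
    print(logits.shape)                    # torch.Size([2, 512, 25])
```

The exponentially increasing dilations are one common way for a convolutional stack to cover long-range context without attention: the receptive field grows geometrically with depth while the per-token compute stays constant.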