🤔 Are our novel Spatio-Spectral GNNs (S²GNNs) the better alternative to graph transformers?

Simon Geisler · PhD Student @ TU Munich

➕ S²GNNs (https://lnkd.in/dbrCTuxH; joint work with Simon Geisler, Daniel Herbst & Stephan Günnemann) generalize the spatial + FFT-convolution/global operation of sequence models like H3/Hyena/Mamba to graphs: an effective modeling paradigm built on the synergy of spatially and spectrally parametrized graph convolutions.
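For intuition, here is a minimal sketch (PyTorch; the function name and shapes are my own, not from the paper) of the FFT-based global convolution in H3/Hyena-style sequence blocks; on graphs, S²GNNs swap the Fourier basis for (partial) Laplacian eigenvectors:

```python
import torch

def fft_global_conv(x: torch.Tensor, h: torch.Tensor) -> torch.Tensor:
    """Circular global convolution of a sequence via FFT, O(n log n).

    x: [seq_len, dim] input signal; h: [seq_len, dim] learned long filter.
    Multiplication in the frequency domain == convolution in the sequence domain.
    """
    n = x.shape[0]
    return torch.fft.irfft(
        torch.fft.rfft(x, dim=0) * torch.fft.rfft(h, dim=0), n=n, dim=0
    )

# Toy usage: every output position depends on every input position
x = torch.randn(16, 4)
h = torch.randn(16, 4)
y = fft_global_conv(x, h)  # [16, 4]
```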

🚀 S²GNNs are effective (and efficient). E.g., S²GCN outperforms graph transformers, MPGNNs, and graph rewirings on the peptides-func long-range benchmark (Dwivedi et al., 2022) by a comfortable margin while staying roughly 40% below the 500k parameter budget.

💡 To achieve this, S²GNNs compose your favorite (spatial) Message Passing GNN (MPGNN) with explicitly spectrally parametrized filters ("spectral filters" for short). This pairing is efficient since S²GNNs require only a partial eigenvalue decomposition (EVD). Even though the spectral filter operates on a truncated eigendecomposition (i.e., it is spectrally bounded), it is spatially unbounded. Conversely, spatial MPGNNs are spatially bounded yet spectrally unbounded. Thus, S²GNNs combine the best of both worlds.
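A minimal sketch of this composition (PyTorch; class and argument names like `S2Layer`, `k`, and the `mpgnn(x, edge_index)` call signature are my assumptions, not the authors' API):

```python
import torch
import torch.nn as nn

class SpectralFilter(nn.Module):
    """Explicitly spectrally parametrized filter on a partial EVD.

    The filter response h(lambda) is an MLP over the k smallest Laplacian
    eigenvalues -- one simple parametrization; the paper explores richer ones.
    """
    def __init__(self, dim: int, hidden: int = 16):
        super().__init__()
        self.filter_mlp = nn.Sequential(
            nn.Linear(1, hidden), nn.GELU(), nn.Linear(hidden, dim)
        )

    def forward(self, x, eigvecs, eigvals):
        # x: [n, dim], eigvecs: [n, k] partial EVD of the Laplacian, eigvals: [k]
        gains = self.filter_mlp(eigvals.unsqueeze(-1))  # [k, dim] response h(lambda)
        x_hat = eigvecs.T @ x                           # to spectral domain: [k, dim]
        return eigvecs @ (gains * x_hat)                # back to nodes: spatially unbounded

class S2Layer(nn.Module):
    """Spatial MPGNN branch + spectral branch; summing them is one option."""
    def __init__(self, mpgnn: nn.Module, dim: int):
        super().__init__()
        self.mpgnn = mpgnn                  # any message-passing layer
        self.spectral = SpectralFilter(dim)

    def forward(self, x, edge_index, eigvecs, eigvals):
        # Spatially bounded (message passing) + spectrally bounded (filter) branches
        return self.mpgnn(x, edge_index) + self.spectral(x, eigvecs, eigvals)
```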

🔓 By treating spectral filters as GNN building blocks, we unlock interesting design spaces: (1) the filter parametrization, (2) the first permutation-equivariance-preserving neural network in the spectral domain, and (3) the first spectral filter for directed graphs. Moreover, we identify a dual use for the partial eigendecomposition that spectral filters already require: we propose stable positional encodings that are essentially free for S²GNNs and make them strictly more expressive than the 1-Weisfeiler-Lehman (1-WL) test.
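To illustrate the dual-use idea only (the paper's stable construction handles eigenvector sign/basis ambiguities properly; this naive variant, with a name of my choosing, just drops signs):

```python
import torch

def naive_spectral_pe(eigvecs: torch.Tensor, eigvals: torch.Tensor) -> torch.Tensor:
    """Node positional encodings reusing an already-computed partial EVD.

    eigvecs: [n, k], eigvals: [k]. Taking |U| crudely removes the eigenvector
    sign ambiguity; broadcasting the eigenvalues appends frequency information.
    """
    n = eigvecs.shape[0]
    return torch.cat(
        [eigvecs.abs(), eigvals.unsqueeze(0).expand(n, -1)], dim=-1
    )  # [n, 2k], essentially free given the EVD the spectral filter needs anyway
```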

🏔️ Beyond long-range tasks on graphs (see above), we show that S²GNNs keep up with state-of-the-art sequence models on associative recall, a mechanistic in-context learning task. Moreover, we demonstrate the practicality of S²GNNs on large graphs like OGB arXiv/products and TPUGraphs.

⚛️ Side note for the ML-for-molecules folks: this generalizes the idea of Ewald message passing (or, more broadly, Ewald summation) to molecular graphs without 3D positional labels! 🙂

Twitter: https://x.com/geisler_si/status/1796571313945170384
Paper: https://arxiv.org/abs/2405.19121