Now that AlphaFold and ESMFold have made protein structure prediction easy, people have predicted the structure of anything they can get their hands on, producing AlphaFold DB (>200M proteins) and the ESM Metagenomic Atlas (772M metagenomic proteins).
The researchers in this study were interested in whether physicochemical descriptors of structural models (e.g. geometric measures of packing density, aggregation propensity, etc.) were predictive of in vivo behaviour. They calculated ~70 structural features for ~500,000 AlphaFold2 (AF2) models from 48 model organisms and looked at how these properties varied at different scales. They claim that there were enough systematic differences between organisms that you could reconstruct the tree of life from these descriptors of the predicted structures alone. This suggests that, at least for this one purpose, the representations learned by AF2 and the corresponding structures reflect something characteristic of the actual organism.
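To make the tree-from-descriptors idea concrete, here is a minimal sketch of one way it could work: average each feature over an organism's proteins, standardize the features across organisms, then cluster the organisms hierarchically. This is not the study's pipeline. The average-linkage (UPGMA) clustering, the Euclidean distance, the function names, and the toy organisms and two-feature values are all assumptions made up for illustration.

```python
import math
from statistics import mean, pstdev


def organism_profiles(features_by_organism):
    """Average each feature over all proteins of an organism."""
    return {
        org: [mean(col) for col in zip(*rows)]
        for org, rows in features_by_organism.items()
    }


def standardize(profiles):
    """Z-score each feature across organisms so no single scale dominates."""
    orgs = list(profiles)
    cols = list(zip(*(profiles[o] for o in orgs)))
    mus = [mean(c) for c in cols]
    sds = [pstdev(c) or 1.0 for c in cols]  # guard against constant features
    return {
        o: [(v - m) / s for v, m, s in zip(profiles[o], mus, sds)]
        for o in orgs
    }


def upgma(profiles):
    """Average-linkage agglomerative clustering; returns a nested tuple tree."""
    # Each cluster is (subtree, list of member organism names).
    clusters = [(o, [o]) for o in profiles]

    def dist(a, b):
        # Mean pairwise Euclidean distance between members of two clusters.
        return mean(
            math.dist(profiles[x], profiles[y]) for x in a[1] for y in b[1]
        )

    while len(clusters) > 1:
        pairs = (
            (i, j)
            for i in range(len(clusters))
            for j in range(i + 1, len(clusters))
        )
        i, j = min(pairs, key=lambda ij: dist(clusters[ij[0]], clusters[ij[1]]))
        a, b = clusters[i], clusters[j]
        merged = ((a[0], b[0]), a[1] + b[1])
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return clusters[0][0]


def leaf_names(tree):
    """Collect the organism names under a subtree."""
    if isinstance(tree, str):
        return {tree}
    return leaf_names(tree[0]) | leaf_names(tree[1])


# Hypothetical toy data: per-protein [packing density, aggregation score].
toy = {
    "bacterium_A": [[0.70, 0.10], [0.72, 0.12]],
    "bacterium_B": [[0.71, 0.11], [0.73, 0.13]],
    "mammal_A": [[0.60, 0.30], [0.62, 0.32]],
    "mammal_B": [[0.61, 0.31], [0.63, 0.33]],
}

tree = upgma(standardize(organism_profiles(toy)))
print(tree)
```

On this toy input the two bacteria end up in one subtree and the two mammals in the other. A real analysis would also have to compare the resulting tree against an accepted phylogeny, which this sketch does not attempt.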