The Statistical Structure of Identifiable Generative Models
Wed, Nov 22, 4:00pm

Abstract: A major goal of causal representation learning is to process unstructured low-level data into high-level causal units, to which we can apply our standard toolboxes to make inferences. The assumption that such high-level factors are probabilistically realized, but merely unobserved, underlies the motivation to fit and perform inference over generative models for causal representation learning. This talk will discuss the identifiability of such models, a foundational property that describes whether inferences can be made uniquely, or at least up to tolerable ambiguities, based on observations and suitable assumptions. Significant progress has been made recently in exploring specific assumptions and techniques for identifiability---I will take a step back and attempt to examine the statistical structure of the identifiability problem itself. Specifically, I will introduce statistical modelling and identifiability from the ground up, and apply this framework to analyze generative models as a class of statistical models, obtaining generic identification results that describe the properties of the identifiability problem without assuming any specific model. Along the way, I will also discuss some historical notes from factor analysis and ICA, why these approaches differ in their definition of "strong" identifiability, and how non-linear generators can in fact admit perfectly unique solutions simply by fixing multiple latent distributions, in a modern take on factor models.