Wed, Jun 4, 3:00pm

ECLARE: multi-teacher contrastive learning via ensemble distillation for diagonal integration of single-cell multi-omic data

Google Meet

Paper: ECLARE: multi-teacher contrastive learning via ensemble distillation for diagonal integration of single-cell multi-omic data

https://www.biorxiv.org/content/10.1101/2025.01.24.634799v1

Abstract: Integrating multimodal single-cell data, such as scRNA-seq and scATAC-seq, is key for decoding gene regulatory networks but remains challenging due to issues like feature harmonization and the limited quantity of paired data. To address these challenges, we introduce ECLARE, a novel framework combining multi-teacher ensemble knowledge distillation with contrastive learning for diagonal integration of single-cell multi-omic data. ECLARE trains teacher models on paired datasets to guide a student model for unpaired data, leveraging a refined contrastive objective and transport-based loss for precise cross-modality alignment. Experiments demonstrate ECLARE's competitive performance in cell pairing accuracy, multimodal integration and biological structure preservation, indicating that multi-teacher knowledge distillation provides an effective means to improve a diagonal integration model beyond its zero-shot capabilities. Additionally, we validate ECLARE's applicability through a case study on major depressive disorder (MDD) data, illustrating its capability to reveal gene regulatory insights from unpaired nuclei. While current results highlight the potential of ensemble distillation in multi-omic analyses, future work will focus on optimizing model complexity, dataset scalability, and exploring applications in diverse multi-omic contexts. ECLARE establishes a robust foundation for biologically informed single-cell data integration, facilitating advanced downstream analyses and scaling multi-omic data for training advanced machine learning models.
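
For discussion, here is a minimal sketch of the general idea behind multi-teacher contrastive distillation, not the authors' actual implementation: a CLIP-style symmetric InfoNCE objective aligns RNA and ATAC embeddings on paired data, and a weighted KL term lets an ensemble of teachers' pairing distributions supervise the student on unpaired data. All function names, the temperature value, and the uniform teacher weighting are illustrative assumptions, and the paper's refined contrastive objective and transport-based loss are not reproduced here.

```python
# Illustrative sketch only; not ECLARE's actual code or objective.
import torch
import torch.nn.functional as F


def symmetric_contrastive_loss(rna_emb, atac_emb, temperature=0.07):
    """CLIP-style symmetric InfoNCE over paired RNA/ATAC embeddings.

    Assumes row i of rna_emb and row i of atac_emb come from the same cell.
    """
    rna_emb = F.normalize(rna_emb, dim=-1)
    atac_emb = F.normalize(atac_emb, dim=-1)
    logits = rna_emb @ atac_emb.t() / temperature          # cell-by-cell similarity
    targets = torch.arange(len(rna_emb), device=logits.device)
    return 0.5 * (F.cross_entropy(logits, targets) +
                  F.cross_entropy(logits.t(), targets))


def multi_teacher_distillation_loss(student_logits, teacher_logits_list, weights=None):
    """KL divergence between the student's cross-modal pairing distribution
    and each teacher's, averaged with (here, uniform) ensemble weights."""
    if weights is None:
        weights = [1.0 / len(teacher_logits_list)] * len(teacher_logits_list)
    loss = torch.zeros((), device=student_logits.device)
    for w, t_logits in zip(weights, teacher_logits_list):
        loss = loss + w * F.kl_div(
            F.log_softmax(student_logits, dim=-1),
            F.softmax(t_logits, dim=-1),
            reduction="batchmean",
        )
    return loss
```

In this toy version, each teacher (trained on its own paired dataset) scores candidate cross-modality pairings for the student's unpaired batch, and the student is pulled toward the weighted ensemble of those distributions; how ECLARE actually weights teachers and couples this with its transport-based loss is detailed in the paper.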

Previous Talks