Humam Alwassel, Dhruv Mahajan, Bruno Korbar, Lorenzo Torresani, Bernard Ghanem, Du Tran: Self-Supervised Learning by Cross-Modal Audio-Video Clustering. NeurIPS 2020