Dataset Condensation for Data-efficient Deep Learning



Bo Zhao (University of Edinburgh)

Bo Zhao is a PhD student supervised by Dr. Hakan Bilen in the School of Informatics, The University of Edinburgh. His research interests include Machine Learning and Computer Vision, especially Efficient Deep Learning, Meta-learning, and Continual Learning. He has published papers at ICLR'21 (Oral), ICML'21 and '18, ACM TOG'18, SIGGRAPH Asia'16, and WACV'21 and '19, among others. He has served as a reviewer for NeurIPS, ICLR, ICML, CVPR, ICCV, ECCV, AAAI, IEEE TNNLS, and IEEE TMC, among others.



Short Abstract: In this talk, I will present our recent work on dataset condensation for data-efficient deep learning. Increasingly large datasets are required to achieve state-of-the-art results in many fields, and storing these datasets and training models on them has become significantly more expensive. We propose a training set synthesis technique for data-efficient learning, called Dataset Condensation, that learns to condense a large dataset into a small set of informative synthetic samples for training deep neural networks from scratch. We formulate this goal as a gradient matching problem: the gradients of the deep neural network weights computed on the synthetic data are matched to those computed on the original data. Furthermore, we propose differentiable Siamese augmentation, which enables learning synthetic data that can be used effectively to train deep neural networks with data augmentation. We rigorously evaluate the method on several computer vision benchmarks and demonstrate that it significantly outperforms state-of-the-art methods. Finally, we explore the use of our method in continual learning and neural architecture search and report promising gains when limited memory and computation are available.
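
To make the gradient-matching formulation concrete, below is a minimal, hypothetical PyTorch-style sketch of a single condensation step. The names (gradient_match_step, syn_x, syn_opt, etc.) are illustrative and not taken from the authors' released code, and the per-layer cosine distance is a simplified stand-in for the matching loss used in the paper:

```python
import torch
import torch.nn.functional as F

def gradient_match_step(net, real_x, real_y, syn_x, syn_y, syn_opt):
    """One hypothetical gradient-matching update of the synthetic images."""
    params = [p for p in net.parameters() if p.requires_grad]

    # Gradients of the network loss on a batch of real data (targets).
    loss_real = F.cross_entropy(net(real_x), real_y)
    g_real = [g.detach() for g in torch.autograd.grad(loss_real, params)]

    # Gradients on the synthetic data, kept in the autograd graph so the
    # matching loss can backpropagate into the synthetic images themselves.
    loss_syn = F.cross_entropy(net(syn_x), syn_y)
    g_syn = torch.autograd.grad(loss_syn, params, create_graph=True)

    # Matching loss: sum over layers of (1 - cosine similarity) between
    # the synthetic-data gradient and the real-data gradient.
    match = sum(1 - F.cosine_similarity(gs.flatten(), gr.flatten(), dim=0)
                for gs, gr in zip(g_syn, g_real))

    # Update only the synthetic images; the network weights stay fixed here.
    syn_opt.zero_grad()
    match.backward()
    syn_opt.step()
    return match.item()
```

Here syn_x would be a leaf tensor created with requires_grad=True and syn_opt an optimizer over it alone, e.g. torch.optim.SGD([syn_x], lr=0.1). In the full method this step is repeated across many random network initializations and interleaved with training the network; with differentiable Siamese augmentation, the same randomly sampled differentiable augmentation would be applied to both real_x and syn_x inside this step.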