Wednesday, May 27, 2020

mixup: Beyond Empirical Risk Minimization (Paper Explained)


Neural Networks often draw hard boundaries in high-dimensional space, which makes them very brittle. Mixup is a technique that linearly interpolates between data and labels at training time and achieves much smoother and more regular class boundaries. OUTLINE: 0:00 - Intro 0:30 - The problem with ERM 2:50 - Mixup 6:40 - Code 9:35 - Results https://ift.tt/2M0bHW1 Abstract: Large deep neural networks are powerful, but exhibit undesirable behaviors such as memorization and sensitivity to adversarial examples. In this work, we propose mixup, a simple learning principle to alleviate these issues. In essence, mixup trains a neural network on convex combinations of pairs of examples and their labels. By doing so, mixup regularizes the neural network to favor simple linear behavior in-between training examples. Our experiments on the ImageNet-2012, CIFAR-10, CIFAR-100, Google commands and UCI datasets show that mixup improves the generalization of state-of-the-art neural network architectures. We also find that mixup reduces the memorization of corrupt labels, increases the robustness to adversarial examples, and stabilizes the training of generative adversarial networks. Authors: Hongyi Zhang, Moustapha Cisse, Yann N. Dauphin, David Lopez-Paz Links: YouTube: https://www.youtube.com/c/yannickilcher Twitter: https://twitter.com/ykilcher BitChute: https://ift.tt/38iX6OV Minds: https://ift.tt/37igBpB

No comments:

Post a Comment