Monday, August 3, 2020

Data Augmentation using Pre-trained Transformer Models


This video explores sampling from pre-trained transformers to augment small, labeled datasets. The paper compares fine-tuning BERT, GPT-2, and BART for generating new training examples. Each technique has a distinct way of ensuring the augmented data preserves the original class label, such as positive or negative sentiment, or the respective topic in a topic classification task (a small sketch of this label-conditioning idea is given after the chapter list below). I think this is a really exciting use of generative models, showing that they are more useful than just being the first step of representation learning! Thanks for watching and please subscribe!

Paper Links:
Data Augmentation using Pre-trained Transformers: https://ift.tt/31gUSwX
Next Word Prediction Demo: https://ift.tt/3gjQUue
Conditional BERT for Contextual Augmentation: https://ift.tt/39Oz9k4
BART: https://ift.tt/2oNKlKK
T5: https://ift.tt/2PpvzVe
GPT-3: https://ift.tt/3et6QZt
BERT: https://ift.tt/2pMXn84
GPT: https://ift.tt/2HeACni
ImageGPT (images used to describe AE vs. AR): https://ift.tt/2YKKAEf
Classification Accuracy Score: https://ift.tt/2U1fmaU
BigGAN: https://ift.tt/328NqnC
Guide to using BERT (will help understand how label embedding would work): https://ift.tt/2XR2Jzh
Conditional GANs: https://ift.tt/2rPVlDw
SPADE (conditional batch norm example, albeit kind of an intense example): https://ift.tt/2CsbLsZ
Pre-training via Paraphrasing: https://ift.tt/2PjV9tF
PEGASUS: https://ift.tt/3hhCFGM
Don't Stop Pretraining: https://ift.tt/2WEdjdt

Chapters
0:00 Introduction
1:16 Labeling Data is Difficult!
2:15 Data Augmentation in NLP
3:18 Contextual Augmentation
4:12 Conditional BERT
6:58 BERT vs. GPT-2 vs. BART
8:53 Data Augmentation Approach
10:00 How Data is Generated
11:08 Class Label in Vocabulary?
13:07 Experiment Details
13:38 Results: Extrinsic Evaluation
14:18 Classification Accuracy Score used for GANs, VAEs in images
14:45 Intrinsic Analysis
16:25 Connection to Don't Stop Pretraining
17:17 Connection with MARGE, PEGASUS, ELECTRA
18:27 Connection with Pattern-Exploiting Training
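To make the label-conditioning idea concrete, here is a minimal sketch (not the authors' code) of how an autoregressive model like GPT-2 can be prompted with a class label so the sampled text stays within that class. It assumes fine-tuning examples were formatted roughly as "label SEP text", so prompting with "label SEP" at generation time encourages on-label continuations; the "SEP" delimiter, prompt format, and sampling settings here are illustrative assumptions, not the paper's exact setup.

```python
# Minimal sketch of label-conditioned data augmentation with GPT-2 (Hugging Face transformers).
# Assumption: the model was fine-tuned on sequences of the form "<label> SEP <text>",
# so we prompt with "<label> SEP" and sample completions as new candidate examples.
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")  # in practice, load the fine-tuned checkpoint

def augment(label, num_samples=3, max_length=40):
    # Condition generation on the class label so sampled text (ideally) keeps that label.
    prompt = f"{label} SEP"
    input_ids = tokenizer.encode(prompt, return_tensors="pt")
    outputs = model.generate(
        input_ids,
        do_sample=True,            # sample rather than greedy decode for diverse augmentations
        top_p=0.9,
        max_length=max_length,
        num_return_sequences=num_samples,
        pad_token_id=tokenizer.eos_token_id,
    )
    return [tokenizer.decode(o, skip_special_tokens=True) for o in outputs]

# Example: generate candidate training sentences for the "positive" sentiment class.
for sentence in augment("positive"):
    print(sentence)
```

BERT- and BART-style variants condition differently (e.g., label embeddings or label-prefixed inputs with masked/infilled spans), but the common thread is that the class label is fed to the generator so the synthetic example can be added back to the training set with that label.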
