Sunday, August 30, 2020

This AI Creates Images Of Nearly Any Animal! 🦉


❤️ Check out Lambda here and sign up for their GPU Cloud: https://ift.tt/35NkCT7

📝 The paper "COCO-FUNIT: Few-Shot Unsupervised Image Translation with a Content Conditioned Style Encoder" is available here: https://ift.tt/3j1K26l

❤️ Watch these videos in early access on our Patreon page or join us here on YouTube:
- https://ift.tt/2icTBUb
- https://www.youtube.com/channel/UCbfYPyITQ-7l4upoX8nvctg/join

🙏 We would like to thank our generous Patreon supporters who make Two Minute Papers possible: Aleksandr Mashrabov, Alex Haro, Alex Paden, Andrew Melnychuk, Angelos Evripiotis, Benji Rabhan, Bruno Mikuš, Bryan Learn, Christian Ahlin, Daniel Hasegan, Eric Haddad, Eric Martel, Gordon Child, Javier Bustamante, Lorin Atzberger, Lukas Biewald, Michael Albrecht, Nikhil Velpanur, Owen Campbell-Moore, Owen Skarpness, Ramsey Elbasheer, Robin Graham, Steef, Sunil Kim, Taras Bobrovytsky, Thomas Krcmar, Torsten Reil, Tybie Fitzhugh. If you wish to support the series, click here: https://ift.tt/2icTBUb

Meet and discuss your ideas with other Fellow Scholars on the Two Minute Papers Discord: https://ift.tt/2TnVBd3

Károly Zsolnai-Fehér's links:
Instagram: https://ift.tt/2KBCNkT
Twitter: https://twitter.com/twominutepapers
Web: https://ift.tt/1NwkG9m

Friday, August 28, 2020

Axial-DeepLab: Stand-Alone Axial-Attention for Panoptic Segmentation (Paper Explained)


Convolutional neural networks have dominated image processing for the last decade, but transformers are quickly replacing traditional models. This paper proposes a fully attentional model for images by combining learned positional embeddings with axial attention. The new model can compete with CNNs on image classification and achieves state-of-the-art results on various image segmentation tasks.

OUTLINE:
0:00 - Intro & Overview
4:10 - This Paper's Contributions
6:20 - From Convolution to Self-Attention for Images
16:30 - Learned Positional Embeddings
24:20 - Propagating Positional Embeddings through Layers
27:00 - Traditional vs Position-Augmented Attention
31:10 - Axial Attention
44:25 - Replacing Convolutions in ResNet
46:10 - Experimental Results & Examples

Paper: https://ift.tt/3jnFw1o
Code: https://ift.tt/3jm3JF3
My Video on BigBird: https://youtu.be/WVPE62Gk3EM
My Video on ResNet: https://youtu.be/GWt6Fu05voI
My Video on Attention: https://youtu.be/iDulhoQ2pro

Abstract: Convolution exploits locality for efficiency at the cost of missing long-range context. Self-attention has been adopted to augment CNNs with non-local interactions. Recent works show it is possible to stack self-attention layers to obtain a fully attentional network by restricting the attention to a local region. In this paper, we attempt to remove this constraint by factorizing 2D self-attention into two 1D self-attentions. This reduces computational complexity and allows performing attention within a larger or even global region. In addition, we propose a position-sensitive self-attention design. Combining both yields our position-sensitive axial-attention layer, a novel building block that can be stacked to form axial-attention models for image classification and dense prediction. We demonstrate the effectiveness of our model on four large-scale datasets. In particular, our model outperforms all existing stand-alone self-attention models on ImageNet. Our Axial-DeepLab improves 2.8% PQ over the bottom-up state of the art on COCO test-dev. This previous state of the art is attained by our small variant, which is 3.8x more parameter-efficient and 27x more computation-efficient. Axial-DeepLab also achieves state-of-the-art results on Mapillary Vistas and Cityscapes.

Authors: Huiyu Wang, Yukun Zhu, Bradley Green, Hartwig Adam, Alan Yuille, Liang-Chieh Chen

Links:
YouTube: https://www.youtube.com/c/yannickilcher
Twitter: https://twitter.com/ykilcher
Discord: https://ift.tt/3dJpBrR
BitChute: https://ift.tt/38iX6OV
Minds: https://ift.tt/37igBpB
Parler: https://ift.tt/38tQU7C
LinkedIn: https://ift.tt/2Zo6XRA

If you want to support me, the best thing to do is to share out the content :) If you want to support me financially (completely optional and voluntary, but a lot of people have asked for this):
SubscribeStar: https://ift.tt/2DuKOZ3
Patreon: https://ift.tt/390ewRH
Bitcoin (BTC): bc1q49lsw3q325tr58ygf8sudx2dqfguclvngvy2cq
Ethereum (ETH): 0x7ad3513E3B8f66799f507Aa7874b1B0eBC7F85e2
Litecoin (LTC): LQW2TRyKYetVC8WjFkhpPhtpbDM4Vw7r9m
Monero (XMR): 4ACL8AGrEo5hAir8A9CeVrW8pEauWvnp1WnSDZxW7tziCDLhZAGsgzhRQABDnFy8yuM9fWJDviJPHKRjV4FWt19CJZN9D4n
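The core trick of the abstract, replacing one quadratic 2D self-attention with two 1D self-attentions along the height and width axes, can be sketched in a few lines. This is a simplified illustration with scalar features where queries, keys, and values are all the raw features; the real model uses learned projections and position-sensitive terms:

```python
import math

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    z = sum(exps)
    return [e / z for e in exps]

def attend_1d(seq):
    """Self-attention along a 1D sequence of scalar features
    (queries = keys = values = the features themselves)."""
    out = []
    for q in seq:
        weights = softmax([q * k for k in seq])
        out.append(sum(w * v for w, v in zip(weights, seq)))
    return out

def axial_attention(grid):
    """Axial attention on an H x W grid: attend along each row
    (width axis), then along each column (height axis). Cost is
    O(HW * (H + W)) instead of O((HW)^2) for full 2D attention."""
    rows = [attend_1d(row) for row in grid]          # width-axis pass
    cols = [attend_1d(list(col)) for col in zip(*rows)]  # height-axis pass
    return [list(row) for row in zip(*cols)]         # transpose back

grid = [[1.0, 2.0], [3.0, 4.0]]
out = axial_attention(grid)
```

Each output cell is a convex combination of the values it attended over, so every output stays inside the range of the input grid.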

Wednesday, August 26, 2020

Radioactive data: tracing through training (Paper Explained)


#ai #research #privacy

Data is the modern gold. Neural classifiers can improve their performance by training on more data, but given a trained classifier, it's difficult to tell what data it was trained on. This is especially relevant if you have proprietary or personal data and want to make sure that other people don't use it to train their models. This paper introduces a method to mark a dataset with a hidden "radioactive" tag, such that any resulting classifier will clearly exhibit this tag, which can be detected.

OUTLINE:
0:00 - Intro & Overview
2:50 - How Neural Classifiers Work
5:45 - Radioactive Marking via Adding Features
13:55 - Random Vectors in High-Dimensional Spaces
18:05 - Backpropagation of the Fake Features
21:00 - Re-Aligning Feature Spaces
25:00 - Experimental Results
28:55 - Black-Box Test
32:00 - Conclusion & My Thoughts

Paper: https://ift.tt/2SlnmBn

Abstract: We want to detect whether a particular image dataset has been used to train a model. We propose a new technique, radioactive data, that makes imperceptible changes to this dataset such that any model trained on it will bear an identifiable mark. The mark is robust to strong variations such as different architectures or optimization methods. Given a trained model, our technique detects the use of radioactive data and provides a level of confidence (p-value). Our experiments on large-scale benchmarks (ImageNet), using standard architectures (ResNet-18, VGG-16, DenseNet-121) and training procedures, show that we can detect usage of radioactive data with high confidence (p < 10^-4) even when only 1% of the data used to train our model is radioactive. Our method is robust to data augmentation and the stochasticity of deep network optimization. As a result, it offers a much higher signal-to-noise ratio than data poisoning and backdoor methods.

Authors: Alexandre Sablayrolles, Matthijs Douze, Cordelia Schmid, Hervé Jégou

Tuesday, August 25, 2020

TecoGAN: Super Resolution Extraordinaire!


❤️ Check out Weights & Biases and sign up for a free demo here: https://ift.tt/2YuG7Yf
❤️ Their instrumentation of a previous paper is available here: https://ift.tt/31cJX9g
📝 The paper "Learning Temporal Coherence via Self-Supervision for GAN-based Video Generation" is available here: https://ift.tt/34xiSzq
The legendary Wavelet Turbulence paper is available here: https://ift.tt/31pSFAG

Sunday, August 23, 2020

Fast reinforcement learning with generalized policy updates (Paper Explained)


#ai #research #reinforcementlearning

Reinforcement learning is a powerful tool, but it is also incredibly data-hungry. Given a new task, an RL agent has to learn a good policy entirely from scratch. This paper proposes a framework that allows an agent to carry over knowledge from previous tasks into solving new tasks, even deriving zero-shot policies that perform well on completely new reward functions.

OUTLINE:
0:00 - Intro & Overview
1:25 - Problem Statement
6:25 - Q-Learning Primer
11:40 - Multiple Rewards, Multiple Policies
14:25 - Example Environment
17:35 - Tasks as Linear Mixtures of Features
24:15 - Successor Features
28:00 - Zero-Shot Policy for New Tasks
35:30 - Results on New Task W3
37:00 - Inferring the Task via Regression
39:20 - The Influence of the Given Policies
48:40 - Learning the Feature Functions
50:30 - More Complicated Tasks
51:40 - Life-Long Learning, Comments & Conclusion

Paper: https://ift.tt/2EeSW05
My Video on Successor Features: https://youtu.be/KXEEqcwXn8w

Abstract: The combination of reinforcement learning with deep learning is a promising approach to tackle important sequential decision-making problems that are currently intractable. One obstacle to overcome is the amount of data needed by learning systems of this type. In this article, we propose to address this issue through a divide-and-conquer approach. We argue that complex decision problems can be naturally decomposed into multiple tasks that unfold in sequence or in parallel. By associating each task with a reward function, this problem decomposition can be seamlessly accommodated within the standard reinforcement-learning formalism. The specific way we do so is through a generalization of two fundamental operations in reinforcement learning: policy improvement and policy evaluation. The generalized versions of these operations allow one to leverage the solution of some tasks to speed up the solution of others. If the reward function of a task can be well approximated as a linear combination of the reward functions of tasks previously solved, we can reduce a reinforcement-learning problem to a simpler linear regression. When this is not the case, the agent can still exploit the task solutions by using them to interact with and learn about the environment. Both strategies considerably reduce the amount of data needed to solve a reinforcement-learning problem.

Authors: André Barreto, Shaobo Hou, Diana Borsa, David Silver, and Doina Precup
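The zero-shot transfer step, generalized policy improvement over successor features, is compact enough to sketch. Here `psi[i][a]` stands for the successor features of known policy i for action a in a fixed state, and `w` is the new task's reward weights (rewards assumed linear in the features, r = phi·w); all numbers are made up for illustration:

```python
def gpi_action(psi, w):
    """Generalized policy improvement: evaluate every known policy on the
    new task via Q_i(s, a) = psi_i(s, a) . w, then act greedily on the
    best value any of the known policies achieves."""
    def q(i, a):
        return sum(p * wj for p, wj in zip(psi[i][a], w))
    n_policies, n_actions = len(psi), len(psi[0])
    return max(range(n_actions),
               key=lambda a: max(q(i, a) for i in range(n_policies)))

# Two known policies, two actions, two reward features.
psi = [
    [[1.0, 0.0], [0.0, 1.0]],  # successor features of policy 0
    [[0.0, 2.0], [2.0, 0.0]],  # successor features of policy 1
]
```

For the task that only rewards the first feature (w = [1, 0]), the best known value comes from policy 1 taking action 1; for w = [0, 1] it comes from policy 1 taking action 0. No learning happens before acting, which is the zero-shot property.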

Saturday, August 22, 2020

This AI Removes Shadows From Your Photos! 🌒


❤️ Check out Weights & Biases and sign up for a free demo here: https://ift.tt/2YuG7Yf
❤️ Their post on how to train distributed models is available here: https://ift.tt/2Xnrfd5
📝 The paper "Portrait Shadow Manipulation" is available here: https://ift.tt/3aUhc43
📝 Our paper with Activision Blizzard on subsurface scattering is available here: https://ift.tt/2YhJnn0

Thursday, August 20, 2020

What Matters In On-Policy Reinforcement Learning? A Large-Scale Empirical Study (Paper Explained)


#ai #research #machinelearning

On-policy reinforcement learning is a flourishing field with countless methods for practitioners to choose from. However, each of those methods comes with a plethora of hyperparameter choices. This paper builds a unified framework across five continuous control tasks and investigates the effects of these choices in a large-scale study. As a result, the authors come up with a set of recommendations for future research and applications.

OUTLINE:
0:00 - Intro & Overview
3:55 - Parameterized Agents
7:00 - Unified Online RL and Parameter Choices
14:10 - Policy Loss
16:40 - Network Architecture
20:25 - Initial Policy
24:20 - Normalization & Clipping
26:30 - Advantage Estimation
28:55 - Training Setup
33:05 - Timestep Handling
34:10 - Optimizers
35:05 - Regularization
36:10 - Conclusion & Comments

Paper: https://ift.tt/2BWAs3B

Abstract: In recent years, on-policy reinforcement learning (RL) has been successfully applied to many different continuous control tasks. While RL algorithms are often conceptually simple, their state-of-the-art implementations take numerous low- and high-level design decisions that strongly affect the performance of the resulting agents. Those choices are usually not extensively discussed in the literature, leading to a discrepancy between published descriptions of algorithms and their implementations. This makes it hard to attribute progress in RL and slows down overall progress (Engstrom '20). As a step towards filling that gap, we implement over 50 such "choices" in a unified on-policy RL framework, allowing us to investigate their impact in a large-scale empirical study. We train over 250,000 agents in five continuous control environments of different complexity and provide insights and practical recommendations for on-policy training of RL agents.

Authors: Marcin Andrychowicz, Anton Raichuk, Piotr Stańczyk, Manu Orsini, Sertan Girgin, Raphael Marinier, Léonard Hussenot, Matthieu Geist, Olivier Pietquin, Marcin Michalski, Sylvain Gelly, Olivier Bachem

Wednesday, August 19, 2020

Convolutional Neural Networks (Machine Learning: Zero to Hero, Part 3)


In Part 3 of Machine Learning: Zero to Hero, Developer Advocate Khanh LeViet (@khanhlvg) explains convolutional neural networks and why they are so useful for computer vision use cases. A convolution is a filter that passes over an image, processing it and extracting features that are common across images. In this video, we see how they work by processing an image and checking whether its features can be extracted. Try the Colab → https://goo.gle/2m07asM Watch the Coding TensorFlow videos → https://goo.gle/Coding-TensorFlow Subscribe to the TensorFlow channel → https://goo.gle/TensorFlow

Build an Image Classifier (Machine Learning: Zero to Hero, Part 4)


In Part 4 of Machine Learning: Zero to Hero, Developer Advocate Khanh LeViet (@khanhlvg) walks through building a rock-paper-scissors image classifier. Part 1 introduced the rock-paper-scissors use case and discussed how hard it would be to hand-write code that recognizes the gestures. Over this series you have learned how to build neural networks, from detecting patterns in raw pixels, to classifying them, to detecting features with convolutions. This video brings everything from the first three videos of the series together in practice. Rock-paper-scissors dataset → https://goo.gle/2m68kCV Try the Colab → https://goo.gle/2m07d7W Watch the Coding TensorFlow videos → https://goo.gle/Coding-TensorFlow Subscribe to the TensorFlow channel → https://goo.gle/TensorFlow

Basic Computer Vision with Machine Learning (Machine Learning: Zero to Hero, Part 2)


In Part 2 of Machine Learning: Zero to Hero, Charles explains machine learning that teaches a computer to recognize what is in an image. Run the computer vision code at the link! → https://goo.gle/34cHkDk Coding TensorFlow playlist → https://goo.gle/Coding-TensorFlow Subscribe to TensorFlow → https://goo.gle/TensorFlow

Your First Machine Learning Model (Machine Learning: Zero to Hero, Part 1)


Unlike traditional programs that follow explicit rules written in languages like Java or C++, machine learning is a system that infers the rules themselves from data. So what does machine-learning code actually look like? In Part 1 of Machine Learning: Zero to Hero, host Charles answers that question by walking through the steps of building a simple machine-learning model with a concrete example. Several of the concepts introduced here are applied in the follow-up videos, which cover more practical uses of machine learning such as computer vision. Try the "Hello World" of machine learning at the link! → https://goo.gle/2Zp2ZF3 Coding TensorFlow playlist → https://goo.gle/Coding-TensorFlow Subscribe to TensorFlow → https://goo.gle/TensorFlow
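The video builds its model with TensorFlow/Keras; as a dependency-free sketch of the same "infer the rule from the data" idea, plain gradient descent can recover the hidden rule y = 2x - 1 from six example pairs:

```python
# Example pairs generated from the hidden rule y = 2x - 1.
xs = [-1.0, 0.0, 1.0, 2.0, 3.0, 4.0]
ys = [-3.0, -1.0, 1.0, 3.0, 5.0, 7.0]

# Fit y ~ w*x + b by gradient descent on the mean squared error:
# the program is never told the rule, only shown the data.
w, b = 0.0, 0.0
lr = 0.05
for _ in range(2000):
    grad_w = sum(2 * (w * x + b - y) * x for x, y in zip(xs, ys)) / len(xs)
    grad_b = sum(2 * (w * x + b - y) for x, y in zip(xs, ys)) / len(xs)
    w -= lr * grad_w
    b -= lr * grad_b
```

After training, `w` is close to 2 and `b` close to -1, so the model predicts roughly 19 for an unseen input of 10.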

Tuesday, August 18, 2020

How Can We Simulate Water Droplets? 🌊


❤️ Check out Linode here and get $20 free credit on your account: https://ift.tt/2LaDQJb
🎬 Our Instagram page with the slow-motion videos is available here: https://ift.tt/2KBCNkT
📝 The paper "Codimensional Surface Tension Flow using Moving-Least-Squares Particles" is available here:
https://ift.tt/3kUugLn
https://ift.tt/3h9jT47

[Rant] REVIEWER #2: How Peer Review is FAILING in Machine Learning


#ai #research #peerreview

Machine learning research is in dire straits as more people flood into the field while competent reviewers remain scarce and overloaded. This video takes a look at the incentive structures behind the current system and describes how they create a negative feedback loop. In the end, I'll go through some proposed solutions and add my own thoughts.

OUTLINE:
0:00 - Intro
1:05 - The ML Boom
3:10 - Author Incentives
7:00 - Conference Incentives
8:00 - Reviewer Incentives
13:10 - Proposed Solutions
17:20 - A Better Solution
23:50 - The Road Ahead

Sources:
https://ift.tt/3aLZlfP
https://ift.tt/2YaeOkH
https://ift.tt/31ZMDWA
https://ift.tt/2Y8ySUf
https://ift.tt/2Q06C1X
https://ift.tt/2Yb8byw
https://ift.tt/2Y7Ny6c
https://ift.tt/2CB5BKs
https://ift.tt/2Zq0kfW
https://ift.tt/346SFaJ
https://ift.tt/1OQp2ts
https://twitter.com/tdietterich/status/1292217162103316481
https://ift.tt/2CHF9z2

Saturday, August 15, 2020

From Video Games To Reality…With Just One AI!


❤️ Check out Lambda here and sign up for their GPU Cloud: https://ift.tt/35NkCT7
📝 The paper "World-Consistent Video-to-Video Synthesis" is available here: https://ift.tt/3kMWqYy

Curriculum Learning for Recurrent Video Object Segmentation - Maria Gonzalez - ECCV Workshops 2020


Project page: https://ift.tt/30ZI9jG

Video object segmentation can be understood as a sequence-to-sequence task that benefits from curriculum learning strategies for better and faster training of deep neural networks. This work explores different schedule sampling and frame-skipping variations to significantly improve the performance of a recurrent architecture. Our results on the car class of the KITTI-MOTS challenge indicate that, surprisingly, inverse schedule sampling is a better option than the classic forward one, and that progressively skipping frames during training is beneficial, but only when training with the ground-truth masks instead of the predicted ones.
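The forward-versus-inverse distinction can be sketched as a single schedule function. The linear shape and parameter names here are illustrative assumptions; the summary only states the direction of the schedule, not its exact form:

```python
def sampling_prob(step, total_steps, inverse=True):
    """Probability of feeding the recurrent segmenter its OWN previous
    mask prediction instead of the ground-truth mask at this training
    step. Classic forward scheduled sampling ramps this up over training;
    the inverse schedule (the better option in this work) starts high and
    anneals toward the ground truth. Linear shape is an assumption."""
    frac = min(1.0, step / float(total_steps))
    return 1.0 - frac if inverse else frac
```

With `inverse=True`, training begins mostly on the model's own predictions and ends mostly on ground truth; `inverse=False` gives the classic forward schedule.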

Friday, August 14, 2020

REALM: Retrieval-Augmented Language Model Pre-Training (Paper Explained)


Open-domain question answering is one of the most challenging tasks in NLP. When answering a question, the model is able to retrieve arbitrary documents from an indexed corpus to gather more information. REALM shows how masked language modeling (MLM) pre-training can be used to train a retriever for relevant documents in an end-to-end fashion, and improves over the state of the art by a significant margin.

OUTLINE:
0:00 - Introduction & Overview
4:30 - World Knowledge in Language Models
8:15 - Masked Language Modeling for Latent Document Retrieval
14:50 - Problem Formulation
17:30 - Knowledge Retriever Model using MIPS
23:50 - Question Answering Model
27:50 - Architecture Recap
29:55 - Analysis of the Loss Gradient
34:15 - Initialization using the Inverse Cloze Task
41:40 - Prohibiting Trivial Retrievals
44:05 - Null Document
45:00 - Salient Span Masking
50:15 - My Idea on Salient Span Masking
51:50 - Experimental Results and Ablations
57:30 - Concrete Example from the Model

Paper: https://ift.tt/2HItI93
Code: https://ift.tt/3gYMqJF
My Video on GPT-3: https://www.youtube.com/watch?v=SY5PvZrJhLE
My Video on BERT: https://www.youtube.com/watch?v=-9evrZnBorM
My Video on Word2Vec: https://www.youtube.com/watch?v=yexR53My2O4

Abstract: Language model pre-training has been shown to capture a surprising amount of world knowledge, crucial for NLP tasks such as question answering. However, this knowledge is stored implicitly in the parameters of a neural network, requiring ever-larger networks to cover more facts. To capture knowledge in a more modular and interpretable way, we augment language model pre-training with a latent knowledge retriever, which allows the model to retrieve and attend over documents from a large corpus such as Wikipedia, used during pre-training, fine-tuning and inference. For the first time, we show how to pre-train such a knowledge retriever in an unsupervised manner, using masked language modeling as the learning signal and backpropagating through a retrieval step that considers millions of documents. We demonstrate the effectiveness of Retrieval-Augmented Language Model pre-training (REALM) by fine-tuning on the challenging task of open-domain question answering (Open-QA). We compare against state-of-the-art models for both explicit and implicit knowledge storage on three popular Open-QA benchmarks, and find that we outperform all previous methods by a significant margin (4-16% absolute accuracy), while also providing qualitative benefits such as interpretability and modularity.

Authors: Kelvin Guu, Kenton Lee, Zora Tung, Panupong Pasupat, Ming-Wei Chang
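The retrieve-then-marginalize structure described above can be sketched with toy embeddings. The real system embeds queries and documents with BERT-style encoders and runs approximate maximum inner product search (MIPS) over millions of documents; here everything is tiny and made up:

```python
import math

def softmax(scores):
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    return [e / z for e in exps]

def retrieval_probs(query_emb, doc_embs):
    """p(z | x): score every document by inner product with the query
    embedding (the MIPS step) and normalize with a softmax."""
    scores = [sum(q * d for q, d in zip(query_emb, e)) for e in doc_embs]
    return softmax(scores)

def answer_prob(query_emb, doc_embs, p_y_given_doc):
    """REALM-style marginal likelihood: p(y | x) = sum_z p(y | x, z) p(z | x).
    Backpropagating through this sum is what trains the retriever
    end-to-end from the language-modeling signal."""
    probs = retrieval_probs(query_emb, doc_embs)
    return sum(pz * py for pz, py in zip(probs, p_y_given_doc))

query = [1.0, 0.0]
docs = [[2.0, 0.0], [0.0, 2.0], [-1.0, 0.0]]
p_y = [0.9, 0.2, 0.1]  # hypothetical reader probabilities per document
marginal = answer_prob(query, docs, p_y)
```

The first document aligns best with the query, so it dominates the retrieval distribution, and the marginal answer probability is pulled toward its reader score.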

Thursday, August 13, 2020

Well-Read Students Learn Better


Should you pre-train your compressed transformer model before knowledge distillation from an off-the-shelf teacher? This paper says yes and explores a few of the details behind this pipeline. Thanks for watching! Please subscribe!

Paper links:
Well-Read Students Learn Better: https://ift.tt/3amnUiT
Patient Knowledge Distillation: https://ift.tt/3fNIb28
DistilBERT: https://ift.tt/2Y2cZa2
Don't Stop Pretraining: https://ift.tt/2WEdjdt
SimCLRv2: https://ift.tt/2PQ2MrW
AllenNLP MLM Demo: https://ift.tt/30TA9QV
HuggingFace Transformers: https://ift.tt/38KJ1K4
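The distillation objective at the center of this pipeline is soft cross-entropy against temperature-scaled teacher logits. This is a generic knowledge-distillation sketch, not this paper's exact training recipe (which also pre-trains the student with masked language modeling first); the temperature value is illustrative:

```python
import math

def softened(logits, temperature):
    """Softmax over temperature-scaled logits."""
    scaled = [l / temperature for l in logits]
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    z = sum(exps)
    return [e / z for e in exps]

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """Cross-entropy between the teacher's and student's softened
    distributions; a higher temperature exposes more of the teacher's
    relative class similarities ('dark knowledge')."""
    p = softened(teacher_logits, temperature)
    q = softened(student_logits, temperature)
    return -sum(pi * math.log(qi) for pi, qi in zip(p, q))

teacher = [3.0, 1.0, 0.2]
```

By Gibbs' inequality the loss is minimized exactly when the student's softened distribution matches the teacher's, which is what drives the student toward the teacher's behavior.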

Wednesday, August 12, 2020

Meta-Learning through Hebbian Plasticity in Random Networks (Paper Explained)


#ai #neuroscience #rl

Reinforcement learning is a powerful tool, but it lacks biological plausibility because it learns a fixed policy network. Animals use neuroplasticity to reconfigure their policies on the fly and quickly adapt to new situations. This paper uses Hebbian learning, a biologically inspired technique, to have agents adapt random networks into high-performing solutions as an episode progresses, leading to agents that can reconfigure themselves in response to new observations.

OUTLINE:
0:00 - Intro & Overview
2:30 - Reinforcement Learning vs Hebbian Plasticity
9:00 - Episodes in Hebbian Learning
10:00 - Hebbian Plasticity Rules
18:10 - Quadruped Experiment Results
21:20 - Evolutionary Learning of Hebbian Plasticity
29:10 - More Experimental Results
34:50 - Conclusions
35:30 - Broader Impact Statement

Videos: https://twitter.com/risi1979/status/1280544779630186499
Paper: https://ift.tt/2CaB2L7

Abstract: Lifelong learning and adaptability are two defining aspects of biological agents. Modern reinforcement learning (RL) approaches have shown significant progress in solving complex tasks; however, once training is concluded, the found solutions are typically static and incapable of adapting to new information or perturbations. While it is still not completely understood how biological brains learn and adapt so efficiently from experience, it is believed that synaptic plasticity plays a prominent role in this process. Inspired by this biological mechanism, we propose a search method that, instead of optimizing the weight parameters of neural networks directly, only searches for synapse-specific Hebbian learning rules that allow the network to continuously self-organize its weights during the lifetime of the agent. We demonstrate our approach on several reinforcement learning tasks with different sensory modalities and more than 450K trainable plasticity parameters. We find that starting from completely random weights, the discovered Hebbian rules enable an agent to navigate a dynamical 2D pixel environment; likewise, they allow a simulated 3D quadrupedal robot to learn how to walk while adapting to different morphological damage in the absence of any explicit reward or error signal.

Authors: Elias Najarro, Sebastian Risi

Tuesday, August 11, 2020

Can We Simulate Tearing Meat? 🥩


❤️ Check out Snap's Residency Program and apply here: https://ift.tt/3jfqDPm
❤️ Try Snap's Lens Studio here: https://ift.tt/2ArswSh
🎬 Our Instagram page with the slow-motion videos is available here: https://ift.tt/2KBCNkT
📝 The paper "AnisoMPM: Animating Anisotropic Damage Mechanics" is available here: https://ift.tt/3fL21eh
❗ Erratum: At 4:17, I should have written "Anisotropic damage (new method)". Apologies!

Easy Data Augmentation for Text Classification


This video explains a strong baseline for exploring data augmentation in NLP, and in text classification in particular. Synonym replacement and random insertion/deletion/swapping can all be implemented quickly and don't require much overhead (compared to, say, back-translation or generative modeling). If you put this to use, please share your experience! Thanks for watching! Please subscribe!

Paper links:
Easy Data Augmentation: https://ift.tt/3fnqupG
Conditional BERT: https://ift.tt/31DSSiE
RandAugment: https://ift.tt/2VODOeX
CheckList: https://ift.tt/3ktqTLj

Chapters:
0:00 Beginning
1:08 4 Operations Explored
3:13 Results
4:55 Hyperparameters of Augmentation
6:10 Ablations on Isolated Augmentations
8:53 t-SNE viz of Label Preservation
9:45 Contrast with Conditional BERT
10:41 Issue with Test Set Evaluation of Data Aug
12:02 Connection with RandAugment
13:44 Quick Takeaways from EDA
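Two of the four EDA operations need no external resources and make the point; synonym replacement and random insertion additionally require a thesaurus (the paper uses WordNet), so they are omitted from this sketch:

```python
import random

def random_swap(words, n, rng):
    """EDA operation: swap the words at two random positions, n times."""
    words = list(words)
    for _ in range(n):
        i, j = rng.randrange(len(words)), rng.randrange(len(words))
        words[i], words[j] = words[j], words[i]
    return words

def random_deletion(words, p, rng):
    """EDA operation: drop each word independently with probability p,
    keeping at least one word so the example never becomes empty."""
    kept = [w for w in words if rng.random() > p]
    return kept if kept else [rng.choice(words)]

rng = random.Random(42)
sentence = "the quick brown fox jumps over the lazy dog".split()
swapped = random_swap(sentence, 2, rng)
shortened = random_deletion(sentence, 0.3, rng)
```

Each augmented variant is trained on alongside the original; the label is assumed to be preserved, which the paper checks with a t-SNE visualization.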

Sunday, August 9, 2020

Hopfield Networks is All You Need (Paper Explained)


Hopfield Networks are one of the classic models of biological memory networks. This paper generalizes modern Hopfield Networks to continuous states and shows that the corresponding update rule is equal to the attention mechanism used in modern Transformers. It further analyzes a pre-trained BERT model through the lens of Hopfield Networks and uses a Hopfield Attention Layer to perform Immune Repertoire Classification. OUTLINE: 0:00 - Intro & Overview 1:35 - Binary Hopfield Networks 5:55 - Continuous Hopfield Networks 8:15 - Update Rules & Energy Functions 13:30 - Connection to Transformers 14:35 - Hopfield Attention Layers 26:45 - Theoretical Analysis 48:10 - Investigating BERT 1:02:30 - Immune Repertoire Classification Paper: https://ift.tt/33v47N1 Code: https://ift.tt/3kfo4xh Immune Repertoire Classification Paper: https://ift.tt/33GPGFC My Video on Attention: https://youtu.be/iDulhoQ2pro My Video on BERT: https://youtu.be/-9evrZnBorM Abstract: We show that the transformer attention mechanism is the update rule of a modern Hopfield network with continuous states. This new Hopfield network can store exponentially (with the dimension) many patterns, converges with one update, and has exponentially small retrieval errors. The number of stored patterns is traded off against convergence speed and retrieval error. The new Hopfield network has three types of energy minima (fixed points of the update): (1) global fixed point averaging over all patterns, (2) metastable states averaging over a subset of patterns, and (3) fixed points which store a single pattern. Transformer and BERT models operate in their first layers preferably in the global averaging regime, while they operate in higher layers in metastable states. The gradient in transformers is maximal for metastable states, is uniformly distributed for global averaging, and vanishes for a fixed point near a stored pattern. 
Using the Hopfield network interpretation, we analyzed learning of transformer and BERT models. Learning starts with attention heads that average and then most of them switch to metastable states. However, the majority of heads in the first layers still averages and can be replaced by averaging, e.g. our proposed Gaussian weighting. In contrast, heads in the last layers steadily learn and seem to use metastable states to collect information created in lower layers. These heads seem to be a promising target for improving transformers. Neural networks with Hopfield networks outperform other methods on immune repertoire classification, where the Hopfield net stores several hundreds of thousands of patterns. We provide a new PyTorch layer called "Hopfield", which allows to equip deep learning architectures with modern Hopfield networks as a new powerful concept comprising pooling, memory, and attention. GitHub: this https URL Authors: Hubert Ramsauer, Bernhard Schäfl, Johannes Lehner, Philipp Seidl, Michael Widrich, Lukas Gruber, Markus Holzleitner, Milena Pavlović, Geir Kjetil Sandve, Victor Greiff, David Kreil, Michael Kopp, Günter Klambauer, Johannes Brandstetter, Sepp Hochreiter Links: YouTube: https://www.youtube.com/c/yannickilcher Twitter: https://twitter.com/ykilcher Discord: https://ift.tt/3dJpBrR BitChute: https://ift.tt/38iX6OV Minds: https://ift.tt/37igBpB Parler: https://ift.tt/38tQU7C LinkedIn: https://ift.tt/2Zo6XRA If you want to support me, the best thing to do is to share out the content :) If you want to support me financially (completely optional and voluntary, but a lot of people have asked for this): SubscribeStar: https://ift.tt/2DuKOZ3 Patreon: https://ift.tt/390ewRH Bitcoin (BTC): bc1q49lsw3q325tr58ygf8sudx2dqfguclvngvy2cq Ethereum (ETH): 0x7ad3513E3B8f66799f507Aa7874b1B0eBC7F85e2 Litecoin (LTC): LQW2TRyKYetVC8WjFkhpPhtpbDM4Vw7r9m Monero (XMR): 4ACL8AGrEo5hAir8A9CeVrW8pEauWvnp1WnSDZxW7tziCDLhZAGsgzhRQABDnFy8yuM9fWJDviJPHKRjV4FWt19CJZN9D4n

Saturday, August 8, 2020

OpenAI GPT-3 - Good At Almost Everything! 🤖


❤️ Check out Weights & Biases and sign up for a free demo here: https://ift.tt/2YuG7Yf ❤️ Their instrumentation of a previous OpenAI paper is available here: https://ift.tt/2XLLkJR 📝 The paper "Language Models are Few-Shot Learners" is available here: - https://ift.tt/2Xdo3Ac - https://ift.tt/2BXmy18 Credits follow for the tweets of the applications. Follow their authors if you wish to see more! Website layout: https://twitter.com/sharifshameem/status/1283322990625607681 Plots: https://twitter.com/aquariusacquah/status/1285415144017797126?s=12 Typesetting math: https://twitter.com/sh_reya/status/1284746918959239168 Population data: https://twitter.com/pavtalk/status/1285410751092416513 Legalese: https://twitter.com/f_j_j_/status/1283848393832333313 Nutrition labels: https://twitter.com/lawderpaul/status/1284972517749338112 User interface design: https://twitter.com/jsngr/status/1284511080715362304 More cool applications: Generating machine learning models - https://twitter.com/mattshumer_/status/1287125015528341506?s=12 Creating animations - https://twitter.com/ak92501/status/1284553300940066818 Command line magic - https://twitter.com/super3/status/1284567835386294273?s=12 Analogies: https://twitter.com/melmitchell1/status/1291170016130412544?s=12 🙏 We would like to thank our generous Patreon supporters who make Two Minute Papers possible: Aleksandr Mashrabov, Alex Haro, Alex Paden, Andrew Melnychuk, Angelos Evripiotis, Benji Rabhan, Bruno Mikuš, Bryan Learn, Christian Ahlin, Daniel Hasegan, Eric Haddad, Eric Martel, Gordon Child, Javier Bustamante, Lorin Atzberger, Lukas Biewald, Michael Albrecht, Nikhil Velpanur, Owen Campbell-Moore, Owen Skarpness, Ramsey Elbasheer, Robin Graham, Steef, Sunil Kim, Taras Bobrovytsky, Thomas Krcmar, Torsten Reil, Tybie Fitzhugh. 
If you wish to support the series, click here: https://ift.tt/2icTBUb Meet and discuss your ideas with other Fellow Scholars on the Two Minute Papers Discord: https://ift.tt/2TnVBd3 Károly Zsolnai-Fehér's links: Instagram: https://ift.tt/2KBCNkT Twitter: https://twitter.com/twominutepapers Web: https://ift.tt/1NwkG9m #GPT3 #GPT2

Thursday, August 6, 2020

I TRAINED AN AI TO SOLVE 2+2 (w/ Live Coding)


#ai #tech #code A whole bunch of humans are arguing whether 2+2=4 or 2+2=5. Pointless! Let the machines handle this! Colab: https://ift.tt/2DkPVeW Disclaimer: This is a joke. Links: YouTube: https://www.youtube.com/c/yannickilcher Twitter: https://twitter.com/ykilcher Discord: https://ift.tt/3dJpBrR BitChute: https://ift.tt/38iX6OV Minds: https://ift.tt/37igBpB Parler: https://ift.tt/38tQU7C LinkedIn: https://ift.tt/2Zo6XRA If you want to support me, the best thing to do is to share out the content :) If you want to support me financially (completely optional and voluntary, but a lot of people have asked for this): SubscribeStar: https://ift.tt/2DuKOZ3 Patreon: https://ift.tt/390ewRH Bitcoin (BTC): bc1q49lsw3q325tr58ygf8sudx2dqfguclvngvy2cq Ethereum (ETH): 0x7ad3513E3B8f66799f507Aa7874b1B0eBC7F85e2 Litecoin (LTC): LQW2TRyKYetVC8WjFkhpPhtpbDM4Vw7r9m Monero (XMR): 4ACL8AGrEo5hAir8A9CeVrW8pEauWvnp1WnSDZxW7tziCDLhZAGsgzhRQABDnFy8yuM9fWJDviJPHKRjV4FWt19CJZN9D4n

Wednesday, August 5, 2020

Baton Master - My 1st Virtual Reality Game


The future of education is gaming, and to build that future we need to start building high-quality, engaging educational games. In this tutorial, I show you how I built my first game for the Oculus Virtual Reality platform in C# using Unity. Baton Master lets you practice conducting in a simulated orchestral environment. I wanted to understand the process of encoding real-world knowledge into game mechanics. I recorded a 3-hour screencast of me rebuilding it from scratch, but I condensed it as much as I could to focus on the most relevant parts for you. I'm also releasing the code, so please use it to learn, build, and explore the incredible world of Virtual Reality. My favorite VR games are: Echo Arena, Beat Saber, Big Screen, The Under Presents, Pistol Whip, SuperHot, and of course Half Life: Alyx. Subscribe for more educational videos! Baton Master code: https://ift.tt/2Dld4O4 Twitter: https://twitter.com/sirajraval Learning Resources I've been using: Valem's Youtube Channel: https://www.youtube.com/channel/UCPJlesN59MzHPPCp0Lg8sLw Unity's native tutorials: https://ift.tt/2yQvLDK Reality is Broken by Jane McGonigal: https://ift.tt/3kdDUs8 Blood, Sweat, & Pixels by Jason Schrier: https://ift.tt/2DmOUTt P.S- I've got at least 1 coding music video left in me.

Tuesday, August 4, 2020

Physics in 4 Dimensions…How?


❤️ Check out Weights & Biases and sign up for a free demo here: https://ift.tt/2YuG7Yf ❤️ Their mentioned post is available here: https://ift.tt/2XsLnty 📝 The paper "N-Dimensional Rigid Body Dynamics" is available here: https://ift.tt/3fxoxaq Check out these two 4D games here: 4D Toys: https://4dtoys.com/ Miegakure (still in the works): https://miegakure.com/ 🙏 We would like to thank our generous Patreon supporters who make Two Minute Papers possible: Aleksandr Mashrabov, Alex Haro, Alex Paden, Andrew Melnychuk, Angelos Evripiotis, Benji Rabhan, Bruno Mikuš, Bryan Learn, Christian Ahlin, Daniel Hasegan, Eric Haddad, Eric Martel, Gordon Child, Javier Bustamante, Lorin Atzberger, Lukas Biewald, Michael Albrecht, Nikhil Velpanur, Owen Campbell-Moore, Owen Skarpness, Ramsey Elbasheer, Robin Graham, Steef, Sunil Kim, Taras Bobrovytsky, Thomas Krcmar, Torsten Reil, Tybie Fitzhugh. More info if you would like to appear here: https://ift.tt/2icTBUb Károly Zsolnai-Fehér's links: Instagram: https://ift.tt/2KBCNkT Twitter: https://twitter.com/twominutepapers Web: https://ift.tt/1NwkG9m

Contrastive Learning for Unpaired Image-to-Image Translation


Contrastive learning has provided a huge boost in self-supervised representation learning. This paper shows that it can even improve other self-supervised learning algorithms, like generative models in the GAN framework. I am really excited about how image-to-image translation networks can be used for domain similarity analysis. Thanks for watching! Please Subscribe! Paper Links: Contrastive Unpaired Translation (contains code, video, website, and paper link): https://ift.tt/39P6UBD Contrastive Predictive Coding: https://ift.tt/2SUqOTJ SinGAN: https://ift.tt/2WnPSG5 EfficientDet: https://ift.tt/2xQJ7m7 Feature Pyramid Networks for Object Detection: https://ift.tt/2sI9Z08 Don't Stop Pretraining: https://ift.tt/2WEdjdt CycleGAN: https://ift.tt/2opD3rk SimCLR: https://ift.tt/31TZZTM MoCo: https://ift.tt/2pggQ4i On the Measure of Intelligence: https://ift.tt/36TesBG
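The objective at the heart of these methods can be sketched as a generic InfoNCE loss. This is the standard formulation on plain vectors, not the paper's exact patchwise multilayer version:

```python
import math

def info_nce(query, positive, negatives, temperature=0.07):
    """InfoNCE contrastive loss: pull `query` toward `positive` and away
    from each vector in `negatives`. Equivalent to cross-entropy where the
    positive is the correct class among (1 + len(negatives)) candidates."""
    def cos(a, b):
        dot = sum(x * y for x, y in zip(a, b))
        na = math.sqrt(sum(x * x for x in a))
        nb = math.sqrt(sum(x * x for x in b))
        return dot / (na * nb)

    # Similarity logits, positive first, scaled by temperature.
    logits = [cos(query, positive) / temperature]
    logits += [cos(query, n) / temperature for n in negatives]

    # Stable log-sum-exp, then negative log-likelihood of the positive.
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(l - m) for l in logits))
    return -(logits[0] - log_sum)
```

In the contrastive unpaired translation setting, the "query" would be a feature of an output patch and the "positive" the feature of the corresponding input patch, with other patches of the same image serving as negatives.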

PCGRL: Procedural Content Generation via Reinforcement Learning (Paper Explained)


Deep RL is usually used to solve games, but this paper turns the process on its head and applies RL to game level creation. Compared to traditional approaches, it frames level design as a sequential decision-making process and ends up with a fast and diverse level generator. OUTLINE: 0:00 - Intro & Overview 1:30 - Level Design via Reinforcement Learning 3:00 - Reinforcement Learning 4:45 - Observation Space 5:40 - Action Space 15:40 - Change Percentage Limit 20:50 - Quantitative Results 22:10 - Conclusion & Outlook Paper: https://ift.tt/2GrzqLJ Code: https://ift.tt/2S1zmYu Abstract: We investigate how reinforcement learning can be used to train level-designing agents. This represents a new approach to procedural content generation in games, where level design is framed as a game, and the content generator itself is learned. By seeing the design problem as a sequential task, we can use reinforcement learning to learn how to take the next action so that the expected final level quality is maximized. This approach can be used when few or no examples exist to train from, and the trained generator is very fast. We investigate three different ways of transforming two-dimensional level design problems into Markov decision processes and apply these to three game environments.
Authors: Ahmed Khalifa, Philip Bontrager, Sam Earle, Julian Togelius Links: YouTube: https://www.youtube.com/c/yannickilcher Twitter: https://twitter.com/ykilcher Discord: https://ift.tt/3dJpBrR BitChute: https://ift.tt/38iX6OV Minds: https://ift.tt/37igBpB Parler: https://ift.tt/38tQU7C LinkedIn: https://ift.tt/2Zo6XRA If you want to support me, the best thing to do is to share out the content :) If you want to support me financially (completely optional and voluntary, but a lot of people have asked for this): SubscribeStar: https://ift.tt/2DuKOZ3 Patreon: https://ift.tt/390ewRH Bitcoin (BTC): bc1q49lsw3q325tr58ygf8sudx2dqfguclvngvy2cq Ethereum (ETH): 0x7ad3513E3B8f66799f507Aa7874b1B0eBC7F85e2 Litecoin (LTC): LQW2TRyKYetVC8WjFkhpPhtpbDM4Vw7r9m Monero (XMR): 4ACL8AGrEo5hAir8A9CeVrW8pEauWvnp1WnSDZxW7tziCDLhZAGsgzhRQABDnFy8yuM9fWJDviJPHKRjV4FWt19CJZN9D4n
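To make the "level design as an MDP" framing concrete, here is a toy environment in the spirit of one of the paper's representations, where the agent is shown one tile location per step and picks a tile to place. The class name, observation format, and quality function are all illustrative assumptions, not the released code:

```python
import random

class NarrowLevelEnv:
    """Toy sketch: edit one tile per step; reward is the change in a
    hand-coded level-quality score (here, hitting a target wall count)."""

    TILES = ["empty", "wall"]

    def __init__(self, size=4, target_walls=4):
        self.size = size
        self.target_walls = target_walls
        self.reset()

    def reset(self):
        # Start from a random level, cursor at the first tile.
        self.level = [[random.choice(self.TILES) for _ in range(self.size)]
                      for _ in range(self.size)]
        self.cursor = 0
        return self._observe()

    def _observe(self):
        # Observation: the full map plus the current edit location.
        return self.level, divmod(self.cursor, self.size)

    def _quality(self):
        # Quality peaks (at 0) when the wall count hits the target.
        walls = sum(row.count("wall") for row in self.level)
        return -abs(walls - self.target_walls)

    def step(self, tile):
        # Place `tile` at the cursor; reward is the quality improvement.
        before = self._quality()
        r, c = divmod(self.cursor, self.size)
        self.level[r][c] = tile
        self.cursor += 1
        done = self.cursor >= self.size * self.size
        return self._observe(), self._quality() - before, done
```

A trained policy would map observations to tile choices; here even a scripted policy shows the loop: the episode ends after one pass over the map, and the cumulative reward measures how much the design improved.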

Monday, August 3, 2020

Data Augmentation using Pre-trained Transformer Models


This video explores sampling from pre-trained transformers to augment small, labeled datasets. This study compares the results of fine-tuning BERT, GPT-2, and BART for generating new data. Each technique has a distinct way of making sure the augmented data preserves the original class label such as positive or negative sentiment or a respective topic in a topic classification task. I think this is a really exciting use of generative models, showing that they are more useful than just being the first step of representation learning! Thanks for watching and please subscribe! Paper Links: Data Augmentation using Pre-trained Transformers: https://ift.tt/31gUSwX Next Word Prediction Demo: https://ift.tt/3gjQUue Conditional BERT for Contextual Augmentation: https://ift.tt/39Oz9k4 BART: https://ift.tt/2oNKlKK T5: https://ift.tt/2PpvzVe GPT-3: https://ift.tt/3et6QZt BERT: https://ift.tt/2pMXn84 GPT: https://ift.tt/2HeACni ImageGPT (images used to describe AE vs. AR): https://ift.tt/2YKKAEf Classification Accuracy Score: https://ift.tt/2U1fmaU BigGAN: https://ift.tt/328NqnC Guide to using BERT (will help understand how label embedding would work): https://ift.tt/2XR2Jzh Conditional GANs: https://ift.tt/2rPVlDw SPADE (conditional batch norm example, albeit kind of an intense example): https://ift.tt/2CsbLsZ Pre-training via Paraphrasing: https://ift.tt/2PjV9tF PEGASUS: https://ift.tt/3hhCFGM Don't Stop Pretraining: https://ift.tt/2WEdjdt Chapters 0:00 Introduction 1:16 Labeling Data is difficult! 2:15 Data Augmentation in NLP 3:18 Contextual Augmentation 4:12 Conditional BERT 6:58 BERT vs. GPT-2 vs. BART 8:53 Data Augmentation Approach 10:00 How Data is Generated 11:08 Class Label in Vocabulary? 
13:07 Experiment Details 13:38 Results Extrinsic Evaluation 14:18 Classification Accuray Score used for GANs, VAEs in images 14:45 Intrinsic Analysis 16:25 Connection to Don’t Stop Pretraining 17:17 Connection with MARGE, PEGASUS, ELECTRA 18:27 Connection with Pattern-Exploiting Training
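The label-conditioning recipe can be illustrated with a toy language model. A bigram counter stands in for GPT-2 here; what carries over is the trick of turning each example into "label w1 w2 ... <eos>", fitting the model on those sequences, and then sampling new examples for a class by starting generation from its label token:

```python
import random
from collections import defaultdict

def fit_bigram(labeled_texts):
    # Count next-token occurrences, with the class label as the first token.
    counts = defaultdict(list)
    for label, text in labeled_texts:
        tokens = [label] + text.split() + ["<eos>"]
        for prev, nxt in zip(tokens, tokens[1:]):
            counts[prev].append(nxt)
    return counts

def sample(counts, label, max_len=10):
    # Generate a synthetic example conditioned on the class label.
    out, cur = [], label
    for _ in range(max_len):
        if not counts[cur]:
            break
        cur = random.choice(counts[cur])
        if cur == "<eos>":
            break
        out.append(cur)
    return " ".join(out)
```

With a real pre-trained model the same idea applies: fine-tune on label-prefixed sequences, then prompt with the label to get augmented data that (hopefully) preserves the class, which is exactly what the intrinsic analysis in the paper checks.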

Sunday, August 2, 2020

Big Bird: Transformers for Longer Sequences (Paper Explained)


#ai #nlp #attention The quadratic resource requirements of the attention mechanism are the main roadblock in scaling up transformers to long sequences. This paper replaces the full quadratic attention mechanism by a combination of random attention, window attention, and global attention. Not only does this allow the processing of longer sequences, translating to state-of-the-art experimental results, but the paper also shows that BigBird comes with theoretical guarantees of universal approximation and Turing completeness. OUTLINE: 0:00 - Intro & Overview 1:50 - Quadratic Memory in Full Attention 4:55 - Architecture Overview 6:35 - Random Attention 10:10 - Window Attention 13:45 - Global Attention 15:40 - Architecture Summary 17:10 - Theoretical Result 22:00 - Experimental Parameters 25:35 - Structured Block Computations 29:30 - Recap 31:50 - Experimental Results 34:05 - Conclusion Paper: https://ift.tt/32Za4BA My Video on Attention: https://youtu.be/iDulhoQ2pro My Video on BERT: https://youtu.be/-9evrZnBorM My Video on Longformer: https://youtu.be/_8KNb5iqblE ... and its memory requirements: https://youtu.be/gJR28onlqzs Abstract: Transformers-based models, such as BERT, have been one of the most successful deep learning models for NLP. Unfortunately, one of their core limitations is the quadratic dependency (mainly in terms of memory) on the sequence length due to their full attention mechanism. To remedy this, we propose, BigBird, a sparse attention mechanism that reduces this quadratic dependency to linear. We show that BigBird is a universal approximator of sequence functions and is Turing complete, thereby preserving these properties of the quadratic, full attention model. Along the way, our theoretical analysis reveals some of the benefits of having O(1) global tokens (such as CLS), that attend to the entire sequence as part of the sparse attention mechanism. 
The proposed sparse attention can handle sequences of length up to 8x of what was previously possible using similar hardware. As a consequence of the capability to handle longer context, BigBird drastically improves performance on various NLP tasks such as question answering and summarization. We also propose novel applications to genomics data. Authors: Manzil Zaheer, Guru Guruganesh, Avinava Dubey, Joshua Ainslie, Chris Alberti, Santiago Ontanon, Philip Pham, Anirudh Ravula, Qifan Wang, Li Yang, Amr Ahmed Links: YouTube: https://www.youtube.com/c/yannickilcher Twitter: https://twitter.com/ykilcher Discord: https://ift.tt/3dJpBrR BitChute: https://ift.tt/38iX6OV Minds: https://ift.tt/37igBpB Parler: https://ift.tt/38tQU7C LinkedIn: https://ift.tt/2Zo6XRA If you want to support me, the best thing to do is to share out the content :) If you want to support me financially (completely optional and voluntary, but a lot of people have asked for this): SubscribeStar: https://ift.tt/2DuKOZ3 Patreon: https://ift.tt/390ewRH Bitcoin (BTC): bc1q49lsw3q325tr58ygf8sudx2dqfguclvngvy2cq Ethereum (ETH): 0x7ad3513E3B8f66799f507Aa7874b1B0eBC7F85e2 Litecoin (LTC): LQW2TRyKYetVC8WjFkhpPhtpbDM4Vw7r9m Monero (XMR): 4ACL8AGrEo5hAir8A9CeVrW8pEauWvnp1WnSDZxW7tziCDLhZAGsgzhRQABDnFy8yuM9fWJDviJPHKRjV4FWt19CJZN9D4n
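The three attention patterns can be combined into a boolean mask like this. It is a simplified sketch: the paper actually works with blocked computations and different default sizes, and the function name and parameters are mine:

```python
import random

def bigbird_mask(seq_len, window=1, n_global=1, n_random=1, seed=0):
    """mask[i][j] is True if token i may attend to token j, combining:
    - window: neighbors within `window` positions,
    - global: the first `n_global` tokens attend to and are attended by all,
    - random: `n_random` extra random positions per token."""
    rng = random.Random(seed)
    mask = [[False] * seq_len for _ in range(seq_len)]
    for i in range(seq_len):
        for j in range(max(0, i - window), min(seq_len, i + window + 1)):
            mask[i][j] = True                      # sliding window
        for g in range(n_global):
            mask[i][g] = mask[g][i] = True         # global tokens
        for j in rng.sample(range(seq_len), n_random):
            mask[i][j] = True                      # random attention
    return mask
```

With fixed window, global, and random sizes, the number of attended pairs grows linearly in the sequence length instead of quadratically, which is where the linear memory claim comes from.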

Saturday, August 1, 2020

These AI-Driven Characters Dribble Like Mad! 🏀


❤️ Check out Weights & Biases and sign up for a free demo here: https://ift.tt/2YuG7Yf ❤️ Their mentioned post is available here: https://ift.tt/3k4gZPO 📝 The paper "Local Motion Phases for Learning Multi-Contact Character Movements" is available here: https://ift.tt/2GgCLuP 🙏 We would like to thank our generous Patreon supporters who make Two Minute Papers possible: Aleksandr Mashrabov, Alex Haro, Alex Paden, Andrew Melnychuk, Angelos Evripiotis, Benji Rabhan, Bruno Mikuš, Bryan Learn, Christian Ahlin, Daniel Hasegan, Eric Haddad, Eric Martel, Gordon Child, Javier Bustamante, Lorin Atzberger, Lukas Biewald, Michael Albrecht, Nikhil Velpanur, Owen Campbell-Moore, Owen Skarpness, Ramsey Elbasheer, Robin Graham, Steef, Sunil Kim, Taras Bobrovytsky, Thomas Krcmar, Torsten Reil, Tybie Fitzhugh. More info if you would like to appear here: https://ift.tt/2icTBUb Károly Zsolnai-Fehér's links: Instagram: https://ift.tt/2KBCNkT Twitter: https://twitter.com/twominutepapers Web: https://ift.tt/1NwkG9m

The Hardest Part of Gamedev


You've played many games in your life, but have you ever wondered what is the hardest part about game development? I have the answers for you! This video is sponsored by RAID: Shadow Legends! DOWNLOAD RAID PC & MAC: https://bit.ly/jraidpc DOWNLOAD RAID iOS: https://bit.ly/jraidios DOWNLOAD RAID Android: https://bit.ly/jraidandr Thanks to Wakuya for helping with the PSA: https://www.youtube.com/user/sswakl25 Watch the full livestream: https://youtu.be/M1dsjIqZMoA SUBSCRIBE FOR MORE: http://jabrils.com/yt WISHLIST MY VIDEO GAME: https://ift.tt/33NgHFz SUPPORT ON PATREON: https://ift.tt/2pZACkg JOIN DISCORD: https://ift.tt/2QkDa9O Please follow me on social networks: twitter: https://twitter.com/jabrils_ instagram: https://ift.tt/2QNVYvI REMEMBER TO ALWAYS FEED YOUR CURIOSITY