Thursday, December 30, 2021

A friendly introduction to distributed training (ML Tech Talks)


Google Cloud Developer Advocate Nikita Namjoshi introduces how distributed training can dramatically reduce the time it takes to train machine learning models, explains how to make use of multiple GPUs with Data Parallelism vs Model Parallelism, and explores Synchronous vs Asynchronous Data Parallelism.

Resources:
Mesh TensorFlow → https://goo.gle/3sFPrHw
Distributed Training with Keras tutorial → https://goo.gle/3FE6QEa
GCP Reduction Server blog → https://goo.gle/3EEznYB
Multi Worker Mirrored Strategy tutorial → https://goo.gle/3JkQT7Y
Parameter Server Strategy tutorial → https://goo.gle/2Zz3UrW
Distributed training on GCP demo → https://goo.gle/3pABNDE

Chapters:
0:00 - Introduction
0:17 - Agenda
0:37 - Why distributed training?
1:49 - Data Parallelism vs Model Parallelism
6:05 - Synchronous Data Parallelism
18:20 - Asynchronous Data Parallelism
23:41 - Thank you for watching

Watch more ML Tech Talks → https://goo.gle/ml-tech-talks
Subscribe to TensorFlow → https://goo.gle/TensorFlow

#TensorFlow #MachineLearning #ML
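If you want to try the synchronous approach from the 6:05 chapter before watching, tf.distribute.MirroredStrategy is TensorFlow's synchronous data parallelism strategy for multiple GPUs on one machine: each GPU holds a replica of the model, processes its own slice of every batch, and gradients are all-reduced across replicas before the weights are updated. A minimal sketch is below; the model and the random in-memory data are placeholders for illustration, not anything from the talk (see the Distributed Training with Keras tutorial above for the full walkthrough).

import tensorflow as tf

# MirroredStrategy picks up all visible GPUs by default; with no GPUs it
# falls back to a single replica on CPU, so this sketch still runs locally.
strategy = tf.distribute.MirroredStrategy()
print("Number of replicas:", strategy.num_replicas_in_sync)

# Model and optimizer must be created inside the strategy scope so their
# variables are mirrored across all devices.
with strategy.scope():
    model = tf.keras.Sequential([
        tf.keras.Input(shape=(784,)),
        tf.keras.layers.Dense(64, activation="relu"),
        tf.keras.layers.Dense(10),
    ])
    model.compile(
        optimizer="adam",
        loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
        metrics=["accuracy"],
    )

# Random tensors stand in for a real dataset. Scale the global batch size
# with the number of replicas so each GPU sees a full per-replica batch.
global_batch_size = 64 * strategy.num_replicas_in_sync
x = tf.random.uniform((1024, 784))
y = tf.random.uniform((1024,), maxval=10, dtype=tf.int32)
dataset = tf.data.Dataset.from_tensor_slices((x, y)).batch(global_batch_size)

model.fit(dataset, epochs=2)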

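For the asynchronous approach covered at 18:20, a rough sketch of Keras training with ParameterServerStrategy follows. It assumes a cluster of chief, worker, and parameter server tasks is already described by the TF_CONFIG environment variable, as in the Parameter Server Strategy tutorial linked above; the model, batch size, and random data are again placeholders.

import tensorflow as tf

cluster_resolver = tf.distribute.cluster_resolver.TFConfigClusterResolver()

if cluster_resolver.task_type in ("worker", "ps"):
    # Worker and parameter server tasks only start a server and wait for the
    # coordinator (the "chief" task) to dispatch work to them.
    server = tf.distribute.Server(
        cluster_resolver.cluster_spec(),
        job_name=cluster_resolver.task_type,
        task_index=cluster_resolver.task_id,
        protocol="grpc",
        start=True)
    server.join()

# Everything below runs on the coordinator. Workers pull variables from the
# parameter servers, compute gradients on their own batches, and push updates
# back asynchronously, without waiting for each other.
strategy = tf.distribute.experimental.ParameterServerStrategy(cluster_resolver)

with strategy.scope():
    model = tf.keras.Sequential([
        tf.keras.layers.Dense(64, activation="relu"),
        tf.keras.layers.Dense(10),
    ])
    model.compile(
        optimizer="adam",
        loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True))

def dataset_fn(input_context):
    # Each worker builds its own input pipeline; random tensors stand in for
    # a real dataset here.
    batch_size = input_context.get_per_replica_batch_size(64)
    x = tf.random.uniform((1024, 784))
    y = tf.random.uniform((1024,), maxval=10, dtype=tf.int32)
    return (tf.data.Dataset.from_tensor_slices((x, y))
            .shuffle(1024).batch(batch_size).repeat())

# steps_per_epoch is required because the dataset is created per worker from
# a callable rather than passed in directly.
model.fit(tf.keras.utils.experimental.DatasetCreator(dataset_fn),
          epochs=2, steps_per_epoch=16)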