Tuesday, August 2, 2022

TensorFlow Serving performance optimization


Wei Wei, Developer Advocate at Google, shares general principles and best practices for improving TensorFlow Serving performance. He discusses how to reduce latency across the API surfaces, how to configure batching, and other parameters you can tune.

Resources:
- TensorFlow Serving performance guide → https://goo.gle/3zW168E
- Profile Inference Requests with TensorBoard → https://goo.gle/3zWjluJ
- TensorFlow Serving batching configuration → https://goo.gle/3xT2SVz
- TensorFlow Serving SavedModel Warmup → https://goo.gle/3ygfIhT
- XLA homepage → https://goo.gle/3zY01gw
- How to make TensorFlow models run faster on GPUs (with XLA) → https://goo.gle/3OAB8LR
- How OpenX Trains and Serves for a Million Queries per Second in under 15 Milliseconds → https://goo.gle/3NdAOSd
- ResNet complete example → https://goo.gle/3zU1PHs
- Deploying Production ML Models with TensorFlow Serving playlist →
- Subscribe to TensorFlow → https://goo.gle/TensorFlow

#TensorFlow
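For the batching parameters mentioned above, TensorFlow Serving accepts a protobuf text file passed with the `--enable_batching` and `--batching_parameters_file` flags (see the batching configuration link for the full set of options). A minimal sketch — the values here are illustrative starting points, not tuned recommendations; the right settings depend on your model, hardware, and traffic:

```proto
# batching_parameters.txt — illustrative values only; tune per the
# TensorFlow Serving batching configuration guide linked above.

# Largest batch the server will form before running the model.
max_batch_size { value: 128 }

# How long (in microseconds) to wait for more requests before
# executing a partially full batch; trades latency for throughput.
batch_timeout_micros { value: 1000 }

# Number of threads that process batches in parallel.
num_batch_threads { value: 8 }

# Requests queued beyond this are rejected, bounding tail latency.
max_enqueued_batches { value: 100 }
```

You would then start the model server with something like `tensorflow_model_server --enable_batching --batching_parameters_file=batching_parameters.txt ...` and profile the result (e.g. with TensorBoard, as covered in the resources) before adjusting further.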
