A resource of free, step-by-step video how-to guides to get you started with machine learning.
Friday, May 3, 2024
Microsoft Fabric Machine Learning Tutorial - Part 2 - Data Validation with Great Expectations
In part 2 of this course, Barry Smart, Director of Data and AI, walks through a demo showing how you can use Microsoft Fabric to set up a "data contract" that establishes minimum data quality standards for data processed by a data pipeline. He deliberately passes bad data into the pipeline to show how the process can be set up to "fail elegantly" by dropping the bad rows and continuing with only the good rows. Finally, he uses the new Teams pipeline activity in Fabric to send a message to the data stewards responsible for the data set. The message informs them that validation has failed and itemises, in its body, the specific rows that failed and the validation errors that were generated.

The demo uses the popular Titanic data set to show features of the data engineering experience in Fabric, including Notebooks, Pipelines and the Lakehouse. It uses the popular Great Expectations Python package to establish the data contract, and Microsoft's mssparkutils Python package to pass the exit value of the Notebook back to the Pipeline that triggered it.

Barry begins the video by explaining the architecture adopted in the demo, including Medallion Architecture and DataOps practices. He explains how these patterns have been applied to create a data product that provides Diagnostic Analytics of the Titanic data set. This forms part of an end-to-end demo of Microsoft Fabric that we will be providing as a series of videos over the coming weeks.
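To make the "fail elegantly" idea concrete, here is a minimal sketch in plain Python. It does not use Great Expectations and is not the notebook from the video; it only illustrates the pattern of checking each row against a contract, keeping the good rows and recording why the bad rows failed. The `Expectation` class, the `validate` function, the column names and the rules are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Expectation:
    """One rule in a hypothetical data contract (not a Great Expectations class)."""
    column: str
    description: str
    check: Callable[[object], bool]


# Illustrative rules only; the real contract in the video may differ.
CONTRACT = [
    Expectation("PassengerId", "must not be null", lambda v: v is not None),
    Expectation("Age", "must be between 0 and 120 when present",
                lambda v: v is None or 0 <= v <= 120),
    Expectation("Pclass", "must be 1, 2 or 3", lambda v: v in {1, 2, 3}),
]


def validate(rows, contract):
    """Split rows into those that satisfy every rule and those that do not."""
    good, failures = [], []
    for index, row in enumerate(rows):
        errors = [
            f"{rule.column} {rule.description}"
            for rule in contract
            if not rule.check(row.get(rule.column))
        ]
        if errors:
            # Record the row position and every rule it broke.
            failures.append({"row": index, "errors": errors})
        else:
            good.append(row)
    return good, failures


# Made-up sample rows, two of which deliberately break the contract.
rows = [
    {"PassengerId": 1, "Age": 22, "Pclass": 3},
    {"PassengerId": 2, "Age": 250, "Pclass": 1},
    {"PassengerId": None, "Age": 30, "Pclass": 5},
]

good_rows, failed_rows = validate(rows, CONTRACT)
```

Only `good_rows` would continue to the next stage, while `failed_rows` carries enough detail to tell the data stewards which rows were dropped and why.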
00:12 Overview of the architecture
00:36 The focus for this video is processing data to Silver
00:55 The DataOps principles of data validation and alerting will be applied
02:19 Tour of the artefacts in the Microsoft Fabric workspace
02:56 Open the "Validation Location" notebook and view the contents
03:30 Inspect the reference data that is going to be validated by the notebook
05:14 Overview of the key stages in the notebook
05:39 Set up the notebook, using %run to establish utility functions
06:21 Set up a "data contract" using the Great Expectations package
07:45 Load the data from the Bronze area of the lake
08:18 Validate the data by applying the "data contract" to it
08:36 Remove any bad records to create a clean data set
09:04 Write the clean data to the lakehouse in Delta format
09:52 Exit the notebook using mssparkutils to pass back validation results
10:53 Lineage is used to discover the pipeline that triggers it
11:01 Exploring the "Process to Silver" pipeline
11:35 An "If Condition" is configured to process the notebook exit value
11:56 A Teams pipeline activity is set up to notify users
12:51 Title and body of Teams message are populated with dynamic information
13:08 Word of caution about exposing sensitive information
13:28 What's in the next episode?

#microsoftfabric #dataengineering #greatexpectations #course #tutorial
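The later stages in the video (exit value, "If Condition", Teams message) can be sketched in plain Python as follows. This is an assumption-laden illustration, not the code from the video: the JSON field names and both function names are hypothetical. In Fabric, the branching is done by the pipeline's "If Condition" activity rather than by Python; the second function only mimics that logic so the flow can be read in one place.

```python
import json


def build_exit_value(rows_received, failures):
    """Serialise validation results into a single string for the pipeline."""
    return json.dumps({
        "validation_passed": not failures,
        "rows_received": rows_received,
        "rows_dropped": len(failures),
        "failures": failures,
    })


def build_teams_message(exit_value):
    """Return (title, body) when validation failed, otherwise None."""
    result = json.loads(exit_value)
    if result["validation_passed"]:
        return None
    title = (
        f"Data validation failed: {result['rows_dropped']} "
        f"of {result['rows_received']} rows dropped"
    )
    body_lines = []
    for failure in result["failures"]:
        reasons = "; ".join(failure["errors"])
        body_lines.append(f"Row {failure['row']}: {reasons}")
    return title, "\n".join(body_lines)


# Made-up failure record for illustration.
sample_failures = [{"row": 1, "errors": ["Age out of range"]}]
exit_value = build_exit_value(3, sample_failures)

# Inside a Fabric notebook, the string would be handed back with:
#   mssparkutils.notebook.exit(exit_value)
message = build_teams_message(exit_value)
```

Because the message body lists individual rows and their errors, the word of caution at 13:08 applies: review what ends up in the message before sending anything that could contain sensitive values.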