DGIQ-E23: Techniques for Improving Data Quality: The Key to Machine Learning

**This is subscription-only content. It is NOT purchasable as a separate product.**

One of the fundamental challenges for machine learning (ML) teams is data quality, or more accurately the lack of it. Your ML solution is only as good as the data that you train it on, and therein lies the rub: Is your data of sufficient quality to train a trustworthy system? If not, can you improve your data so that it is? You need a collection of data quality “best practices”, but what is “best” depends on the context of the problem that you face. Which of the myriad strategies are the best ones for you?

This presentation compares over a dozen traditional and agile data quality techniques on five factors: timeliness of action, level of automation, directness, timeliness of benefit, and difficulty to implement. The data quality techniques explored include: data cleansing, automated regression testing, data guidance, synthetic training data, database refactoring, data stewards, manual regression testing, data transformation, data masking, data labeling, and more. When you understand what data quality techniques are available to you, and understand the context in which they’re applicable, you will be able to identify the collection of data quality techniques that are best for you.

Speaker: Scott W. Ambler

Consulting Methodologist, Ambysoft Inc.

Scott Ambler is a Consulting Methodologist with Ambysoft Inc., leading the evolution of the Agile Data and Agile Modeling methods. Scott helps organizations around the world to improve their way of working (WoW) around software modeling and data-oriented activities. Scott was the (co-)creator of PMI’s Disciplined Agile (DA) tool kit. Scott is an international keynote speaker and the (co-)author of 30 books.

Subscription Purchase Options

Become a DATAVERSITY Insider when you subscribe and gain access to a host of special content.

What’s Included

Access your courses anytime, anywhere, with a computer, tablet or smartphone

Videos, quizzes and interactive content designed for a proven learning experience

Unlimited access. Take your courses at your own time and pace