DGIQ-EDW26: The Evolution of AI-Driven Data Quality: From Traditional ML to Large Language Models

**This is Subscription-Only Content. It is NOT purchasable as a separate product.**

Data quality automation is undergoing a fundamental shift. Rule-based systems follow logic defined by human experts, whereas data-driven machine learning approaches learn from observations and training data to identify patterns associated with normal or fault conditions. More recently, large language models such as GPT, LLaMA, and Claude have shown considerable potential in data wrangling tasks, and even smaller fine-tuned 7B and 13B models show comparable capabilities in several data cleaning tasks.
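
As a rough illustration of the distinction drawn above, the following minimal sketch contrasts an expert-defined rule with a check whose threshold is learned from observed data. It is not taken from the session: the function names, the sample `ages` values, the fixed range, and the z-score cutoff are all hypothetical choices made for illustration.

```python
from statistics import mean, stdev

def rule_based_check(value, low=0.0, high=120.0):
    """Expert-defined rule: accept a value only inside a fixed range (hypothetical bounds)."""
    return low <= value <= high

def fit_profile(training_values):
    """Learn a simple profile (mean and sample standard deviation) from observed data."""
    return mean(training_values), stdev(training_values)

def learned_check(value, profile, max_z=3.0):
    """Data-driven check: accept a value whose z-score against the learned profile is small."""
    mu, sigma = profile
    if sigma == 0:
        return value == mu
    return abs(value - mu) / sigma <= max_z

# Hypothetical training observations for an "age" column.
ages = [34, 29, 41, 38, 30, 45, 36, 33]
profile = fit_profile(ages)

for candidate in (40, 95, -5):
    print(candidate, rule_based_check(candidate), learned_check(candidate, profile))
```

In this toy data, a value of 95 satisfies the fixed rule but is flagged by the learned check, because it lies far from what the training observations suggest is normal. Real systems would typically use richer models than a z-score, so this should be read only as a sketch of the idea.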

This session explores both traditional machine learning and emerging LLM approaches to data quality. Recent research investigates whether LLMs can effectively preprocess noisy textual data. Experimental results show improvements when using LLM-cleaned captions, though statistical tests reveal that most of these improvements are not yet statistically significant. Attendees will learn evidence-based frameworks for choosing between statistical ML and LLM-based approaches, understand their complementary strengths, and gain practical decision criteria for implementation.

Speaker: Niruta Talwekar, Data Engineer, Meta

Niruta Talwekar is a Data Engineer at Meta with experience in building large-scale data and AI systems. She holds a Master's in Computer Information Systems from Georgia State University and is an IEEE Senior Member. Her work focuses on fairness engineering, trust in AI, and ethical automation. She was nominated for the 2025 Women in Tech Global Awards and actively mentors with Rewrite the Code.

Subscription Purchase Options

Become a DATAVERSITY Insider when you subscribe and gain access to a host of special content.

What's Included


Access your courses anytime, anywhere, with a computer, tablet or smartphone

Videos, quizzes and interactive content designed for a proven learning experience

Unlimited access. Take your courses at your own time and pace