DGIQ-W23: Enabling Automated Data Quality Management with a Data Catalog

This is subscription-only content. It is not purchasable as a separate product.

Data catalogs can do much more than list an organization’s data assets and make them searchable via metadata. This presentation demonstrates one outcome of incorporating a graph-powered data catalog into an organization’s data quality process pipeline: automated data quality management, including the production of reports for business stakeholders, managers, data stewards, technical analysts, and enterprise architects alike.

This case study of a big data implementation focuses on the shared need of the technical team and business stakeholders for a standardized, continuous data quality management process during the replatforming of an entity resolution system. Participants will come away with an understanding of how the catalog addressed this need, how the solution was implemented, what benefits it provided, and what lessons were learned throughout the process.

This presentation addresses the following key points:

The utility of a data catalog as a data quality management tool
How automated data quality management fits into an organization’s larger data governance policy
How to leverage a data catalog to produce automated reports, including impact analyses and status snapshots
The benefits of this process at an organizational level for both business and technical teams
Lessons learned throughout the implementation of this automated data quality management process

Speaker: Holly Maykow

Data Management Consultant

Enterprise Knowledge LLC

Holly Maykow is a Data Management Consultant in the data and information management division at Enterprise Knowledge. She is an expert communicator with experience in both the business and technical sides of data governance and engineering engagements. She designs data strategies and pipelines and leads implementation teams for knowledge graph and data catalog solutions. Holly creates documentation for algorithm logic that defines requirements and details data processing from source to final solution, using language that is both unambiguous and accessible. Her work on other projects includes orchestrating the technical transition of source data to a semantic representation for integration into knowledge graph solutions, developing and implementing frameworks for semantic data representation and analysis, facilitating data analysis workshops, and consulting on general data strategy.

Speaker: Fernando Aguilar

Data Science Consultant

Enterprise Knowledge LLC

Fernando Aguilar is a Data Science Consultant in the data and information management division at Enterprise Knowledge. He has contributed directly to the optimization of a leading data catalog solution by improving the metadata profile queries used to onboard client data resources over multiple client engagements. Most recently, he has been comprehensively involved in the development of a national-scale identity graph, where he addressed data duplication and data quality issues by enabling a single self-service enterprise data catalog. His role extends to the analysis, improvement, and documentation of the identity resolution process. His contributions to several other projects include building categorical machine learning models for graph recommender systems and multi-sourced ETL pipelines to populate dashboard visualizations.