T05 EDW Tutorial: Centralized Data Governance of a Distributed Data Landscape

Time: Monday, March 27th, 11:30am – 2:30pm Pacific Time (PDT) / 2:30pm – 5:30pm Eastern Time (EDT)

This Enterprise Data World (EDW) Digital Tutorial can be purchased on the official conference website, but the live tutorial will occur within the DATAVERSITY Training Center. You will receive your unique login credentials and confirmation once you've registered for this paid Tutorial at: https://edw2023digital.dataversity.net/registration-welcome.cfm

Registration for this Tutorial will also give you access to all active tutorials within the same time slot.

All paid Tutorial registrations include access to the free two-day program on Tuesday and Wednesday. Please note that you will receive separate login instructions for the free program, as it will take place on a different platform than the Tutorials. See your registration confirmation for details.

The EDW Digital Conference site can be found at: https://edw2023digital.dataversity.net/index.cfm

Tutorial Description

Many companies today face the complexity of governing many types of data scattered across on-premises systems, multiple clouds, SaaS applications, and the edge. Somehow, they have to know what data they have, where it is, which data is deemed personally identifiable information, and which data is considered confidential versus internal use only versus public. They also have to manage data privacy, data access security, data loss prevention, data sharing, data usage, data retention, and data quality across the entire environment.

Also, it is not just structured data in files and databases that needs to be governed. What about office documents on laptops and file shares, SharePoint sites, email, web chat, and meetings? Some subsets of these may be considered confidential. In an era where data protection is critical, and data privacy may require compliance with multiple laws in different regions, countries, and states, the challenge is now to be able to govern data across a distributed landscape.

This session examines this problem, defines the requirements for dealing with it, and looks at what is needed from an organizational, process, policy, and technology perspective to solve it.

  • Data governance redefined - Data Quality, data privacy, data access security, data loss prevention, data sharing, data usage, and data retention
  • The ever-increasing distributed data landscape - multiple types of data stored from on-premises to the edge
  • The challenge of governing data in this kind of environment
  • The need for multiple Data Governance classification schemes in order to govern data
  • Implementing governance classification across office document stores and structured data stores in a distributed data landscape
  • The role of the data catalog
  • Training classifiers for automatic classification of structured and unstructured data
  • Centrally defining policies and rules to govern distributed data
  • Enforcing governance policies and rules across a distributed data landscape using governance workflows
  • Implementation challenges

Speaker: Mike Ferguson

Mike Ferguson is Managing Director of Intelligent Business Strategies. An independent IT industry analyst, he specializes in analytics, Data Management, big data, and enterprise architecture. With over 40 years of experience, Mike has consulted for dozens of companies on BI/Analytics, data strategy, technology selection, enterprise architecture, and Data Management. Mike is also conference chairman of Big Data LDN, the fastest-growing data and analytics conference in Europe. He has spoken at events all over the world and written numerous articles. He was formerly a principal and co-founder of Codd and Date Europe (Codd and Date being the inventors of the Relational Model) and a Chief Architect at Teradata. He teaches classes in Data Warehouse Modernization, Big Data Architecture & Technology, Centralized Data Governance of a Distributed Data Landscape, Practical Guidelines for Implementing a Data Mesh, Embedded Analytics, Intelligent Apps & AI Automation, Migrating your Data Warehouse to the Cloud, Modern Data Architecture and Data Virtualization.

What's Included

Access your courses anytime, anywhere, with a computer, tablet or smartphone

Videos, quizzes and interactive content designed for a proven learning experience

Unlimited access. Take your courses at your own time and pace