EDW 24: Data Governance and Data Quality in an Agile Environment

**This is subscription-only content. It is NOT purchasable as a separate product.**

As we all know, data is the new fuel. Just as we should not run our machines on low-grade fuel, an organization should not consume bad data: clean data is critical. In fact, data outlives fuel. Fuel can be consumed only once, and we should not reuse spent fuel for another purpose because it can damage the machine. Data, on the other hand, can be consumed many times from different perspectives.

  • Overview of Data Governance and Data Quality
  • Data Quality Management Framework
  • Data Quality Maturity Level
  • Proactive Data Quality in an Agile Environment
  • Detection of Data Anomalies and Its Remediation

Data is at the heart of almost every modern enterprise. Information drives sales, enables customer insight, and generates growth through repeat business. It is also an essential component of good customer service: few organizations manage to offer a differentiated service without good data quality.

To understand the role that robotic process automation can play in improving data quality, it is helpful to understand some of the root causes of poor-quality data. Though numerous, these reasons often include:

  • Simple human error.
  • Inadequate training and/or poor process adherence by users, particularly where organizations have to respond in an agile manner to seasonal business patterns, and where the use of temporary staff is commonplace
  • The existence of multiple systems with potentially overlapping data and a lack of referential integrity between records across the systems
  • Business processes containing many manual steps, often within outdated or unintuitive systems designed for a different set of requirements, thus introducing numerous opportunities for human error
  • System workarounds and reuse of data fields intended for another purpose (such as a notes field being used for mobile phone numbers), often with poor data definition and formatting, and with limited consistency and adherence by users
  • Infrequently used data and lack of opportunity to maintain or update it
  • Inappropriate incentivization or performance measurements of staff activity, leading to rushed or poor-quality work
  • Equally, the lack of incentivization of operational staff to improve data quality problems, even where the problems are immediately apparent and the opportunity is present
  • Incomplete levels of integration between systems
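
One of the root causes above, missing referential integrity between overlapping systems, can be illustrated with a simple key comparison. The following is a minimal sketch only: the function name `find_orphans`, the record layout, and the `customer_id` key are hypothetical and not taken from any particular product.

```python
def find_orphans(child_records, parent_records, key):
    """Return child records whose key has no matching record in the parent system."""
    # Collect every key value known to the parent (reference) system.
    parent_keys = {record[key] for record in parent_records if key in record}
    # A child record is an orphan when its key is missing or unknown to the parent.
    return [record for record in child_records if record.get(key) not in parent_keys]


# Hypothetical data: a CRM system holds customers, a billing system holds invoices.
crm_customers = [{"customer_id": "C1"}, {"customer_id": "C2"}]
billing_invoices = [
    {"invoice": 100, "customer_id": "C1"},
    {"invoice": 101, "customer_id": "C9"},  # refers to a customer the CRM does not know
]

orphans = find_orphans(billing_invoices, crm_customers, "customer_id")
print(orphans)
```

A check like this only reports mismatches. Deciding which system holds the correct record remains a governance question.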

Preventing data problems through validation

Robotic automation can help to reduce the incidence of bad data by identifying and intercepting poor-quality data at the source, before it enters business systems.

The validation features, described in more detail below, allow for a multitude of mechanisms, including:

  • Rules-based validation of input data, checking input formats, data lengths, data types, etc.
  • Transformation of data into the correct format – e.g., translation of dates from European format dd/mm/yyyy to US format mm/dd/yyyy
  • Verifying the presence (or absence) of data
  • Verifying low-level attributes – e.g., length, character set, data checksums (such as MD5), etc.
  • Complex pattern matching and transformation according to definitions expressed using wildcards and regular expressions
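
The mechanisms listed above might look roughly like this in practice. This is a minimal sketch using only the Python standard library. The function names and the customer ID rule (two uppercase letters followed by six digits) are assumptions made for illustration and do not describe any specific automation tool.

```python
import hashlib
import re
from datetime import datetime

# Hypothetical rule: a customer ID is two uppercase letters followed by six digits.
CUSTOMER_ID_PATTERN = re.compile(r"[A-Z]{2}\d{6}")


def is_present(value):
    """Presence check: the value exists and is not blank."""
    return value is not None and str(value).strip() != ""


def matches_format(value, pattern=CUSTOMER_ID_PATTERN, max_length=8):
    """Rules-based check on data type, length, and format."""
    return (
        isinstance(value, str)
        and len(value) <= max_length
        and pattern.fullmatch(value) is not None
    )


def european_to_us_date(value):
    """Transform dd/mm/yyyy into mm/dd/yyyy; raises ValueError for invalid dates."""
    return datetime.strptime(value, "%d/%m/%Y").strftime("%m/%d/%Y")


def md5_matches(payload, expected_hex):
    """Low-level attribute check: compare the MD5 checksum of the raw bytes."""
    return hashlib.md5(payload).hexdigest() == expected_hex.lower()


print(is_present("  "))                   # False
print(matches_format("AB123456"))         # True
print(european_to_us_date("31/12/2023"))  # 12/31/2023
```

Parsing the date with `strptime` instead of swapping substrings means an impossible date such as 31/02/2023 is rejected rather than silently reformatted.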

From a workflow and operational standpoint, software robots allow operational teams to leverage the Pareto principle: Robots can clear the bulk of the workload whilst identifying and referring data exceptions to human teams. This elevates the role of the operational agents from performing mundane repetitive tasks to higher-value activities with greater job satisfaction and increased returns for the employer.
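
The division of labor described above, where robots clear the bulk of the records and refer exceptions to people, can be sketched as a simple triage step. The function `triage`, the check names, and the sample records are hypothetical and are shown only to illustrate the idea.

```python
def triage(records, checks):
    """Split records into those passing every check and those needing human review."""
    cleared, exceptions = [], []
    for record in records:
        # Record the name of every check this record fails.
        failed = [name for name, check in checks.items() if not check(record)]
        if failed:
            exceptions.append({"record": record, "failed_checks": failed})
        else:
            cleared.append(record)
    return cleared, exceptions


# Hypothetical checks; each takes a record and returns True when it passes.
checks = {
    "has_email": lambda r: bool(r.get("email")),
    "has_country": lambda r: bool(r.get("country")),
}

records = [
    {"email": "a@example.com", "country": "US"},
    {"email": "", "country": "DE"},
]

cleared, exceptions = triage(records, checks)
print(len(cleared), "cleared;", len(exceptions), "referred for review")
```

Keeping the names of the failed checks alongside each exception gives the human reviewer a starting point instead of a bare rejected record.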

Data governance is a system by which entities (organizations, functions, data, etc.) are structured, directed, and controlled for decision-making, accountability, authority, and compliance.

Speaker: Prakash Kewalramani

Prakash Kewalramani is a Data Governance Advisor and Practitioner for a Fortune 500 company, where he has successfully designed, developed, and implemented a data governance framework that includes digital transformation, data-driven culture, data quality, data privacy, data catalog, master data management, and reference data management components. Prakash has over 25 years of experience leading large teams to build and sell analytics and data-quality products across multiple industries. He has led the development of business intelligence solutions as an Information Architect at companies like Lockheed Martin, GE Capital, and HBO. Over the last seven years, he has helped Fortune 500 companies build data warehouse solutions across a range of industries, including finance, reinsurance, pharmaceutical, e-commerce, and manufacturing. Prakash is a thought leader who frequently speaks and writes about data, retail, and Asian leadership, and he holds an engineering degree from the University of Mumbai.

