EDW Fall 23: Data Quality Management for Big Data Processing - Operational Excellence

**This is subscription-only content. It is NOT purchasable as a separate product.**

FINRA plays a critical role in ensuring the integrity of the financial markets by writing and enforcing rules and by examining firms for compliance with those rules. It does this to protect investors and to promote confidence in the US markets. To accomplish this, FINRA oversees more than 624,000 brokers and surveils 99% of the equities markets and 70% of the options markets for manipulation, fraud, abuse, and insider trading. This requires processing up to 600+ market events per day, using 500+ petabytes of storage.

FINRA has established a set of Data Quality measures and checks that govern how identified data issues are categorized, to determine whether data needs to be corrected, resubmitted, and reprocessed. This provides confidence in the data used in FINRA’s regulatory program, speeds resolution of surveillance output, and eliminates the costly process of treating all data issues the same way. We at FINRA have made great strides in this area and would like to share our experiences with you!

  • Who is FINRA?
  • Concepts of Data Democratization
    • Data Literacy, Self Service Analytics, Visualizations, Governance, Security
  • Big Data processing
    • Scale/volume under volatile market conditions
    • Complexity of data structures and workloads
    • Challenges with quality and processing
  • What is data quality, and why does it matter for Big Data workloads?
    • Impacts of poor data quality
    • Importance of data quality
  • Ensure Data Quality:
    • Quality is not about looking for needles in haystacks; it’s knowing what besides needles is hidden in there and how to find it
    • Ensuring confidence in data with a proactive approach to catch data quality issues
  • Operational Excellence:
    • Design a data infrastructure: what’s a data lake and why is it important
    • FINRA’s journey to build automated and consolidated data quality systems
    • Advanced data quality checks for large complex workloads in Big Data
      • Using Machine Learning
      • Rule based Data Quality checks
  • Observability of Workloads
    • Live Operational Dashboards
    • Alerting & Trends
    • Tracking & Reporting

Speaker: Sumalatha Bachu

Senior Director, Technology (Development Services)
FINRA

Sumalatha Bachu (Senior Director at FINRA) is an Operations Manager bringing over 22 years of diversified experience in Big Data Operations & Monitoring. She currently leads a multi-disciplinary team of professionals who run market regulation Batch/Data operations for FINRA. This includes managing petabyte-scale data and complex workloads that anchor FINRA’s Big Data architecture in the Cloud.

Sumalatha has a master’s degree in Information Technology from the University of Maryland. She lives with her husband and two kids in Maryland. She is a passionate, motivated coach who applies technical acumen to deliver innovative solutions.

Speaker: Troy Eads

Senior Director
FINRA

Troy Eads (Senior Director at FINRA) is a Market Regulation Technology manager bringing over 22 years of experience in options and equities markets. He currently leads a team that processes the datasets used by FINRA’s market regulation surveillances in both the equities and options markets. Approximately 500 billion records are read every day to create regulatory datasets of 50 to 70 billion market events in support of FINRA’s market regulation program.

Troy lives in Indiana with his wife and is the father of four grown kids and grandfather of four energetic grandchildren.

Speaker: Abhishek Singh

Lead Engineer
FINRA

Abhishek Singh is a Data Operations lead with 18 years of diversified experience in Data Operations & Analytics. He currently leads a multi-disciplinary team of professionals who run market regulation operations, managing petabyte-scale data and complex workloads that anchor FINRA’s Big Data architecture in the Cloud.

Abhishek has a bachelor’s degree in Information Technology and experience working in product development and providing services across various industries. He lives with his wife and two kids in Virginia. He is passionate about upcoming technologies and IT modernization.
