
Today, enterprises have more tools to create and share information than ever before. As emails, text documents, sensor data, and rich media consume more of our data infrastructure each year, the environmental impact of this information grows with it. For example, the average carbon footprint of an email is 4g CO2e; with an attachment, it is 50g. Given this data growth, effective data management must be part of any organization’s environmental, social, and governance (ESG) goals.
In this talk, presenters will share how a large supply chain organization is employing a data-driven green strategy, rooted in real carbon footprint statistics, to make its information management more sustainable. Enterprise Knowledge will present how a machine learning framework is being applied to automatically identify duplicates and near-duplicates across content repositories at scale. The presentation will also address how these results are used to generate aggregate statistics and carbon footprint estimates that support a cultural shift across the organization towards greener information management.
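To make the idea concrete, here is a minimal sketch of duplicate and near-duplicate detection followed by a rough carbon estimate. It is not the framework presented in the talk: it swaps in plain word shingling with Jaccard similarity, and every name in it (`find_redundant_pairs`, `redundant_files`, `estimate_grams`, the 0.6 threshold, the sample file names) is hypothetical. The 50 g figure is the per-email-with-attachment number quoted above, applied here under the loud assumption that each redundant copy was shared once as an attachment.

```python
import hashlib
import re
from itertools import combinations


def normalize(text: str) -> str:
    """Lowercase the text and keep only alphanumeric tokens."""
    return " ".join(re.findall(r"[a-z0-9]+", text.lower()))


def shingles(text: str, k: int = 3) -> set:
    """Return the set of k-word shingles of the normalized text."""
    words = normalize(text).split()
    if len(words) < k:
        return {" ".join(words)} if words else set()
    return {" ".join(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard(a: set, b: set) -> float:
    """Jaccard similarity: size of intersection over size of union."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def find_redundant_pairs(docs: dict, threshold: float = 0.6) -> list:
    """Compare every pair of documents (quadratic; fine for a demo only).

    Pairs with identical normalized text are labeled "duplicate"; pairs
    whose shingle sets reach the (assumed) threshold are "near-duplicate".
    """
    hashes = {n: hashlib.sha256(normalize(t).encode()).hexdigest()
              for n, t in docs.items()}
    sets = {n: shingles(t) for n, t in docs.items()}
    results = []
    for a, b in combinations(sorted(docs), 2):
        if hashes[a] == hashes[b]:
            results.append((a, b, "duplicate", 1.0))
        else:
            score = jaccard(sets[a], sets[b])
            if score >= threshold:
                results.append((a, b, "near-duplicate", score))
    return results


def redundant_files(pairs: list) -> set:
    """Simplification: flag the later-sorted file of each matched pair."""
    return {b for _, b, _, _ in pairs}


def estimate_grams(copies: int, grams_per_copy: float = 50.0) -> float:
    """Illustrative estimate: assumes one emailed attachment per copy."""
    return copies * grams_per_copy


# Hypothetical sample content, for illustration only.
docs = {
    "a.txt": "Quarterly shipping report for the northern region warehouse",
    "b.txt": "quarterly shipping report for the northern region warehouse.",
    "c.txt": "Quarterly shipping report for the northern region warehouse network",
    "d.txt": "Minutes from the annual safety committee meeting",
}

pairs = find_redundant_pairs(docs)
for a, b, label, score in pairs:
    print(f"{a} ~ {b}: {label} ({score:.2f})")
flagged = redundant_files(pairs)
print(f"{len(flagged)} redundant files, about {estimate_grams(len(flagged)):.0f} g CO2e")
```

Comparing every pair does not scale to large repositories; at that size, approximate techniques such as MinHash with locality-sensitive hashing are a common substitute, and the carbon factor would need to come from the organization's own measurements.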
Join us in this session to find out:
What is a Green Information Management (IM) Strategy, and why should you have one?
How can artificial intelligence (AI) and machine learning (ML) support Green IM Strategy through content deduplication?
How can my organization use insights into our data to influence user behavior with respect to information management?
How can I reap additional benefits of content reduction that go beyond ESG goals?
Speaker: Urmi Majumder

Urmi Majumder is a Principal Consultant in the Advanced Data and Enterprise AI practice at Enterprise Knowledge, where she leads system architecture, design, and implementation of a broad range of enterprise solutions. She has 15 years of experience leading the development of technical solutions in support of a wide variety of federal and commercial clients by integrating open-source, SaaS, and COTS tools, as well as establishing the connection between these tools and their business users. Her diverse portfolio includes the design and development of data-centric solutions including content management systems, records management systems, knowledge portals, search applications, semantic applications, data catalogs, and AI/ML applications, both in the context of new system development and data modernization efforts.
Speaker: Fernando Aguilar

Fernando Aguilar is a Data Science Consultant in the data and information management division at Enterprise Knowledge. He has contributed directly to the optimization of a leading data catalog solution by improving the metadata profile queries used to onboard client data resources over multiple client engagements. Most recently, he has been comprehensively involved in the development of a national-scale identity graph, where he addressed data duplication and data quality issues by enabling a single self-service enterprise data catalog. His role extends to the analysis, improvement, and documentation of the identity resolution process. His contributions to several other projects include building categorical machine-learning models for graph recommender systems and multi-sourced ETL pipelines to populate dashboard visualizations.