
CAS, a world-leading supplier of scientific information for 116 years, began providing published information and white papers on COVID in early 2020. One outcome of this work was a new appreciation of the power of Machine Learning (ML) technologies when paired with trustworthy data sources, preferably ones optimized for ML. A Proof of Concept project for an ML-optimized repository in early 2022 was a huge success and led to a production project in late 2022. This presentation/case study will show how CAS reinvented its data repositories, optimizing tens of millions of documents for Machine Learning and Knowledge Graphs while retaining trust in their quality. I will show:
Lastly, I will discuss how the new repository supports our Data Scientists and share examples of how better data yields better results for our customers.
Speaker: Eric N. Landers
Senior Manager, Content Engineering and Solutions
CAS

My CAS career began as a software developer, and I have held numerous roles in both the Technology and Chemical Analysis divisions, creating technical solutions for business problems and improving the quality and quantity of content available to our customers. In my current role, I support Data Scientists by leading the teams that establish trustworthy repositories of publicly disclosed scientific information from around the world.