Meetup #113

From Intractable to Interactable: Unleashing Sensitive Datasets with Distributed Data Science

Some of the most valuable data is also the hardest to share. Distributed data science is a new technique for overcoming this issue by leaving data ‘in place’ and sending algorithms to the data. This enables data scientists to extract value from these datasets while upholding strict privacy and security guarantees. In this talk, we briefly introduce the fundamentals of distributed data science, including federated machine learning with additional privacy measures. We then show how a new, easy-to-use platform can train models at scale on sensitive datasets. We also run through example experiments showing that, without such approaches, we simply cannot train ML models of sufficient quality.
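To make "sending algorithms to the data" concrete, here is a minimal sketch of one common federated learning scheme, federated averaging, written in plain Python. It is an illustration under stated assumptions, not the platform or the method shown in the talk: the function names, the one-parameter linear model, the learning rate, and the two toy sites are all hypothetical, and none of the additional privacy measures mentioned above are included.

```python
from typing import List, Tuple

# (x, y) pairs held privately at one site; hypothetical toy format.
Dataset = List[Tuple[float, float]]


def local_update(w: float, data: Dataset, lr: float = 0.01, epochs: int = 5) -> float:
    """Run a few gradient-descent steps for the model y = w * x on one site's data."""
    for _ in range(epochs):
        # Gradient of mean squared error with respect to w, computed locally.
        grad = sum(2 * (w * x - y) * x for x, y in data) / len(data)
        w -= lr * grad
    return w


def federated_round(w: float, sites: List[Dataset]) -> float:
    """Send w to every site, then average the returned weights by sample count."""
    total = sum(len(d) for d in sites)
    return sum(local_update(w, d) * len(d) for d in sites) / total


def train(sites: List[Dataset], rounds: int = 50) -> float:
    """Repeat federated rounds; only model weights are exchanged, never raw rows."""
    w = 0.0
    for _ in range(rounds):
        w = federated_round(w, sites)
    return w


# Assumed example data: two sites whose points lie on y = 3x and are never pooled.
site_a = [(1.0, 3.0), (2.0, 6.0)]
site_b = [(3.0, 9.0), (4.0, 12.0), (5.0, 15.0)]
w = train([site_a, site_b])
print(f"learned weight: {w:.4f}")
```

In this sketch the coordinator only ever sees each site's updated weight and its sample count. A real deployment would typically add protections such as secure aggregation or differential privacy, since model updates alone can still leak information about the underlying data.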

In this episode

Blaise Thomson

CEO/Founder, Bitfount

Blaise Thomson is the Founder and CEO of Bitfount, a federated machine learning and analytics platform. He was previously the founder and CEO of VocalIQ, which he sold to Apple in 2015. He subsequently led Apple's Cambridge, UK engineering office and held the role of Chief Architect for Siri Understanding. Blaise holds a Ph.D. in Computer Science from the University of Cambridge, where he was also a Research Fellow, and is an Honorary Fellow at the Cambridge Judge Business School.

Demetrios Brinkmann

Host

Demetrios is one of the main organizers of the MLOps community and currently resides in a small town outside Frankfurt, Germany. He is an avid traveller who taught English as a second language to see the world and learn about new cultures. Demetrios fell into the Machine Learning Operations world and has since interviewed the leading names in MLOps, Data Science, and ML. Since diving into the nitty-gritty of Machine Learning Operations, he has felt a strong calling to explore the ethical issues surrounding ML. When he is not conducting interviews, you can find him stacking stones with his daughter in the woods or playing the ukulele by the campfire.

Ben Epstein

Host

Ben was the machine learning lead for Splice Machine, leading the development of their MLOps platform and Feature Store. He is now a founding software engineer at Galileo (rungalileo.io), focused on building data discovery and data quality tooling for machine learning teams. Ben also works as an adjunct professor at Washington University in St. Louis, teaching concepts in cloud computing and big data analytics.