Coffee Sessions #122

Scaling Similarity Learning at Digits

Machine learning in a product is a double-edged sword. It can make a product more useful, but it depends on assumed and strictly defined behavior from users. Hannes walks through Digits' entire machine learning pipeline: how they implemented it, what its elements are, what the learning looks like, and what the tooling looks like. He also maps out what good data hygiene looks like, not only from the machine learning perspective but also from the software engineering, design, backend engineering, and data engineering perspectives.

Take-aways

- When to invest in MLOps
- Why to invest in MLOps
- Examples around standardization
- Examples around growth
- Examples around reproducibility and repeatability
- What is similarity learning?
- Why do you use similarity learning at Digits?
- What's your MLOps infrastructure for similarity-based ML?
- How do you serve and search your ML vectors?
- When is a good moment to invest in MLOps as a growing company?
- How to scale ML operations with a small team?

In this episode

Hannes Hapke

Machine Learning Engineer, Digits Financial, Inc.

Hannes was the first ML engineer at Digits, where he built the MLOps foundation for their ML team. His interest in production machine learning ranges from building ML pipelines to scaling similarity-based ML to process millions of banking transactions daily. Prior to Digits, Hannes implemented ML solutions for a number of applications, including retail, health care, and ERP companies. He co-authored two machine learning books:

- Building Machine Learning Pipelines (O'Reilly)
- NLP in Action (Manning)

Demetrios Brinkmann

Host

Demetrios is one of the main organizers of the MLOps Community and currently resides in a small town outside Frankfurt, Germany. He is an avid traveller who taught English as a second language to see the world and learn about new cultures. Demetrios fell into the Machine Learning Operations world and has since interviewed the leading names in MLOps, data science, and ML. Since diving into the nitty-gritty of Machine Learning Operations, he has felt a strong calling to explore the ethical issues surrounding ML. When he is not conducting interviews, you can find him stacking stones with his daughter in the woods or playing the ukulele by the campfire.

Vishnu Rachakonda

Host

Vishnu Rachakonda is the operations lead for the MLOps Community and co-hosts the MLOps Coffee Sessions podcast. He is a machine learning engineer at Tesseract Health, a 4Catalyzer company focused on retinal imaging. In this role, he builds machine learning models for clinical workflow augmentation and diagnostics in on-device and cloud use cases. Since studying bioengineering at Penn, Vishnu has been actively working in the fields of computational biomedicine and MLOps. In his spare time, Vishnu enjoys suspending all logic to watch Indian action movies, playing chess, and writing.