Coffee Sessions #69

Building for Small Data Science Teams

In this conversation, James shares some hard-won lessons on how to effectively use technology to create applications powered by machine learning models. James also talks about how making the "right" architecture decisions is as much about org structure and hiring plans as it is about technological features.

Take-aways

- MLEs shouldn't just "meet the data science team where they are"...it's appropriate and valuable to ask data scientists to learn new technologies. They're intelligent, creative people, and they'll do just fine.
- Organizations that are early in their MLOps journey should prioritize systems that remove friction in training/experimentation, even if it means model deployment continues to suck for a bit.
- People in your organization without hands-on machine learning experience might have a single idea of what "model deployment" means...it's an MLE's job to educate them on the tradeoffs and the different architectural and statistical considerations between on-demand scoring (i.e. model-as-a-REST-API), streaming scoring, and batch scoring.
- Data science teams should develop and own internal libraries. Enforcing team agreements through internal libraries is more reliable than any "we all just agree to remember to do this" mechanism like wiki pages, Google Docs, etc. Libraries are also a way to solidify hard-won knowledge, so data scientists don't end up solving problems that others have already solved. Oh, and they make dependencies explicit, which is sweet :)
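To make the three deployment styles in the take-aways a little more concrete, here is a minimal sketch of the same model being called in on-demand, batch, and streaming fashion. It is not from the episode: `toy_model`, `score_on_demand`, `score_batch`, and `score_stream` are all hypothetical names invented for illustration, and the "model" is just a fixed formula standing in for a trained one.

```python
from typing import Callable, Dict, Iterable, Iterator, List

Record = Dict[str, float]
Model = Callable[[Record], float]


def toy_model(record: Record) -> float:
    """Hypothetical stand-in for a trained model: a fixed linear formula."""
    return 2.0 * record["x"] + 1.0


def score_on_demand(model: Model, record: Record) -> float:
    """On-demand scoring: one record in, one prediction out.

    In practice this is roughly what would sit behind a REST endpoint.
    """
    return model(record)


def score_batch(model: Model, records: List[Record]) -> List[float]:
    """Batch scoring: a whole dataset is scored in one scheduled job."""
    return [model(record) for record in records]


def score_stream(model: Model, events: Iterable[Record]) -> Iterator[float]:
    """Streaming scoring: predictions are emitted as events arrive."""
    for event in events:
        yield model(event)


if __name__ == "__main__":
    print(score_on_demand(toy_model, {"x": 1.0}))
    print(score_batch(toy_model, [{"x": 0.0}, {"x": 2.0}]))
    print(list(score_stream(toy_model, iter([{"x": 3.0}]))))
```

The model code is identical in all three cases; what changes is who calls it, how often, and how much data arrives at once, which is where the architectural tradeoffs come from.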

In this episode

James Lamb

Sr. Machine Learning Engineer II, SpotHero

James Lamb is a machine learning engineer at SpotHero, a Chicago-based parking marketplace company. He is a maintainer of LightGBM, a popular machine learning framework from Microsoft Research, and has made many contributions to other open-source data science projects, including XGBoost and Prefect. Prior to joining SpotHero, he worked on a managed Dask + Jupyter + Prefect service at Saturn Cloud and as an Industrial IoT Data Scientist at AWS and Uptake. Outside of work, he enjoys going to hip hop shows, watching the Celtics / Red Sox, and watching reality TV (he wouldn’t object to being called “Bravo Trash”).

Demetrios Brinkmann

Host

Demetrios is one of the main organizers of the MLOps community and currently resides in a small town outside Frankfurt, Germany. He is an avid traveller who taught English as a second language to see the world and learn about new cultures. Demetrios fell into the Machine Learning Operations world and has since interviewed the leading names in MLOps, Data Science, and ML. Since diving into the nitty-gritty of Machine Learning Operations, he has felt a strong calling to explore the ethical issues surrounding ML. When he is not conducting interviews, you can find him stacking stones with his daughter in the woods or playing the ukulele by the campfire.

Adam Sroka

Host

Dr. Adam Sroka, Head of Machine Learning Engineering at Origami Energy, is an experienced data and AI leader helping organizations unlock value from data by delivering enterprise-scale solutions and building high-performing data and analytics teams from the ground up. Adam shares his thoughts and ideas through public speaking, tech community events, on his blog, and in his podcast.