Parallel Computing with Dask and Coiled Python makes data science and machine learning accessible to millions of people around the world. However, historically Python hasn't handled parallel computing well, which leads to issues as researchers try to tackle problems on increasingly large datasets. Dask is an open source Python library that enables the existing Python data science stack (Numpy, Pandas, Scikit-Learn, Jupyter, ...) with parallel and distributed computing. Today Dask has been broadly adopted by most major Python libraries, and is maintained by a robust open source community across the world. This talk will discuss parallel computing generally, Dask's approach to parallelizing an existing ecosystem of software, and some of the challenges we've seen in deploying distributed systems. Finally, we'll also address the challenges of robustly deploying distributed systems, which ends up being one of the main accessibility challenges for users today. We hope that by the end of the meetup attendees will better understand parallel computing, have built intuition around how Dask works, and have the opportunity to play with their own Dask cluster on the cloud. Check out our posts here to get more context around where we're coming from: https://medium.com/coiled-hq/coiled-dask-for-everyone-everywhere-376f5de0eff4 https://medium.com/coiled-hq/the-unbearable-challenges-of-data-science-at-scale-83d294fa67f8
In this episode
CEO, Coiled, Maintainer of Dask
Matthew is an open source software developer in the numeric Python ecosystem. He maintains several PyData libraries, but today focuses mostly on Dask a library for scalable computing. Matthew worked for Anaconda Inc for several years, then built out the Dask team at NVIDIA for RAPIDS, and most recently founded Coiled Computing to improve Python's scalability with Dask for large organizations. Matthew has given talks at a variety of technical, academic, and industry conferences. A list of talks and keynotes is available at (https://matthewrocklin.com/talks). Matthew holds a bachelor’s degree from UC Berkeley in physics and mathematics, and a PhD in computer science from the University of Chicago.
Demetrios is one of the main organizers of the MLOps community and currently resides in a small town outside Frankfurt, Germany. He is an avid traveller who taught English as a second language to see the world and learn about new cultures. Demetrios fell into the Machine Learning Operations world, and since, has interviewed the leading names around MLOps, Data Science, and ML. Since diving into the nitty-gritty of Machine Learning Operations he felt a strong calling to explore the ethical issues surrounding ML. When he is not conducting interviews you can find him making stone stacking with his daughter in the woods or playing the ukulele by the campfire.