DevOps transformed how engineers deliver software by making it possible to collaborate, test and deliver continuously.
We believe that building and deploying machine learning models should be as easy, fast and safe as software engineering is with DevOps. Some people call this DataOps or MLOps. Here is our humble manifesto: to achieve DevOps for ML, we need to develop systems that meet the following criteria.
Reproducibility and productivity are inextricably linked. It's difficult to be productive when team members can't reproduce each other's work. This is harder in ML than in software because test and training data and metrics need to be versioned alongside the code and environment.
Models deployed without full provenance (a record of every step taken to create them) can fall short of compliance requirements and are hard to debug. Maintaining this provenance record manually slows you down and is error-prone, so automated tooling is needed.
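As an illustration of what an automatically captured provenance record might contain, here is a minimal Python sketch. It is not a reference to any particular tool: the names `ProvenanceRecord`, `build_record` and `record_id`, and the choice of fields, are assumptions made for this example. The idea is simply that hashing the training data and recording it next to the code version, environment, parameters and metrics gives each model a record that can be compared or audited later.

```python
import hashlib
import json
from dataclasses import asdict, dataclass


def sha256_hex(payload: bytes) -> str:
    """Return the SHA-256 digest of some bytes as a hex string."""
    return hashlib.sha256(payload).hexdigest()


@dataclass(frozen=True)
class ProvenanceRecord:
    """Hypothetical record of what went into producing one model."""
    code_version: str   # e.g. a version-control commit identifier
    data_sha256: str    # fingerprint of the training data
    environment: str    # e.g. a container image tag or lockfile hash
    params: dict        # hyperparameters used for training
    metrics: dict       # evaluation metrics produced by the run

    def record_id(self) -> str:
        # Serialise with sorted keys so equal records always hash equally.
        canonical = json.dumps(asdict(self), sort_keys=True).encode("utf-8")
        return sha256_hex(canonical)


def build_record(code_version: str, training_data: bytes, environment: str,
                 params: dict, metrics: dict) -> ProvenanceRecord:
    """Build a record from the raw inputs of a training run."""
    return ProvenanceRecord(
        code_version=code_version,
        data_sha256=sha256_hex(training_data),
        environment=environment,
        params=params,
        metrics=metrics,
    )


# Illustrative values only; none of these come from a real run.
record = build_record(
    code_version="abc123",
    training_data=b"x,y\n1,2\n3,4\n",
    environment="python-3.11",
    params={"learning_rate": 0.1},
    metrics={"accuracy": 0.9},
)
print(json.dumps(asdict(record), indent=2))
print("record id:", record.record_id())
```

Two runs with the same code, data, environment, parameters and metrics produce the same record id, while a change to any one of them produces a different id, which is what makes discrepancies between team members' results traceable.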
Concurrent collaboration – that is, collaboration without treading on each other's toes – is essential. In ML this is harder than in normal software engineering, because collaboration applies to notebooks, data, models and metrics as well as code.
You're not done when you ship. In order to continue delivering value to the business, models must be retrained and statistically monitored to compensate for model drift due to constant changes in your business environment.
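One simple way to monitor a deployed model statistically is to compare the distribution of a feature (or of the model's scores) at training time with what the model sees in production. The sketch below uses the two-sample Kolmogorov-Smirnov statistic, written in plain Python. This is one possible drift signal among many, not a prescribed method: the function names and the `threshold` value of 0.2 are assumptions chosen for illustration, and a real threshold would need to be tuned for the data in question.

```python
from bisect import bisect_right
from typing import Sequence


def ks_statistic(reference: Sequence[float], live: Sequence[float]) -> float:
    """Largest gap between the empirical CDFs of two samples (0.0 to 1.0)."""
    if not reference or not live:
        raise ValueError("both samples must be non-empty")
    ref = sorted(reference)
    cur = sorted(live)
    # The largest gap between two step functions occurs at a sample point,
    # so it is enough to evaluate both CDFs at every observed value.
    return max(
        abs(bisect_right(ref, x) / len(ref) - bisect_right(cur, x) / len(cur))
        for x in ref + cur
    )


def needs_retraining(reference: Sequence[float], live: Sequence[float],
                     threshold: float = 0.2) -> bool:
    """Flag drift when the distributions differ by more than `threshold`.

    The default threshold is a hypothetical value for this example.
    """
    return ks_statistic(reference, live) > threshold


# Illustrative data: production values have shifted well above training values.
training_values = [0.1, 0.2, 0.3, 0.4, 0.5]
production_values = [0.6, 0.7, 0.8, 0.9, 1.0]
print(needs_retraining(training_values, training_values))
print(needs_retraining(training_values, production_values))
```

A check like this can run on a schedule against recent production data, with a positive result triggering a retraining job or an alert for a human to investigate.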