October 9, 2022

Reasons why ML in Production is Hard (and Solutions to Help!) | Part 1


Author’s Bio: Robert John is a Data Scientist and Machine Learning Engineer at Condo group. His role involves model development and model deployment. He loves working on end-to-end processes of a machine learning life cycle.

The issues that arise with production-level ML solutions are complex and difficult to address because of the many components and moving pieces involved. This blog discusses issues with ML in production and gives tips on how you can solve these problems.

Miscommunication and Dependency Management Issues
Photo by Alexander Dummer on Unsplash

Although almost everyone recognizes the significance of data science and machine learning, success does not come on a platter of gold. According to a 2019 VentureBeat article, roughly 87 percent of data science projects never make it into production. That percentage should have fallen as tools and open-source projects for deploying models became more widely available, yet we still face significant challenges in bringing machine learning models into production. Hidden Technical Debt in Machine Learning Systems, a Google research paper, shows that ML code is only a small fraction of a production ML system. The graphic below depicts the various processes involved in putting machine learning into production, of which model development is only a minor component.

Problems with deploying ML models to production are no secret and a hot topic. Instead of creating another ‘problems’ list, we’ll begin tackling these issues with this post.

MLOps Lifecycle
Figure adapted from “MLOps: Continuous delivery and automation pipelines in machine learning”

To keep this post short and to have more freedom to hit the nail on the head, I decided to divide it into parts and make it a series. In this first part, we will discuss:

  1. Communication gaps between business leaders and ML teams
  2. Should You Use Jupyter Notebook In Production?

Communication Gap Between Business Leaders and ML Teams

Imagine you are a data scientist at an e-commerce company, and the CTO asks you to increase sales by integrating ML into the company’s website. You are perplexed because all you are familiar with is building models and optimizing their accuracy. You have a few questions to answer first:

– What models can increase sales for the e-commerce company?

– How do you align an increase in sales (business KPI) with machine learning metrics (model accuracies and metrics)?

In an organization, there are various stakeholders, each of whom has a unique KPI and a unique perspective on the business. For instance, an e-commerce company can have several stakeholders and departments such as sales, marketing, customer satisfaction, inventory, and delivery, depending on the structure and size of the company. Communicating with the ML and data team gets more difficult as you add more stakeholders. The marketing team wants to increase sales by gaining more customers. The sales team wants to increase sales by increasing the number of items sold and the total value sold. The customer satisfaction team wants to increase sales by retaining more customers. Additionally, the size of the project affects communication with the ML and data team because bigger projects are more complicated and have more requirements.

Aligning business KPIs with machine learning metrics and communicating the desires of different stakeholders to the ML team are common challenges in building an ML project. The business leader focuses on the company’s goal of making more profit, while the ML team focuses on optimizing machine learning models.

Solution

To resolve the communication problem between stakeholders and the ML team, you need to align stakeholder requests with machine learning output, that is, map business KPIs and stakeholder requests to machine learning metrics. The first step in a machine learning project is to analyze requirements and collect the specific data needed to solve a problem and answer a question. At this point, you should meet with the stakeholders and business leaders to discuss their expectations and how you plan to meet them.

To answer the questions in your head, you need to know the stakeholders you will deal with. In your case, as an ML engineer in an e-commerce company who is required to increase sales, there are many possible ways to do that. For instance, you can create a customer service bot to improve communication between the customer satisfaction team and the users. You can create a model to help the marketing team identify the advertising channels that result in the most sales. Building a recommendation system is another smart move that might make the sales team happier because it encourages customers to purchase more things on the website.

If you decide to integrate a recommender system, how does the recommender system increase sales? Does a low Root Mean Squared Error (RMSE) mean an increase in sales? Translating the business KPI (increase in sales) into a machine learning metric (RMSE) takes several steps. What user behaviour can be measured? Click-through rate (CTR) is a good measure of user behaviour, but CTR is not a machine learning metric. Users will click on more products if you recommend the right items to them, which increases their chances of purchasing more items. We use a recommender system to recommend items to users, and machine learning metrics such as Mean Absolute Error (MAE), Mean Squared Error (MSE), and RMSE are used to evaluate it. The assumption is that a recommender system’s RMSE impacts CTR, and CTR influences sales growth. An organization’s goals should be established before adopting machine learning. Once the objectives are defined, we must monitor and confirm that our hypothesis of RMSE -> CTR -> increased sales is correct.
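To make the two kinds of metric concrete, here is a minimal sketch in plain Python. The rating values, the click and impression counts, and all function names are made up for illustration; they are assumptions, not figures from a real system. It computes the offline model metrics (MAE, MSE, RMSE) and the online behaviour metric (CTR) that the hypothesis links together.

```python
import math


def mae(actual, predicted):
    """Mean Absolute Error between true and predicted ratings."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def mse(actual, predicted):
    """Mean Squared Error between true and predicted ratings."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root Mean Squared Error: the square root of the MSE."""
    return math.sqrt(mse(actual, predicted))


def click_through_rate(clicks, impressions):
    """Share of recommendation impressions that led to a click."""
    return clicks / impressions if impressions else 0.0


# Hypothetical ratings held out for offline evaluation.
actual_ratings = [4.0, 3.0, 5.0, 2.0]
predicted_ratings = [3.5, 3.0, 4.0, 3.0]

print("MAE :", mae(actual_ratings, predicted_ratings))
print("MSE :", mse(actual_ratings, predicted_ratings))
print("RMSE:", rmse(actual_ratings, predicted_ratings))

# Hypothetical online numbers logged after the model went live.
print("CTR :", click_through_rate(clicks=30, impressions=1000))
```

Tracking RMSE per model version next to the CTR and sales observed while that version was live is one simple way to check whether the assumed chain actually holds.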

Particularly in a large organization or a complicated project, the product owner and scrum master can also play a vital role in communication between the ML team and other stakeholders.

Should You Use Jupyter Notebook In Production?

If you work in machine learning, you should be familiar with Jupyter Notebook. Without a doubt, Jupyter Notebook is a fantastic tool for data visualization, data exploration, and machine learning experiments. I won’t go over the use and benefits of Jupyter Notebook again; instead, let’s concentrate on how to use it in production. Even though it has a lot of applications and benefits, using it in production has certain drawbacks.

😡Low support for good code versioning and git — Of course, you can add and commit your Jupyter Notebook to GitHub and other repositories. However, difficulties arise when two or more people work on the same notebook and operations such as branching and “git merge” are required. A notebook is a JSON file that stores code together with outputs, execution counts, and metadata, so even a small change can touch many lines, which makes diffs hard to read and merges prone to conflicts. This follows from the fact that Jupyter Notebook is a REPL-based system that lets you embed code in a web-based document alongside other information such as text, graphics, and data. A workaround for versioning is to convert ipynb files to other formats with nbconvert or jupytext, saving the notebook as a Markdown file or as a script in a programming language.
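The idea behind these workarounds can be sketched with only the standard library. The code below is an illustration of the principle, not how nbconvert or jupytext are implemented: it treats a notebook as the JSON document it is, drops the parts that change on every run, and extracts the code cells as a plain script that diffs cleanly. The function names and the tiny example notebook are assumptions made for this sketch.

```python
import json


def strip_outputs(notebook):
    """Return a copy of a notebook dict without outputs or execution counts."""
    cleaned = json.loads(json.dumps(notebook))  # cheap deep copy via JSON
    for cell in cleaned.get("cells", []):
        if cell.get("cell_type") == "code":
            cell["outputs"] = []
            cell["execution_count"] = None
    return cleaned


def code_cells_as_script(notebook):
    """Join the source of every code cell into one plain-text script."""
    chunks = []
    for cell in notebook.get("cells", []):
        if cell.get("cell_type") == "code":
            source = cell.get("source", "")
            # The notebook format allows source as a string or a list of lines.
            chunks.append(source if isinstance(source, str) else "".join(source))
    return "\n\n".join(chunks)


# A tiny hand-written notebook, used here instead of reading a real .ipynb file.
example_notebook = {
    "nbformat": 4,
    "nbformat_minor": 5,
    "metadata": {},
    "cells": [
        {"cell_type": "markdown", "metadata": {}, "source": ["# Demo"]},
        {
            "cell_type": "code",
            "metadata": {},
            "execution_count": 7,
            "source": ["x = 1\n", "print(x)"],
            "outputs": [{"output_type": "stream", "name": "stdout", "text": ["1\n"]}],
        },
    ],
}

cleaned_notebook = strip_outputs(example_notebook)
script_text = code_cells_as_script(example_notebook)
print(script_text)
```

For a real file you would load it with json.load, and in practice the dedicated tools are the better choice because they handle details this sketch ignores.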

😡Compatibility issues with many CI/CD tools — Deploying a model in production calls for DevOps practices, which include script scheduling and parameterization. CI/CD typically relies on tools such as Jenkins, CircleCI, GitHub Actions, and GitLab. Integrating Jupyter Notebooks into these tools has proven to be difficult. Papermill is a useful tool for parameterizing and running notebooks. It turns your Jupyter Notebook into a data workflow tool by running each cell sequentially without having to open JupyterLab (or Notebook). It also lets you pass arguments into your Jupyter Notebook, which can be used to set values in the code. Notebook results can be saved as well. Another option is to use nbconvert as a command-line tool for running Jupyter Notebooks.
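As a rough sketch of how a CI job might drive these tools, the helpers below only build the command lines; the file names, parameter names, and helper functions are hypothetical. The flags used are Papermill's -p NAME VALUE option and nbconvert's --to notebook --execute --output options. Actually running the commands assumes Papermill or Jupyter is installed on the CI runner.

```python
import subprocess


def papermill_command(input_path, output_path, parameters):
    """Build a Papermill CLI call that injects each parameter with -p."""
    cmd = ["papermill", input_path, output_path]
    for name, value in parameters.items():
        cmd += ["-p", name, str(value)]
    return cmd


def nbconvert_command(input_path, output_name):
    """Build an nbconvert CLI call that executes a notebook and saves the result."""
    return [
        "jupyter", "nbconvert",
        "--to", "notebook",
        "--execute", input_path,
        "--output", output_name,
    ]


def run(cmd):
    """Run a command and raise if it exits with a non-zero status."""
    subprocess.run(cmd, check=True)


# Hypothetical notebook and parameters for a scheduled training job.
train_cmd = papermill_command(
    "train.ipynb", "runs/train_output.ipynb", {"alpha": 0.6, "epochs": 10}
)
print(" ".join(train_cmd))
# In a CI step you would then call: run(train_cmd)
```

Papermill injects the values after the notebook cell tagged "parameters", so the notebook needs such a cell holding the defaults.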

😡Code styling and formatting require extra effort — Jupyter Notebook is not a full IDE like PyCharm, IntelliJ, or VS Code, which help maintain a consistent code style and formatting through EditorConfig. That said, if you are using JupyterLab, you can add extensions such as JupyterLab Code Formatter, which applies formatters like Black. Although these tools offer little room for custom style rules, they follow widely accepted formatting practices.

😡Unit tests are written in cells – Testing is an essential aspect of the development process. Unit testing involves testing the behaviour of a single component of your code. Because Jupyter Notebook executes code cell by cell, it is hard to organize code as importable modules, which is what many unit test frameworks expect. There are workarounds for these frameworks:

  • Unittest supports testing methods. You can create a class that inherits from unittest.TestCase in a cell, then run it with the unittest.main method (in a notebook, pass argv=[''] and exit=False so that it neither parses the kernel’s arguments nor shuts the kernel down).
  • Doctest tests functions through examples written in their docstrings. Your tests are included in your docstring and you can execute them with the doctest.testmod method.
  • testbook is a unit testing framework extension for testing code in Jupyter Notebooks. It allows you to execute your notebook like a Python script which makes it easy to use other test frameworks like pytest.
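The first two options can be sketched in a single cell using only the standard library. The apply_discount function and its tests are made up for illustration. The sketch uses an explicit test loader and doctest.run_docstring_examples instead of unittest.main and doctest.testmod so that it behaves the same way in a notebook and in a script; both are standard library calls, but the choice is ours rather than the approach described in the post referenced below.

```python
import doctest
import unittest


def apply_discount(price, percent):
    """Return the price after a percentage discount.

    >>> apply_discount(200.0, 25)
    150.0
    """
    return price * (1 - percent / 100)


class TestApplyDiscount(unittest.TestCase):
    def test_quarter_off(self):
        self.assertEqual(apply_discount(200.0, 25), 150.0)

    def test_zero_percent_keeps_price(self):
        self.assertEqual(apply_discount(80.0, 0), 80.0)


# Load and run only this test class.
# Inside a notebook, unittest.main(argv=[""], exit=False) is the usual shortcut.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestApplyDiscount)
result = unittest.TextTestRunner(verbosity=0).run(suite)

# Check the docstring example of one function; it prints only on failure.
# doctest.testmod() is the equivalent for everything defined in the notebook.
doctest.run_docstring_examples(apply_discount, {"apply_discount": apply_discount})
```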

Matt Wright’s blog, Unit testing python code in jupyter notebook, explains in detail how to use the frameworks listed above.

😡Code reproducibility and dependency management — When transitioning to production, managing dependencies is a major issue. Every library has its own dependencies, and the required version of a shared dependency can differ from one library to the next, making it difficult to choose which version to use. For example, “pip install tensorflow” does not just install tensorflow, which is a direct dependency. It also installs transitive dependencies such as NumPy. The operating system, Python interpreter, and hardware requirements also count as dependencies. To achieve reproducible output, you must manage all of these correctly. A number of advanced Python packages for handling dependencies exist, but they are not designed for Jupyter Notebooks. Jupyterlab-requirements is a JupyterLab extension for dependency management and optimization, created to guarantee the reproducibility of Jupyter Notebooks.
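A very small first step toward reproducibility is to record what a notebook actually ran with. The sketch below uses only the standard library; it is not how Jupyterlab-requirements works, and the function names and the package list are assumptions for illustration. It captures the interpreter version, the platform, and the installed versions of the packages you name, then renders them as pinned requirement lines.

```python
import platform
from importlib import metadata


def environment_snapshot(package_names):
    """Record interpreter, OS, and installed versions of the named packages."""
    packages = {}
    for name in package_names:
        try:
            packages[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            packages[name] = None  # not installed in this environment
    return {
        "python": platform.python_version(),
        "os": platform.platform(),
        "packages": packages,
    }


def to_requirements(packages):
    """Render the installed packages as pinned 'name==version' lines."""
    return [
        f"{name}=={version}"
        for name, version in sorted(packages.items())
        if version is not None
    ]


# Hypothetical list of the libraries a notebook imports.
snapshot = environment_snapshot(["numpy", "pandas", "tensorflow"])
print(snapshot["python"], snapshot["os"])
print("\n".join(to_requirements(snapshot["packages"])))
```

Storing this snapshot next to the notebook makes it easier to tell whether a changed result comes from the code or from the environment.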

😡Minimal support for distributed and parallel computing — Programs become more efficient when code can run on several nodes or workers. Because of Jupyter Notebook’s REPL-style remote code execution, parallelizing execution has been a significant issue. ipyparallel can be used to allocate clusters and run tasks in parallel from a Jupyter Notebook.
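The pattern behind this kind of parallelism is to split the work into independent pieces and map a function over them across a pool of workers. The sketch below shows that pattern with the standard library's concurrent.futures as a stand-in; it uses threads on one machine and is not the cluster tool's API, which distributes work to separate engines. The function and the data are made up for illustration.

```python
from concurrent.futures import ThreadPoolExecutor


def score_partition(rows):
    """Stand-in for per-partition work, here just the mean of the values."""
    return sum(rows) / len(rows)


# Hypothetical data already split into independent partitions.
partitions = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]

# Each partition is handled by a separate worker; map keeps the input order.
with ThreadPoolExecutor(max_workers=3) as pool:
    partition_means = list(pool.map(score_partition, partitions))

print(partition_means)
```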

Though it is preferable to write scripts and configuration code rather than use Jupyter Notebook in production, if you insist on using it, you can install the extensions and apply the workarounds described above.

Conclusion

In this article, we looked at how miscommunication between the ML team and the business affects ML in production and how to fix it. We also talked about how to use Jupyter Notebook in production, despite the fact that it’s not always recommended. In the next blog, we’ll look at the complexities of ML workflows and how to manage machine learning components like code, data, and models.

Thank you very much for the reviews and feedback, Demetrios Brinkmann, Merelda Wu, Temi Adeoti, Blessing Osajie, and KP Hartman.