Freelancer: AI/ML Engineer | Data Scientist | MLOps Community Organizer | OpenClassrooms Mentor | Hacker | PhD
Programmers have always been passionate about their preferences, whether they discuss spaces vs. tabs, Vim vs. Emacs, or light mode vs. dark mode. These debates have withstood the test of time, indicating that there is a place for each solution, and no definitive argument can declare one superior over the other.
However, when it comes to programming paradigms, the arguments tend to be more fervent. Object-oriented languages have long dominated the programming landscape, championing code reusability across various projects. In contrast, functional programming has gained traction as an alternative style in recent years, promising code that is easier to reason about. When delving into machine learning projects, the question arises: which paradigm is best suited for building an MLOps application?
This article aims to shed light on the benefits of both programming styles and help you determine the most suitable one for your MLOps project. We will start by introducing the two main programming styles in our industry. Subsequently, we will explore the specific requirements of MLOps projects to guide our decision-making process. Finally, I will offer my opinion on the best overall style and present a compelling trade-off known as the “hybrid style.”
A brief intro to programming paradigms
Throughout my career, I’ve had the chance to work with various programming languages, starting with object-oriented ones like C++, Java, PHP, Python, Ruby, and Groovy. Each language supports the paradigm to a different depth, but they share the same broad advantages (the first three points below) and drawbacks (the last three):
- Real-World Modeling: The object-oriented paradigm closely mirrors real-world entities, making it intuitive for developers to model and design applications based on the problem domain.
- Modularity and Reusability: Object-oriented programming’s encapsulation allows for modular code design, promoting reusability and maintainability. This helps manage large codebases and fosters collaboration among team members.
- Rich Ecosystem: Object-oriented programming languages like Java and C# have extensive frameworks and design patterns, providing developers with powerful tools for building complex applications efficiently.
- Shared Mutable State: Object-oriented programming often relies on a shared mutable state, leading to potential bugs and issues related to mutable objects being accessed from multiple locations.
- Brittle Inheritance Hierarchies: Overuse of inheritance can lead to fragile class hierarchies, making it difficult to modify or extend functionality without introducing unintended side effects.
- Complexity and Overhead: Object-oriented codebases can become complex, especially in large projects, leading to increased development and debugging time.
Below is a diagram illustrating an MLOps application implemented with an object-oriented style. The programmer must carefully handle the object attributes, and provide getter/setter methods to control their access. While the program representation is intuitive, it is also verbose and sometimes rigid with its adherence to object-oriented principles.
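As a rough illustration of the object-oriented style described above, here is a minimal Python sketch. The class names (`Dataset`, `Model`, `MeanModel`, `TrainingJob`) and the toy "predict the mean" logic are assumptions made for this example, not taken from a real project. It shows private attributes guarded by getter/setter methods and a model that keeps mutable state.

```python
from abc import ABC, abstractmethod


class Dataset:
    """Holds rows in a private attribute exposed through getter/setter methods."""

    def __init__(self, rows: list[float]) -> None:
        self._rows = list(rows)

    def get_rows(self) -> list[float]:
        return self._rows

    def set_rows(self, rows: list[float]) -> None:
        self._rows = list(rows)


class Model(ABC):
    """Common interface: new models are added through subtyping."""

    @abstractmethod
    def fit(self, dataset: Dataset) -> None: ...

    @abstractmethod
    def predict(self, dataset: Dataset) -> list[float]: ...


class MeanModel(Model):
    """Toy model (hypothetical): predicts the training mean for every row."""

    def __init__(self) -> None:
        self._mean = 0.0  # mutable state, updated in place by fit()

    def fit(self, dataset: Dataset) -> None:
        rows = dataset.get_rows()
        self._mean = sum(rows) / len(rows)

    def predict(self, dataset: Dataset) -> list[float]:
        return [self._mean for _ in dataset.get_rows()]


class TrainingJob:
    """Wires a model and a dataset together and runs the workflow."""

    def __init__(self, model: Model, dataset: Dataset) -> None:
        self._model = model
        self._dataset = dataset

    def run(self) -> list[float]:
        self._model.fit(self._dataset)  # changes the model's internal state
        return self._model.predict(self._dataset)


predictions = TrainingJob(MeanModel(), Dataset([1.0, 2.0, 3.0])).run()
```

The representation is easy to follow, but calling `predict` before `fit` silently uses the initial state, which is the kind of issue that mutable attributes tend to invite.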
As my career progressed, I ventured into functional languages such as Clojure, Haskell, and Elixir. These languages piqued my interest with their unique approach to state management and other concepts that seemed tailored for data applications. The functional paradigm also comes with advantages (the first three points below) and drawbacks (the last two):
- Enhanced reasoning capabilities: Functional programming’s focus on pure functions ensures that executions solely depend on inputs, making testing and debugging significantly more straightforward.
- Predictable Concurrency: Immutability and statelessness inherently reduce the chances of race conditions and concurrent data access issues, making it more suitable for parallel and concurrent programming.
- Simplicity in Design: Functional programming requires fewer complex design patterns, relying instead on other forms of polymorphism (e.g., ad-hoc or parametric), higher-order functions, and even monads to extend and fortify programs.
- Learning Curve: Functional programming can be challenging for developers who are more accustomed to imperative and object-oriented paradigms. The shift in mindset and understanding of concepts like higher-order functions and recursion may take time.
- Performance Overhead: Some functional programming constructs, such as creating many intermediate data structures during computation, may introduce performance overhead compared to optimized imperative implementations.
The diagram below exemplifies an MLOps application adhering to a functional programming style. The program layout is more straightforward due to a clear separation between data structures and operations. However, this style requires functional programming constructs such as ad-hoc polymorphism to support the addition of both new types and functions in a robust manner.
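The separation between data structures and operations can be approximated in Python as follows. This is only a sketch under assumptions: the names (`Dataset`, `MeanModel`, `fit_mean`, `predict`, `scale`, `compose`) are invented for the example, and it does not cover ad-hoc polymorphism, which a functional language would provide through constructs such as type classes or protocols.

```python
from dataclasses import dataclass
from functools import reduce
from typing import Callable


# Data structures: plain immutable records with no behavior attached.
@dataclass(frozen=True)
class Dataset:
    rows: tuple[float, ...]


@dataclass(frozen=True)
class MeanModel:
    mean: float


# Operations: pure functions whose output depends only on their inputs.
def fit_mean(dataset: Dataset) -> MeanModel:
    return MeanModel(mean=sum(dataset.rows) / len(dataset.rows))


def predict(model: MeanModel, dataset: Dataset) -> tuple[float, ...]:
    return tuple(model.mean for _ in dataset.rows)


def scale(factor: float) -> Callable[[Dataset], Dataset]:
    """Higher-order function: returns a transformation that builds a new Dataset."""

    def transform(dataset: Dataset) -> Dataset:
        return Dataset(rows=tuple(row * factor for row in dataset.rows))

    return transform


def compose(*steps: Callable[[Dataset], Dataset]) -> Callable[[Dataset], Dataset]:
    """Chain transformations from left to right."""
    return lambda dataset: reduce(lambda acc, step: step(acc), steps, dataset)


raw = Dataset(rows=(1.0, 2.0, 3.0))
prepare = compose(scale(2.0), scale(0.5))
prepared = prepare(raw)  # raw itself is never modified
model = fit_mean(prepared)
predictions = predict(model, prepared)
```

Each step returns a new value instead of updating an existing one, so any function can be tested in isolation by passing inputs and comparing outputs.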
Mastering both paradigms proves to be a valuable investment, equipping developers with a diverse toolkit to design optimal solutions. Let’s now explore the specific requirements unique to MLOps projects, aiding us in selecting the best-suited programming style for this type of application.
Requirements for MLOps projects
MLOps applications present a unique blend of simplicity and complexity. On one hand, they share common concepts like datasets, models, and jobs, which can be reused across projects with slight variations. However, dealing with challenges like randomness, large data structures, and complex internal objects, such as neural networks, adds complexity to these projects. To avoid potential struggles and costly refactoring, starting MLOps applications with a well-designed foundation is crucial.
Below are the key requirements for an MLOps application, ranked by importance (in my opinion):
- Reproducibility: Ensure that your MLOps application produces consistent and reproducible results.
- Modularity: Embrace modularity by breaking down your MLOps application into smaller, reusable components.
- Configurability: Allow program behavior to change through external configurations rather than direct code modifications.
- Extensibility: Facilitate the addition of new models and data sources.
- Keep It Simple (KISS): Keep the application simple, as not all MLOps contributors have advanced programming backgrounds.
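To make the first requirements more concrete, here is a small sketch of reproducibility and configurability working together. The `TrainingConfig` fields, the JSON payload, and the `sample_noise` function are assumptions chosen for illustration. A real project would typically read the configuration from a file and pass the seed to its ML library.

```python
import json
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainingConfig:
    """Hypothetical settings that live outside the code."""

    seed: int
    n_samples: int


def load_config(text: str) -> TrainingConfig:
    """Configurability: behavior is driven by external JSON, not code edits."""
    return TrainingConfig(**json.loads(text))


def sample_noise(config: TrainingConfig) -> list[float]:
    """Reproducibility: a local generator seeded from the configuration."""
    rng = random.Random(config.seed)  # avoids relying on global random state
    return [rng.random() for _ in range(config.n_samples)]


config = load_config('{"seed": 42, "n_samples": 3}')
first_run = sample_noise(config)
second_run = sample_noise(config)  # same configuration, same numbers
```

Changing the seed or the sample size only requires editing the configuration text, and two runs with the same configuration yield identical values.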
With these requirements in mind, let’s delve into the discussion of which programming style might be best suited for developing MLOps applications.
So, which style is best?
As we explored the pros and cons of both object-oriented and functional programming earlier, we see that there is no single criterion that strongly favors one over the other for MLOps applications. Both styles can meet the identified requirements, which is unsurprising, considering most programming languages are Turing complete and therefore equivalent in what they can compute.
However, I do have a compelling argument. While all programming styles can be applied to develop MLOps applications, not all programming languages can effectively support both paradigms. For instance, Python, one of the most popular languages for data science projects, is best suited for object-oriented programming when building large applications. Though it can handle functions and even higher-order functions, these features represent the bare minimum to support the functional paradigm. Python has little or no built-in support for key elements of functional programming, such as 1) ad-hoc or parametric polymorphism, 2) tail-call optimization, and 3) efficient immutable data structures (e.g., persistent data structures). In contrast, it excels at using subtyping polymorphism and mutability for various Python operations.
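The tail-call point can be checked directly. In the sketch below, `sum_to` is written in tail-recursive form, yet CPython still allocates one stack frame per call and fails once the recursion limit is exceeded, while the equivalent loop has no such problem. The function names are made up for this example.

```python
import sys


def sum_to(n: int, acc: int = 0) -> int:
    """Tail-recursive sum of 1..n: the recursive call is the last operation."""
    if n == 0:
        return acc
    return sum_to(n - 1, acc + n)


def sum_to_loop(n: int) -> int:
    """Imperative equivalent, which is the idiomatic choice in Python."""
    acc = 0
    while n > 0:
        acc += n
        n -= 1
    return acc


# Pick a depth well beyond the interpreter's recursion limit.
depth = sys.getrecursionlimit() * 10

try:
    sum_to(depth)
    recursion_failed = False
except RecursionError:
    # No tail-call optimization: every call adds a frame to the stack.
    recursion_failed = True

loop_result = sum_to_loop(depth)
```

In a language with tail-call optimization, the recursive version would run in constant stack space. In Python, the loop is the practical option.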
As a result, I tend to favor the object-oriented paradigm when building MLOps applications with Python, even if I prefer the functional paradigm for other application types. Building an MLOps project is not trivial, as it requires advanced and idiomatic language features to fulfill specific requirements. Nevertheless, there is a trick that can be applied to incorporate elements of both paradigms, striking a balance that leverages the strengths of each approach.
The Hybrid Style
The hybrid style aims to combine the best aspects of functional programming with the object-oriented paradigm, creating a favorable trade-off for programming languages like Python that support both styles. By embracing this approach, your code can become more idiomatic, extensible, and easier to reason about.
To implement this style effectively, consider adhering to the following principles:
- Immutable attributes: Objects should not update their attributes after initialization, treating them as read-only to avoid modifying the object state directly.
- Output-oriented methods: Each method should return its output rather than updating attributes, enabling other objects to handle modifications to the program state.
- Idempotent method calls: Methods should consistently return the same output for the same inputs, akin to pure functions in functional programming.
- Centralized imperative statements: High-level classes, like a Job class, should handle imperative statements, a concept reminiscent of the IO monad in Haskell. This ensures clear demarcation of actions that interact with the real world, such as logging or database updates.
- Leverage other object-oriented benefits: Embrace the advantages of object-oriented programming, such as subtyping polymorphism and intuitive representations when needed.
This final diagram presents an MLOps application following the hybrid style’s guidelines. On one hand, we fall back to subtyping polymorphism and class representations to support the program’s extensibility. On the other hand, we reduce the overhead of managing the program state by using and sharing read-only attributes while separating the classes that might have a side effect (i.e., jobs) from the rest of the application.
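The principles listed above can be sketched in a few lines of Python. This is one possible reading of the hybrid style, not the implementation used by the MLOps Python Package. All names (`Dataset`, `ConstantModel`, `Trainer`, `MeanTrainer`, `MedianTrainer`, `TrainingJob`) are hypothetical, and frozen dataclasses stand in for read-only attributes.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass


@dataclass(frozen=True)
class Dataset:
    """Immutable attributes: assigning to rows after creation raises an error."""

    rows: tuple[float, ...]


@dataclass(frozen=True)
class ConstantModel:
    """Result of training, returned as a new object instead of stored state."""

    value: float

    def predict(self, dataset: Dataset) -> tuple[float, ...]:
        return tuple(self.value for _ in dataset.rows)


class Trainer(ABC):
    """Subtyping polymorphism keeps the application extensible."""

    @abstractmethod
    def fit(self, dataset: Dataset) -> ConstantModel: ...


@dataclass(frozen=True)
class MeanTrainer(Trainer):
    def fit(self, dataset: Dataset) -> ConstantModel:
        # Output-oriented: returns the fitted model, updates nothing.
        return ConstantModel(value=sum(dataset.rows) / len(dataset.rows))


@dataclass(frozen=True)
class MedianTrainer(Trainer):
    def fit(self, dataset: Dataset) -> ConstantModel:
        ordered = sorted(dataset.rows)
        return ConstantModel(value=ordered[len(ordered) // 2])


@dataclass(frozen=True)
class TrainingJob:
    """High-level class: the only place where side effects are allowed."""

    trainer: Trainer
    dataset: Dataset

    def run(self) -> tuple[float, ...]:
        model = self.trainer.fit(self.dataset)
        predictions = model.predict(self.dataset)
        print(f"{type(self.trainer).__name__}: {predictions}")  # side effect
        return predictions


dataset = Dataset(rows=(1.0, 2.0, 6.0))
mean_predictions = TrainingJob(MeanTrainer(), dataset).run()
median_predictions = TrainingJob(MedianTrainer(), dataset).run()
```

Swapping the trainer changes the behavior without touching the job, and because the trainers and models never mutate anything, the same dataset object can be shared safely between both jobs.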
Remember that these are guiding principles rather than strict rules. By applying them thoughtfully, you can design a robust application. The MLOps Python Package was developed using these principles and can demonstrate how the hybrid style can be applied effectively to MLOps applications.
This article explored the strengths and weaknesses of two popular programming paradigms: functional and object-oriented. Both styles can be successfully applied to MLOps projects, considering their unique requirements. The primary selection criterion should align with your chosen programming language’s characteristics, allowing you to implement the most idiomatic solutions. For instance, opt for a functional style with Haskell or Clojure and an object-oriented style with Java or Python.
Alternatively, you can leverage the hybrid style to blend the benefits of functional and object-oriented approaches in a predominantly object-oriented language. This choice respects the language’s capabilities while catering to data applications where idempotence and parallelism play crucial roles in taking your application to the next level.
On a personal note, I find the object-oriented style of Python somewhat lacking in elegance. However, I hold Python in high regard for its adaptability to new concepts over time, such as gradual typing or asynchronous programming. To improve Python’s object-oriented style, I recommend using a toolkit like Pydantic, a remarkable library that I’ve extensively employed in designing the MLOps Python Package. It overcomes the aforementioned limitation and significantly enhances the development process.