MLOps (Battina 2019) combines the terms and practices of machine learning and DevOps.
DevOps, in turn, folds all the phases and roles of a development team into each and every team member, who will act as code writer, QA leader and system programmer.
The objective of machine learning (ML) is to create models from data that can be used for prediction, classification or visualization, solving specific problems for specific customers. Combine that with DevOps and you get the reproducible and seamless creation of ML workflows that can be continuously deployed to production.
Scientific computing and development, data science included, have generally been kept apart from mainstream best practices in companies; this is partly because science is driven by public funding rather than by a customer or a new product, and partly because of the traditional division of science into silos.
DevOps was an attempt to break silos (Development, QA, Ops) in software engineering; MLOps tries to break silos in machine learning, and possibly, in time, in data science.
By data science I mean the academic part of it.
This course, or this material, depending on how you arrive at it, is mainly addressed to students who have been living in traditional scientific computing environments, and will try to endow them with an agile mindset that will be a useful skill if they want to jump from academia to industry. If they are already in industry, it will try to help them get acquainted with data science concepts and techniques that will help them jump-start a career in the coveted data engineering wage bracket.
I hope this is useful to anyone arriving here from anywhere, before, during or after the in-person summer course.
As many articles, such as this one, point out, being a successful data scientist or engineer requires a wide range of different skills.
These different parts will be studied, with varying intensity, in this course. We will not devote a lot of time to math and visualization (just enough to tell you that you need to use them); obviously, MLOps is the focus of the third pillar, but we will also spend some time understanding why domain knowledge is absolutely necessary for the design of successful MLOps systems.
MLOps consists in the automation, within the framework of an agile team, of the following steps:
Over this course, we’ll try to go through these steps, starting with the very important step 0: create valuable products and develop using an agile mindset.
The site Machine Learning Operations is a collection of concepts and tutorials about MLOps.
Battina, Dhaya Sindhu. 2019. “An Intelligent DevOps Platform Research and Design Based on Machine Learning.” Training 6 (3).
Kreuzberger, Dominik, Niklas Kühl, and Sebastian Hirschl. 2022. “Machine Learning Operations (MLOps): Overview, Definition, and Architecture.” arXiv Preprint arXiv:2205.02302.
Tamburri, Damian A. 2020. “Sustainable MLOps: Trends and Challenges.” In 2020 22nd International Symposium on Symbolic and Numeric Algorithms for Scientific Computing (SYNASC), 17–23. IEEE.