MLOps course for Nova-Lisbon. Index | Github Repository

A few guiding principles

In general, most machine learning course tend to focus on a single example, single technique, single language. This leads to low engagement by the student, low adaptability of written code, and also a steep learning curve if the student needs to move to industry. My main intention here is to try and bridge the gap between industry and academic ML by gathering best practices in a MLOps (Kreuzberger, Kühl, and Hirschl 2022) course that will include the main concepts related with it.

These are a few principles behind this course, my vision on data science/machine learning (and scientific software development in general) and how it is taught. They are written here explicitly for a bit of background on the rest of the material.

  1. No single language can span all necessary techniques needed to become a developer (in data science or anywhere else). Same applies to any tool; you need to have the skill to choose the right tool for the job, not a single tool.
  2. Reproducibility implies following best practices in DevOps, like configuration in the environment
  3. Always add value to the customer by never losing track of the problem that is being solved.
  4. Reproducibility begs free software; if a MLOps workflow is tied to specific companies and tools reproducibility will suffer (or simply be impossible).
  5. MLOps is DevOps, which is about acquiring knowledge to define infrastructure and tests for deployed applications.
  6. When in doubt, be agile.

Kreuzberger, Dominik, Niklas Kühl, and Sebastian Hirschl. 2022. “Machine Learning Operations (MLOps): Overview, Definition, and Architecture.” arXiv Preprint arXiv:2205.02302.