MLOps Roadmap

Taking ML models from notebooks to production. The engineering side of ML.

ML system design principles

The gap between “model works in notebook” and “model works in production” is where most ML projects die. Key principles:

  • Reproducibility: any result must be recreatable. Version code, data, config, and environment together
  • Testability: test data assumptions, model behavior, and infrastructure — not just unit tests
  • Modularity: separate data loading, preprocessing, training, and serving so each can change independently
  • Automation: if a human does it twice, automate it. Manual steps are where errors live
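To make the testability principle concrete, here is a minimal sketch of a data-assumption check that could run before training or serving. The schema, the column names (`age`, `income`), and the bounds are hypothetical and chosen only for illustration; a real project would more likely use a dedicated validation library.

```python
import math

# Hypothetical schema (an assumption for this sketch):
# column name -> (expected type, minimum, maximum)
SCHEMA = {
    "age": (int, 0, 120),
    "income": (float, 0.0, 1e7),
}


def validate_rows(rows, schema=SCHEMA):
    """Return a list of human-readable violations; an empty list means the batch passed."""
    errors = []
    for i, row in enumerate(rows):
        for col, (typ, lo, hi) in schema.items():
            if col not in row:
                errors.append(f"row {i}: missing column {col!r}")
                continue
            value = row[col]
            # bool is a subclass of int in Python, so reject it explicitly
            if isinstance(value, bool) or not isinstance(value, typ):
                errors.append(
                    f"row {i}: {col!r} is {type(value).__name__}, expected {typ.__name__}"
                )
                continue
            if isinstance(value, float) and math.isnan(value):
                errors.append(f"row {i}: {col!r} is NaN")
                continue
            if not lo <= value <= hi:
                errors.append(f"row {i}: {col!r}={value} outside [{lo}, {hi}]")
    return errors


if __name__ == "__main__":
    batch = [{"age": 30, "income": 50000.0}, {"age": -1, "income": 1200.0}]
    for problem in validate_rows(batch):
        print(problem)
```

Returning a list of violations rather than raising on the first one lets a pipeline log every problem in a batch and then decide whether to fail the run.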

Key tools

Category              Tools
Experiment tracking   MLflow, Weights & Biases, TensorBoard
Model serving         FastAPI, TorchServe, Triton, ONNX Runtime
Pipelines             Airflow, Prefect, Kubeflow, Dagster
Feature stores        Feast, Tecton, Hopsworks
Data versioning       DVC
Containerization      Docker

The progression

  1. Jupyter notebook (exploration)
  2. Python script (reproducible)
  3. Experiment tracking (comparable)
  4. Docker container (portable)
  5. API endpoint (accessible)
  6. Monitoring + retraining (reliable)
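Steps 2 and 3 of the progression can be sketched with the standard library alone. The example below is an illustration under assumptions, not a substitute for a tracking tool such as MLflow: it fingerprints the config, data, and Python version, runs a seeded stand-in for training, and appends one JSON line per run. The function names, the config keys (`seed`, `sample_size`), and the `runs.jsonl` file name are hypothetical.

```python
import hashlib
import json
import platform
import random


def run_fingerprint(config: dict, data_bytes: bytes) -> str:
    """Hash config, data, and Python version into one short run ID."""
    h = hashlib.sha256()
    h.update(json.dumps(config, sort_keys=True).encode("utf-8"))
    h.update(hashlib.sha256(data_bytes).digest())
    h.update(platform.python_version().encode("utf-8"))
    return h.hexdigest()[:12]


def train(config: dict, data: list) -> dict:
    """Stand-in for training: a seeded, deterministic 'metric' (assumption)."""
    rng = random.Random(config["seed"])  # local RNG, no hidden global state
    sample = rng.sample(data, k=config["sample_size"])
    return {"mean_of_sample": sum(sample) / len(sample)}


def make_record(config: dict, data: list) -> dict:
    """Bundle everything needed to compare or recreate a run."""
    data_bytes = json.dumps(data).encode("utf-8")
    return {
        "run_id": run_fingerprint(config, data_bytes),
        "config": config,
        "metrics": train(config, data),
    }


def append_record(path: str, record: dict) -> None:
    """Append one run as a JSON line so runs stay comparable over time."""
    with open(path, "a", encoding="utf-8") as f:
        f.write(json.dumps(record, sort_keys=True) + "\n")


if __name__ == "__main__":
    cfg = {"seed": 7, "sample_size": 3}
    record = make_record(cfg, [1.0, 2.0, 3.0, 4.0, 5.0])
    append_record("runs.jsonl", record)
    print(record["run_id"], record["metrics"])
```

Because the run ID is derived from the inputs, two runs with the same config, data, and interpreter version share an ID and produce the same metrics, while any change to the config or data yields a new ID.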

Maturity levels

Level   What you have                              What’s missing
0       Notebooks, manual everything               Reproducibility, automation
1       Scripts + experiment tracking              CI/CD, monitoring
2       Automated pipelines + model registry       Drift detection, feature stores
3       Full CI/CD, monitoring, auto-retraining    You’re doing well

Most teams are at level 0 or 1. As a rule of thumb, getting to level 2 removes the bulk of production pain.
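Drift detection, listed as the gap at level 2, can start very simply. The sketch below uses the Population Stability Index (PSI) to compare a live feature sample against a training-time reference. PSI is one common choice picked here for illustration, not a method prescribed by this roadmap, and the bin count and `eps` floor are assumptions.

```python
import math


def psi(expected, actual, bins=10, eps=1e-6):
    """Population Stability Index between a reference sample and a live sample."""
    lo, hi = min(expected), max(expected)
    # Fall back to width 1.0 when all reference values are identical
    width = (hi - lo) / bins or 1.0

    def proportions(values):
        counts = [0] * bins
        for v in values:
            idx = int((v - lo) / width)
            # Clamp live values outside the reference range into the edge bins
            idx = min(max(idx, 0), bins - 1)
            counts[idx] += 1
        # Floor at eps so empty bins do not produce log(0)
        return [max(c / len(values), eps) for c in counts]

    e, a = proportions(expected), proportions(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))


if __name__ == "__main__":
    reference = [i / 100 for i in range(100)]    # training-time feature values
    live = [0.5 + i / 200 for i in range(100)]   # shifted production values
    print(f"PSI vs itself:  {psi(reference, reference):.3f}")
    print(f"PSI vs shifted: {psi(reference, live):.3f}")
```

A widely quoted rule of thumb treats PSI below 0.1 as stable and above 0.2 as a significant shift, but the thresholds should be tuned per feature before they trigger alerts or retraining.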

Technical debt in ML

“Hidden Technical Debt in Machine Learning Systems” (Sculley et al., Google, 2015) observed that the ML code itself is only a small fraction of a real-world ML system. The rest is surrounding infrastructure:

  • Data collection, cleaning, validation — the largest time sink
  • Feature extraction and management — duplicated across teams without feature stores
  • Configuration and pipeline glue — the fragile connective tissue
  • Monitoring and testing — often bolted on as an afterthought

The paper’s key insight: ML systems have all the maintenance problems of traditional software, plus a set of ML-specific issues (data dependencies, feedback loops, entanglement between features).

Reading list

  • “Hidden Technical Debt in Machine Learning Systems” (Sculley et al., 2015) — the foundational paper
  • “Rules of Machine Learning” (Martin Zinkevich, Google) — practical engineering wisdom
  • “Reliable Machine Learning” (Cathy Chen et al., O’Reilly) — production ML systems
  • “Designing Machine Learning Systems” (Chip Huyen) — end-to-end ML system design