Keeping up with data — Week 44 reading list

A curious mind wandering through the world of data

Adam Votava
2 min read · Oct 30, 2020
CI/CD and automated ML pipeline. Source: MLOps: Continuous delivery and automation pipelines in machine learning

The rise of MLOps continues, so I’m spending a lot of time in the MLOps Community Slack and reading through the Awesome MLOps curated list of references, trying to keep up. Since this week was marked by road rash from an MTB ride, I had more time to read!

Enjoy the weekend with my week’s top 5 reads.

  1. MLOps: Continuous delivery and automation pipelines in machine learning: implementing ML in a production environment doesn’t only mean deploying your model as an API for prediction. Rather, it means deploying an ML pipeline that can automate the retraining and deployment of new models. (Google Cloud Articles)
  2. MLOps Principles: Guiding principles (versioning, testing, automation, reproducibility, deployment, monitoring) and best practices (documentation, project structure) to help reduce the “technical debt” of your ML projects. Quite a long read covering all the main elements of an ML project: data, ML model and code. One you will revisit again and again. (MLOps)
  3. When to Run Bandit Tests Instead of A/B/n Tests: If you care about optimisation, rather than understanding, bandits are often the way to go. Bandit algorithms tend to work well for really short tests — and paradoxically — really long tests (ongoing tests). Though the post is about marketing, one can see the applicability of the approach to testing ML models in production. (CXL)
  4. Explainable Monitoring: Stop flying blind and monitor your AI: ML systems are not deterministic and therefore traditional DevOps and business KPI monitoring is not sufficient. ML monitoring needs to address challenges unique to ML, such as model decay, data integrity/quality or outliers. (Fiddler)
  5. The Rise of ML Ops: Why Model Performance Monitoring Could Be the Next Billion-Dollar Industry: Data-powered applications are more and more common, which brings a need to ensure they’ll keep performing well over time. Anything can go wrong with the underlying ML models, so solutions that monitor data, models and code might well be the next big thing. (Two Sigma Ventures)
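
To make the bandit idea from read 3 a bit more concrete, here is a minimal sketch of an epsilon-greedy bandit, one of the simplest bandit strategies. It is not taken from the CXL post: the function name `run_epsilon_greedy`, the conversion rates and the parameter values are all made up for illustration. The point is only to show how traffic drifts towards the better variant while the test is still running, which is the optimisation-over-understanding trade-off the post describes.

```python
import random


def run_epsilon_greedy(true_rates, rounds=5000, epsilon=0.1, seed=0):
    """Simulate an epsilon-greedy bandit.

    true_rates: hypothetical conversion rate of each variant (unknown
    to the algorithm; used only to simulate user responses).
    Returns (pulls, wins): how often each variant was shown and how
    often it converted.
    """
    rng = random.Random(seed)
    n_arms = len(true_rates)
    pulls = [0] * n_arms
    wins = [0] * n_arms

    def observed_rate(i):
        # Variants that were never shown are treated as rate 0.0.
        return wins[i] / pulls[i] if pulls[i] else 0.0

    for _ in range(rounds):
        if rng.random() < epsilon:
            # Explore: show a random variant.
            arm = rng.randrange(n_arms)
        else:
            # Exploit: show the variant with the best observed rate so far.
            arm = max(range(n_arms), key=observed_rate)

        pulls[arm] += 1
        # Simulate whether this impression converted.
        if rng.random() < true_rates[arm]:
            wins[arm] += 1

    return pulls, wins


if __name__ == "__main__":
    # Two hypothetical variants: 5% and 20% conversion rate.
    pulls, wins = run_epsilon_greedy([0.05, 0.20])
    for i, (p, w) in enumerate(zip(pulls, wins)):
        print(f"variant {i}: shown {p} times, converted {w} times")
```

In a classic A/B test both variants would get roughly half the traffic for the whole run. Here the weaker variant only keeps getting the small exploration share, so fewer impressions are "wasted" on it. The same loop could in principle route requests between two models in production instead of two landing pages.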

Development and deployment of ML models is arguably getting easier. The question is: are ML engineers and data scientists ready for the associated responsibility? The opportunity is huge, but mistakes will be expensive.


Adam Votava

Data scientist | avid cyclist | amateur pianist (I'm sharing my personal opinion and experience, which should not be considered professional advice)