Keeping Up With Data — Week 25 Reading List
The image above comes from a book by Pedro Domingos. The author argues for the ‘Master Algorithm Hypothesis’: all knowledge, past, present, and future, can be derived from data by a single, universal learning algorithm. He calls on all five tribes of ML to come together and create that master algorithm. As a practitioner, I have the advantage of staying out of the philosophical differences between the five tribes and simply using the fruits of all their research. But the idea of a master algorithm is certainly very appealing.
The importance of problem formulation, an introduction to the evaluation store, and an example of ML supporting human decision makers are on the list this week:
- Data before models, but problem formulation first: “The way you represent your problem is more important than the choice of ML algorithm you throw at your problem”, says Christoph Molnar. So how do we formulate the data science problem for a given business problem, say, churn prevention? The article suggests starting with the end in mind: what do we plan to do with the prediction? How will the model output be used and interpreted? Only then do we choose the right target and, finally, select the data. (Brian Kent @ TDS)
- The Only 3 ML Tools You Need: The ML tooling landscape keeps growing. What are the three fundamental tools needed? A feature store, a model store, and an evaluation store. The first two terms have been around for a couple of years, yet they still feel very new to many people. The third one is being introduced by a start-up building such a product. And what is the evaluation store for? Surfacing model performance metrics, monitoring data drift, and providing a platform for model A/B testing, to name a few of its functions. (Aparna Dhinakaran @ TDS)
- Supporting content decision makers with machine learning: Netflix offers a huge catalogue of series and movies to almost 200M users in over 190 countries. Selecting content is a creative decision, so how does ML support the decision makers? By providing comparable titles and estimating audience size. The solution leverages transfer learning, embedding representations (for both tags and countries), natural language processing (applied to the summary), and supervised learning. Apart from the technical appeal, it’s, imho, a great example of using ML to augment the powers of human experts. (Netflix Technology Blog @ The Netflix Tech Blog)
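
To make the "comparable titles" idea from the last item a bit more concrete, here is a minimal sketch of a nearest-neighbour lookup over title embeddings using cosine similarity. This is not Netflix's implementation, which the post only describes at a high level. The function names, the title names, and the three-dimensional toy vectors are all made up for illustration; real embeddings would be learned and much higher-dimensional.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def comparable_titles(query, embeddings, k=2):
    """Return the k titles whose embeddings are most similar to the query title's."""
    query_vec = embeddings[query]
    scored = [
        (title, cosine_similarity(query_vec, vec))
        for title, vec in embeddings.items()
        if title != query  # a title is trivially comparable to itself
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Hypothetical toy embeddings, invented for this example only.
toy_embeddings = {
    "space_drama": [0.9, 0.1, 0.0],
    "space_comedy": [0.8, 0.3, 0.1],
    "cooking_show": [0.0, 0.2, 0.9],
}

for title, score in comparable_titles("space_drama", toy_embeddings):
    print(f"{title}: {score:.3f}")
```

With these toy vectors, "space_comedy" ranks above "cooking_show" as a comparable for "space_drama". The human expert still makes the call; the ranking just narrows down where to look.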
A lot of driving ahead of me this weekend as we are going to Prague for a wedding. Covid certificate loaded in my Swiss covid cert app so all ready to go!