Keeping Up With Data #73
5 minutes for 5 hours’ worth of reading
We have seen a lot of solidarity around the world in the last couple of days. It’s great to see people helping each other without any personal benefit. In the data world, too, we see a lot of initiatives where one can feel the power of community. Hugging Face is one of them. Looking for a pre-trained model? Or a data set to train your own model? Hugging Face is the place to visit.
Three articles from three industry experts. Enjoy!
- MLOps Is a Mess But That’s to be Expected: MLOps is still a very nascent field, with plenty of tools, practices, and standards. These will mature slowly. As with DevOps a decade ago, MLOps is not just about new tools but requires a mindset shift. And that’s never easy and always takes a long time. So what is in the future for MLOps? According to Mihail, we’ll see a lot of investment going into monitoring of ML systems, real-time ML, and better data management. But the key will lie in adopting a machine learning mindset about products and teams: continually experimenting, continually learning. (Mihail Eric)
- Data Science Project Quick-Start: Data science is both an art and a science. Eugene’s advice allows us to save our creativity for finding a solution rather than wasting it on structuring the process. Understanding the problem and its context is key. What is equally important, and often gets skipped, is formulating the requirements, constraints, and metrics. Digging into the data early can save us a lot of trouble later. And a standardised, automated experiment pipeline will protect our mental health during the last days (nights?) of a project, when everything comes under scrutiny. (Eugene Yan)
- A different way to “bundle” Data Platforms: There is an ever-growing number of tools and even categories of tools in the modern data stack. Will it ever change? Will we see a consolidation soon? Petr offers an alternative approach to simplifying the management of data stack tools, focusing on workflows that span categories, such as provisioning, observability, logging, and access management. Because, as Petr puts it: “if the key workflows that need to span across various parts of the stack are integrated for a cohesive end-user experience, the number of tools and layers would be less of a problem.” Could adding more tools be a way to bundle data platforms? (Petr Janda)
Winter is almost over, which brings extra complexity to weekend plans. It’s not an obvious skiing or cycling/hiking scenario as in December or July. But that’s a nice problem to have, isn’t it?