Keeping Up With Data #85
5 minutes for 5 hours’ worth of reading
Yesterday was one of those days when nothing goes right. At least the reading list was fully in my control! I’ve been discussing football analytics a lot lately, and the phrase “Moneyball for football” has come up very often. So I wasn’t surprised to see David Langer’s post about Moneyball on LinkedIn. But David wasn’t talking about football. He was using the movie as a reference point for data literacy and data culture: how a data transformation clashes with the existing culture, requires a senior sponsor, faces people doing everything they can to prove it won’t work, and is overall a bumpy road towards a competitive advantage.
Data quality, data products, and decision-driven thinking are on today’s menu.
- The Existential Threat of Data Quality: Poor data quality results in incorrect decisions, complex processes, and inefficient use of resources. What are the major causes of poor data quality according to Chad? One is upstream quality: the source systems generate data that was never intended for analysis, and software engineers make changes in the production database, often with fatal impact on analytics and ML. The second is downstream quality, which refers to the gap between the skillsets of modern data consumers and the requirements of conducting analysis and ML at scale. Data scientists and data analysts are trying their best to create business value, but in doing so they often increase the complexity of already very complex data flows. No matter how many new categories are introduced to the modern data stack, they won’t solve the problem. But as the article concludes, there is an answer, one that is fundamentally cultural and primarily a problem of collaboration: treating data as a product. (Data Products)
- Introducing the Data Product Development Canvas (Version 1.0): We are hearing a lot about data products. Bill describes them as a “category of domain-infused, AI/ML-powered apps designed to help non-technical users manage data and analytics-intensive operations to achieve specific, meaningful, and relevant business outcomes.” The data product development canvas is a tool that helps business and data teams work through the design of a data product. It starts with the business problem and then focuses on measures of success, benefits from addressing the problem, impediments to implementation or operations, and nine other topics. But the success of data products always depends on the organisation’s data management and governance. What can help with that? SPECTRE, just like in the James Bond movies. But this Special Executive should focus on Collaboration, Transparency, Reuse, and Economics of Data instead of Counter-intelligence, Terrorism, Revenge and Extortion. (Data Science Central)
- Be Decision-Driven, Not Data-Driven: Should a decision-making culture be data-driven or decision-driven? Researchers argue that the latter is better, as it ties data and analytics efforts to decisions about key business problems. How does a decision-driven approach differ from a data-driven one? It starts with questions, not data; the projects are led by decision-makers, not data scientists; it ponders unknowns more than knowns; it looks wide first, then dives deep; it builds new data boxes; it spots and reduces bias; and it focuses on the future, not the past. (Mark Palmer @ tds)
Two movie references in the post present an interesting dilemma: should I watch Moneyball, a James Bond movie, or something else over the weekend?