Keeping Up With Data #100
5 minutes for 5 hours’ worth of reading
Data products, data as a product — these are big topics of the day. Data teams are getting inspiration from products, engineering, communication, marketing, or even whole companies.
The role of data and analytics is not yet as clearly defined as, for instance, the role of finance or HR, and therefore every data team is different. Their focus is different, the data they work with is different, the business problems they are solving are different, and the context in which they operate is different.
That is why every data team needs to devise its own playbook to succeed. I like to think about data teams as little start-ups. They need to raise their budgets, defend their existence, focus on high-value problems, communicate their value, drive the adoption of their products, and grow their valuation.
Can you do that with suppliers constantly changing the format and quality of their goods? Without understanding your customers? While trying to be everything to everyone? Without clear priorities? Certainly not. But that’s exactly how many data teams operate.
This week’s reading list looks at data products, data production, data-as-a-product, and AI regulation.
- If data is a product, what is production? We are hearing a lot lately about managing data as a product. But products are meant to end up in production, and there is always a clear distinction between what’s in production and what’s not. Products in production are supported; once-off experiments are not. Product teams are always keen to hear ideas for improvements, but they make the call on which of them will be built and deployed (into production). Not many data teams operate like that. Many ad-hoc analyses are not properly labelled as once-off, and production is not clearly delineated. And that’s a problem. Having a few properly supported dashboards or ML models is much better than having dozens of solutions floating around without any proper management and maintenance. I’m sure it’s even in the interest of the business to prioritise quality over quantity. What’s needed is just a few robust analytic solutions forming the backbone of the company’s decision making and operations, with experiments and ad-hoc analyses explicitly marked with a ‘best before’ date. (Benn Stancil)
- Deploying Data Products at the speed of the business: Let’s move from data-as-a-product to data products. The building blocks of the data mesh approach (self-service infrastructure and federated data product development, oriented around domains and owned by independent cross-functional teams) are shifting the focus from centralised infrastructure to domain-oriented data products. The infrastructure and tooling that enable quick deployment of new data products (and their maintenance, support, and distribution to users) become a secondary concern. I think centralisation and decentralisation each have their pros and cons, and the balance between them should be constantly re-calibrated. Domain experts are best positioned to own the data products. But the economics of data work best when data and analytics are used and re-used as much as possible. (Dataception)
- The EU’s attempt to regulate open-source AI is counterproductive: “The regulation of general-purpose AI (GPAI) is currently being debated by the European Union’s legislative bodies as they work on the Artificial Intelligence Act (AIA). One proposed change from the Council of the EU (the Council) would take the unusual, and harmful, step of regulating open-source GPAI. While intended to enable the safer use of these tools, the proposal would create legal liability for open-source GPAI models, undermining their development. This could further concentrate power over the future of AI in large technology companies and prevent research that is critical to the public’s understanding of AI.” (Brookings)
That’s it for the week. Now it’s time to relax!