Keeping Up With Data #106

5 minutes for 5 hours’ worth of reading

Adam Votava
2 min read · Oct 28, 2022

“Write programs that do one thing and do it well” is one of the principles of the Unix philosophy. It is based on a divide-and-conquer approach, with the next principle being: “write programs to work together.”

One of the articles in this week’s reading list links these principles to data products, data mesh, and data contracts. But they are powerful principles that can provide guidance when creating designs or solving problems in many other areas, too.

They apply even when writing SQL transformations, building Databricks pipelines, or designing an ML solution, because we can be sure we’ll have to edit, update, upgrade, and re-use the building blocks over time. And we will appreciate the ability to work on each problem in isolation from the other components.
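To make the idea concrete, here is a minimal Python sketch of a transformation built from small steps that each do one thing and are then composed to work together. The step names (`drop_missing_amounts`, `convert_to_cents`), the `compose` helper, and the sample order rows are all hypothetical and chosen only for illustration.

```python
from functools import reduce
from typing import Callable

Row = dict
Step = Callable[[list[Row]], list[Row]]


def drop_missing_amounts(rows: list[Row]) -> list[Row]:
    """Do one thing: remove rows that have no amount."""
    return [row for row in rows if row.get("amount") is not None]


def convert_to_cents(rows: list[Row]) -> list[Row]:
    """Do one thing: add an integer amount in cents to every row."""
    return [{**row, "amount_cents": round(row["amount"] * 100)} for row in rows]


def compose(*steps: Step) -> Step:
    """Make the steps work together by applying them left to right."""
    def pipeline(rows: list[Row]) -> list[Row]:
        return reduce(lambda acc, step: step(acc), steps, rows)
    return pipeline


# Hypothetical input data, purely for illustration.
orders = [
    {"id": 1, "amount": 12.5},
    {"id": 2, "amount": None},
    {"id": 3, "amount": 3.0},
]

clean_orders = compose(drop_missing_amounts, convert_to_cents)
result = clean_orders(orders)
```

Each step can be tested, replaced, or re-used in another pipeline without touching the others, which is the isolation the principles are meant to buy us.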

It’s been a busy week and I’m boarding a plane in a few minutes, which explains the brevity of today’s reading list.

This week’s reading list looks at data valuation, data contracts, and expert inputs.

  • New research puts a price on the value of financial data: “Companies often buy data to make smarter decisions — but only if the value they get from the data is greater than the price they paid for it.” The approach focuses specifically on financial data and shows how the value depends on the buyer. Larger investors might benefit more from the same data than smaller ones and are therefore willing to pay more. In other words, valuing data is an economic exercise, and to get the highest value from data, it needs to be used to solve the highest-value problems. (MIT Sloan)
  • Data Contracts: The Mesh Glue: “With the ultimate goal of building trust on “someones else” data products, data contracts are artifacts that sits at the intersection of a a) business glossary providing rich semantics, b) a metadata catalog providing information about the structure and c) a data quality repository setting expectations about the content across different dimensions.” (Luis Velasco @ TDS)
  • Experts in-the-Loop at Stitch Fix: “Each day, a random sample of the day’s algorithmically-generated outfits shared with clients is collected and stored. Then, our styling experts evaluate — against a developed set of criteria and using a specific labeling tool — what makes a good outfit. The evaluation criteria was developed in conjunction with members of our styling field, to ensure that we were maximizing the expertise of our Styling Team Leads and their stylistic judgment based on their own expertise and deep understanding of our clients.” (MultiThreaded)
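As a rough illustration of the data contract idea from the second article, the sketch below bundles the three ingredients the quote mentions: semantics, structure, and quality expectations. This is not the article's implementation; the `DataContract` class, its field names, and the sample `orders` data are assumptions made up for this example.

```python
from dataclasses import dataclass, field
from typing import Any, Callable


@dataclass
class DataContract:
    """Hypothetical contract: semantics + structure + quality expectations."""
    name: str
    semantics: dict[str, str]   # business glossary: column -> meaning
    schema: dict[str, type]     # structure: column -> expected Python type
    expectations: dict[str, Callable[[Any], bool]] = field(default_factory=dict)

    def violations(self, rows: list[dict]) -> list[str]:
        """Return a human-readable message for every broken promise."""
        problems = []
        for i, row in enumerate(rows):
            for column, expected_type in self.schema.items():
                if column not in row:
                    problems.append(f"row {i}: missing column '{column}'")
                elif not isinstance(row[column], expected_type):
                    problems.append(
                        f"row {i}: '{column}' is not {expected_type.__name__}"
                    )
                elif column in self.expectations and not self.expectations[column](row[column]):
                    # Quality checks run only on values that passed the type check.
                    problems.append(f"row {i}: '{column}' failed its quality expectation")
        return problems


# Hypothetical contract and data, purely for illustration.
orders_contract = DataContract(
    name="orders",
    semantics={"order_id": "Unique order identifier", "amount": "Order value in USD"},
    schema={"order_id": int, "amount": float},
    expectations={"amount": lambda value: value >= 0},
)

rows = [
    {"order_id": 1, "amount": 10.0},
    {"order_id": 2, "amount": -5.0},
    {"order_id": "3", "amount": 1.0},
]
problems = orders_contract.violations(rows)
```

A consumer could run such a check before trusting someone else's data product, and a producer could run it before publishing.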

Enjoy the weekend!

In case you missed last week’s issue of Keeping Up With Data

Thanks for reading!

Please feel free to share your thoughts or reading tips in the comments.

Follow me on Medium, LinkedIn and Twitter.



Adam Votava

Data scientist | avid cyclist | amateur pianist (I'm sharing my personal opinion and experience, which should not be considered professional advice)