Keeping Up With Data #97

5 minutes for 5 hours’ worth of reading

Adam Votava
3 min read · Aug 26, 2022

Buyout firms almost always perform technology due diligence on pure-play software companies, says Bain’s report. The problem is that technology plays an important role in many more companies that are not necessarily software businesses.

In fact, 31% of all buyouts involved pure-play technology companies in 2021. And there are many other deals where technology is central to the value proposition.

I was surprised to see that only 9% of buyers perform tech due diligence (under which Bain also counts data and analytics). The number, while growing, is still very low given the role technology, digital, and data play (or could play, as a value-creation opportunity) in many companies.

This is perhaps because, historically, tech due diligence was only about the technology itself, and the reports were dry, technical, and felt disconnected from the investment. Recent trends show that tech due diligence providers increasingly focus on how technology powers business models and operations, and how it mitigates risks.

I hope that data due diligence won’t take that long to become a standard diligence item.

To build on the links between technology, data, and business, let’s enjoy the following three articles on today’s reading list.

  • The Rise of Data Contracts: Changes that engineering teams make to operational databases can (and do) have a massive impact on downstream data use cases. They can result in broken data pipelines, changes in the behaviour of ML models, or slowly deteriorating data quality, and consequently in eroded trust in the company’s data, paralysing the data’s value-creation possibilities. The problem has to be fixed upstream, and the solution might lie in data contracts. Data contracts provide agreed standards between data producers (often SWE) and data consumers (data teams and ultimately the business and its customers). Just like data teams should be getting close to the business to focus on the right problems, they should also be building strong relationships with engineering to ensure the reliability and availability of the data required to power the business and its operations. (Data Products)
  • Speed Running The Data Infrastructure Industry: After the cloud revolution, we have seen a lot of new tools supporting the data industry and aspiring to make the life of data teams easier. But it might have become too easy, resulting in excessive or sub-optimal implementations that cost companies lots of money. A poorly designed data model leads to long and costly data pipelines. And because it’s so easy for an eager data engineer to ingest and store data without thinking deeply about the data model of the business, the whole data infrastructure is getting incredibly complex. That doesn’t stop data scientists and analysts from creating and running even more complex solutions on top of the complex data storage. It just costs more and more money. FinOps (managing an organisation’s cloud costs) is rightly becoming an imperative for many. (Seattle Data Guy)
  • Down with the DAG: Scheduling data jobs is usually done with DAGs (directed acyclic graphs). We say when to start the first process and in what sequence the other jobs follow once the previous one has finished. It works like dominoes. But wouldn’t it be nice to just say what we want, instead of designing how it should be done? Can’t we simply define how fresh we need the final data to be? Can it be a day old? An hour old? Why do we have to design the whole sequence rather than let some system figure out the complexities in the background? That would create an experience similar to the one we get at airports: we just focus on when our flight is supposed to take off or land, without having to worry about the complexities and dependencies of running an airport. (Benn Stancil)
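
To make the "just say how fresh the data should be" idea from Down with the DAG a bit more concrete, here is a minimal Python sketch. We only state a target dataset and a maximum age, and a small function works out what has to be rebuilt and in which order. The dataset names, timestamps, and the `plan_refresh` function are all made up for illustration; this is an assumption about how such a system could reason, not how any particular orchestrator actually works.

```python
from datetime import datetime, timedelta

# Hypothetical lineage: each dataset maps to the datasets it is built from.
UPSTREAMS = {
    "raw_orders": [],
    "raw_customers": [],
    "orders_clean": ["raw_orders"],
    "revenue_report": ["orders_clean", "raw_customers"],
}

# Hypothetical "last refreshed" timestamps for each dataset.
NOW = datetime(2022, 8, 26, 12, 0)
LAST_UPDATED = {
    "raw_orders": datetime(2022, 8, 26, 11, 30),
    "raw_customers": datetime(2022, 8, 26, 6, 0),
    "orders_clean": datetime(2022, 8, 26, 9, 0),
    "revenue_report": datetime(2022, 8, 26, 9, 30),
}


def plan_refresh(target, max_age, last_updated, now, upstreams):
    """Return the datasets to rebuild (upstream first) so that `target`
    and everything it depends on is no older than `max_age`."""
    plan = []      # datasets to rebuild, in a valid execution order
    decided = {}   # dataset name -> True if it needs a rebuild

    def needs_rebuild(name):
        if name in decided:
            return decided[name]
        # Visit every upstream first (a list, so none is skipped),
        # which also guarantees upstreams land in the plan before `name`.
        upstream_rebuilt = any([needs_rebuild(u) for u in upstreams[name]])
        too_old = now - last_updated[name] > max_age
        decided[name] = upstream_rebuilt or too_old
        if decided[name]:
            plan.append(name)
        return decided[name]

    needs_rebuild(target)
    return plan


# "The report may be at most an hour old."
print(plan_refresh("revenue_report", timedelta(hours=1), LAST_UPDATED, NOW, UPSTREAMS))
# → ['orders_clean', 'raw_customers', 'revenue_report']

# "A day old is fine." Nothing needs to run.
print(plan_refresh("revenue_report", timedelta(days=1), LAST_UPDATED, NOW, UPSTREAMS))
# → []
```

The dependencies are still there, of course. The difference is that we describe the outcome (how fresh the report should be) and the sequence of jobs falls out of it, instead of being designed by hand.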

Let’s enjoy the last summer weekend!

In case you missed last week’s issue of Keeping Up With Data

Thanks for reading!

Please feel free to share your thoughts or reading tips in the comments.

Follow me on Medium, LinkedIn and Twitter.



Adam Votava

Data scientist | avid cyclist | amateur pianist (I'm sharing my personal opinion and experience, which should not be considered professional advice)