5 minutes for 5 hours’ worth of reading

Source: https://www.reforge.com/blog/scaling-data

Many companies want to ‘start with data’ properly. So, they wait for the perfect moment: when a new system is implemented, more data is collected, a team is hired, the transformation is finished, and so on. Unfortunately, the perfect moment never comes. Just as in life. My usual advice is to just do it. Start with a data strategy built around the imperfections rather than waiting forever. But the first step is always the hardest.

To make it a bit easier, check out some of the articles that caught my attention this week.

  • Scaling Data: Data Informed to Data Driven to Data…


5 minutes for 5 hours’ worth of reading

Source: https://ex.pegg.io (including a video covering the cheat sheet)

Artificial intelligence is making more and more decisions in our lives, so naturally we need to make sure these systems are making the right decisions for the right reasons, especially in high-stakes decisions related to lives, health, or large amounts of money. That’s why explainable AI is important. The cheat sheet above is from Jay Alammar, and it’s accompanied by a short video that provides a fantastic overview in just 15 minutes.

And if you have another 15 minutes, I recommend reading the following three articles.

  • What is the Open Data Ecosystem and Why it’s Here To Stay…


5 minutes for 5 hours’ worth of reading

Source: https://romans.medium.com/data-as-a-service-a-new-era-in-analytics-abe834a150db

I was very happy to read about two Czech tech companies in the news this week. First, GoodData moved into the Data as a Service space with the announcement of its cloud-native analytics platform GoodData.CN. Then, yesterday, came the news that ProductBoard raised $72 million in a Series C round. Well done to both!

  • The Future of Data Lineage — Beyond a Diagram: The purpose of data lineage is not to have a nice diagram to look at. The purpose is to solve data engineering problems: What is the impact of this infrastructure change? Why is this dashboard broken? Is…


5 minutes for 5 hours’ worth of reading

Source: https://medium.com/data-for-ai/building-real-time-ml-pipelines-with-a-feature-store-9f90091eeb4

The image above comes from an article about feature stores. Moving from batch to real time brings many challenges. One of the most painful is feature engineering. Making sure the features used for training a model are the same as the ones used for scoring in real time has caused a lot of grey hair. My business is facing this problem in one of our current assignments, and I certainly hope we’ll solve it without too much stress.
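One common way to reduce this mismatch is to define the feature logic once and call it from both the batch training path and the real-time scoring path. The sketch below is a plain-Python illustration of that idea, not a feature store and not the approach from the linked article; the `Transaction` fields, the feature names, and the cap of 1000 are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Transaction:
    # Hypothetical raw event; the field names are illustrative only.
    amount: float
    hour_of_day: int


def build_features(tx: Transaction) -> Dict[str, float]:
    """Single definition of the feature logic, shared by both paths."""
    return {
        # Cap extreme amounts at an arbitrary, assumed threshold.
        "amount_capped": min(tx.amount, 1000.0),
        # Flag events between 22:00 and 05:59.
        "is_night": 1.0 if tx.hour_of_day < 6 or tx.hour_of_day >= 22 else 0.0,
    }


def training_rows(history: List[Transaction]) -> List[Dict[str, float]]:
    # Batch path: applied to historical data to build the training set.
    return [build_features(tx) for tx in history]


def scoring_row(event: Transaction) -> Dict[str, float]:
    # Real-time path: applied to a single incoming event at scoring time.
    return build_features(event)
```

Because both paths call `build_features`, a change to the feature logic cannot reach training without also reaching scoring. In a real system the harder part is usually keeping time-windowed aggregates consistent, which this sketch does not attempt.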

A bit of a ‘pop science’ reading list with a very high frequency of the word “AI”. But it includes strawberries too.

  • Building…


5 minutes for 5 hours’ worth of reading

Source: https://medium.com/bigdatarepublic/two-steps-towards-a-modern-data-platform-37c74e7c104b

Who wouldn’t want to listen to new songs by Jimi Hendrix, Kurt Cobain, Jim Morrison or Amy Winehouse? Thanks to the Lost Tapes of the 27 Club it is now possible. Well, actually the songs were not written by these amazing musicians, who all died at the age of 27. An AI algorithm generated a string of all-new hooks, rhythms, melodies, and lyrics, which audio engineers then used to ‘compose’ the final songs.

Hope the following reading list will offer a decent lick to finish their new solos!

  • Two steps towards a modern data platform: “Wouldn’t it be great if…


5 minutes for 5 hours’ worth of reading

Source: https://medium.com/diaryofawannapreneur/deep-learning-for-computer-vision-for-the-average-person-861661d8aa61

Just yesterday I read an article by SeattleDataGuy about why not to start a data science consulting business. Given that this is what I’ve been doing for most of the last six years, the advice came a bit late! One of the reasons made me laugh at first: “you may think you’re going to be your own boss — but you’ll actually be working for more people than before”. Maybe because it’s true. But at the same time, I find the situation with multiple ‘bosses’ strangely liberating, since it comes with variety and optionality.

This week’s list…


5 minutes for 5 hours’ worth of reading

Source: https://eng.uber.com/ubers-journey-toward-better-data-culture-from-first-principles/

Going the extra mile despite a 95-hour week might be the norm at Goldman Sachs, but it’s not for me. So please excuse the brevity of this week’s reading list: I had a pretty busy week, and I feel I’ve shown enough solidarity with the investment bankers, and with other start-up founders!

So, here it comes. Short. And concise. I hope.

  • Uber’s Journey Toward Better Data Culture From First Principles: Better data culture sounds like a sensible goal. But what does it mean and how to get there? Uber engineers laid down the basic principles of a better data culture and…


5 minutes for 5 hours’ worth of reading

Source: https://francois-nguyen.blog/2021/03/07/towards-a-data-mesh-part-1-data-domains-and-teams-topologies/

Flicking through Medium’s recommendations, I came across an article by Arthur Mello about five books every data scientist should read in 2021. From time to time, aspiring data scientists ask me for recommendations on what to read. There are, of course, obvious books that make many of the top 10 lists (like The Elements of Statistical Learning), but otherwise I think the choice is often influenced by one’s background and by when one entered the field. …


5 minutes for 5 hours’ worth of reading

Source: https://decision.ai

Outcomes, not outputs! This is what matters in data science. But we easily get distracted by day-to-day operations. So, a reminder is useful. Be it in the form of a funny picture like the one above or the recent book by Bill Schmarzo: The Economics of Data, Analytics, and Digital Transformation.

Happy Friday!

  • ‘Big’ Data Can Be 99.98% Smaller Than It Appears: Intuition tells us that larger samples are more reliable. But what we mustn’t forget is the importance of how the sample has been selected. To assess the saltiness of a soup, even just a spoon is enough. But…


5 minutes for 5 hours’ worth of reading

Source: https://mode.com/blog/data-visualization-experts/

Who would have guessed that one day I’d be attending a Machine Learning conference in Prague from Zurich? Well, it happened last weekend. Obviously, it’s a different experience, with no small talk or networking during coffee breaks and no beers in the evening, but I was really impressed by the event. I was able to spend the weekend with the family, watch a couple of talks live and a few others from recordings. As with WFH and the office post-pandemic ‘dilemma’, the future of conferences is likely to be a hybrid between physical and virtual. …

Adam Votava

Data scientist with corporate, consulting and start-up experience | avid cyclist | amateur pianist | CEO & co-founder at DataDiligence.com
