5 minutes for 5 hours’ worth of reading

Source: https://medium.com/data-for-ai/building-real-time-ml-pipelines-with-a-feature-store-9f90091eeb4

The image above comes from an article about feature stores. Moving from batch to real-time brings many challenges, and feature engineering is one of the most painful. Making sure the features used for training a model are the same as the ones used for scoring in real time has caused a lot of grey hair. My business is facing this problem in one of our current assignments, and I certainly hope we’ll solve it without too much stress.
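To make the training/scoring mismatch a bit more concrete, here is a minimal sketch of one common way to reduce it: a single feature function that both the batch and the real-time path call. It is not taken from the linked article, and every name in it (`build_features`, `build_training_rows`, `score_features`, the event fields and the three features) is a hypothetical placeholder chosen for illustration.

```python
from datetime import datetime, timezone


def build_features(event: dict) -> dict:
    """Single definition of the feature logic (hypothetical features).

    Both the batch (training) path and the real-time (scoring) path call
    this function, so the two cannot drift apart silently.
    """
    ts = datetime.fromtimestamp(event["timestamp"], tz=timezone.utc)
    return {
        # Coarse bucket of the amount, capped at 9.
        "amount_bucket": min(int(event["amount"] // 100), 9),
        "hour_of_day": ts.hour,
        # Monday is 0, so Saturday and Sunday are 5 and 6.
        "is_weekend": ts.weekday() >= 5,
    }


def build_training_rows(history: list[dict]) -> list[dict]:
    """Batch path: turn historical events into training rows."""
    return [build_features(event) for event in history]


def score_features(event: dict) -> dict:
    """Real-time path: turn one incoming event into model input."""
    return build_features(event)
```

A feature store goes much further than this (storage, point-in-time correctness, low-latency serving), but the underlying idea of defining each feature once and reusing it in both paths is the same.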

A bit of a ‘pop science’ reading list with a very high frequency of the word “AI”. But it includes strawberries too.

  • Building…


5 minutes for 5 hours’ worth of reading

Source: https://medium.com/bigdatarepublic/two-steps-towards-a-modern-data-platform-37c74e7c104b

Who wouldn’t want to listen to new songs by Jimi Hendrix, Kurt Cobain, Jim Morrison or Amy Winehouse? Thanks to the Lost Tapes of the 27 Club it is now possible. Well, actually the songs were not written by these amazing musicians, who all died at the age of 27. An AI algorithm generated a string of all-new hooks, rhythms, melodies, and lyrics, which were used by audio engineers to ‘compose’ the final songs.

Hope the following reading list will offer a decent lick to finish their new solos!

  • Two steps towards a modern data platform: “Wouldn’t it be great if…


5 minutes for 5 hours’ worth of reading

Source: https://medium.com/diaryofawannapreneur/deep-learning-for-computer-vision-for-the-average-person-861661d8aa61

Just yesterday I read an article by SeattleDataGuy about why not to start a data science consulting business. Given that this is what I’ve been doing for most of the last six years, the advice came a bit late! One of the reasons, “you may think you’re going to be your own boss, but you’ll actually be working for more people than before”, made me laugh at first. Maybe because it’s true. But at the same time, I find the situation with multiple ‘bosses’ strangely liberating since it comes with variety and optionality.

This week’s list…


5 minutes for 5 hours’ worth of reading

Source: https://eng.uber.com/ubers-journey-toward-better-data-culture-from-first-principles/

Going the extra mile despite a 95-hour week might be the norm at Goldman Sachs, but it’s not for me. So please excuse the brevity of this week’s reading list: I had a pretty busy week, and I feel I’ve shown enough solidarity with the investment bankers and other start-up founders!

So, here it comes. Short. And concise. I hope.

  • Uber’s Journey Toward Better Data Culture From First Principles: Better data culture sounds like a sensible goal. But what does it mean, and how do you get there? Uber engineers laid down the basic principles of a better data culture and…


5 minutes for 5 hours’ worth of reading

Source: https://francois-nguyen.blog/2021/03/07/towards-a-data-mesh-part-1-data-domains-and-teams-topologies/

Flicking through Medium’s recommendations, I came across an article by Arthur Mello about five books every data scientist should read in 2021. From time to time, aspiring data scientists ask me for recommendations on what to read. There are, of course, obvious books that make many of the top 10 lists (like The Elements of Statistical Learning), but otherwise I think the choice is often influenced by the reader’s background and by when they entered the field. …


5 minutes for 5 hours’ worth of reading

Source: https://decision.ai

Outcomes, not outputs! This is what matters in data science. But we easily get distracted by day-to-day operations. So, a reminder is useful. Be it in the form of a funny picture like the one above or the recent book by Bill Schmarzo: The Economics of Data, Analytics, and Digital Transformation.

Happy Friday!

  • ‘Big’ Data Can Be 99.98% Smaller Than It Appears: Intuition tells us that larger samples are more reliable. But what we mustn’t forget is the importance of how the sample has been selected. To assess the saltiness of a soup, even just a spoon is enough. But…


5 minutes for 5 hours’ worth of reading

Source: https://mode.com/blog/data-visualization-experts/

Who would have guessed that one day I’d be attending a Machine Learning conference in Prague from Zurich? Well, it happened last weekend. Obviously, it’s a different experience, with no small talk or networking during coffee breaks and no beers in the evening, but I was really impressed by the event. I was able to spend the weekend with the family, watch a couple of talks live and a few others from recordings. As with the post-pandemic WFH versus office ‘dilemma’, the future of conferences is likely to be a hybrid between physical and virtual. …


5 minutes for 5 hours’ worth of reading

Source: https://medium.com/airbnb-engineering/visualizing-data-timeliness-at-airbnb-ee638fdf4710

I’ve been asked to provide feedback to two of my friends preparing for presentations this week. I’m always grateful when that happens (I take it as a compliment) and I try to give my undivided attention so that the feedback is honest and constructive. Both of them are highly knowledgeable in their fields and so have plenty to talk about. In such situations, it’s often hard to self-edit and pull all the threads into a focused story line. But it’s so important that listeners can easily follow. I know from my own experience when trying to share valuable insights about…


5 minutes for 5 hours’ worth of reading

Source: https://pycaret.org

The opening article in this week’s reading list is a great reminder that even data infrastructure is about people. You can have the greatest infrastructure and tools, but if users are not convinced that it solves their problems and if it’s not easy to use, it will always be bypassed and result in a patchwork of solutions.

This week’s list is — hopefully — ‘aggressively helpful’, fair and also a bit nostalgic.

  • Aggressively Helpful Platform Teams: Data scientists are building algorithmic solutions to business problems. But in order for the solutions to make an impact, they need to “wrestle complex, scalable infrastructure…


5 minutes for 5 hours’ worth of reading

Source: star-history via datarevenue

I’ve read a lot about data infrastructure this week. Every time I dig a bit deeper, I’m amazed by the number of new tools. How can one keep up with all the development? And not even hands on, just knowing what’s out there?

Beyond that, an article about the importance of the CDO role and one about data literacy made it on the list today.

  • Making the business case for a chief data officer: An increasing number of companies are appointing a CDO. But results-oriented executives often question the business value added by a CDO and, consequently, data…

Adam Votava

Data scientist with corporate, consulting and start-up experience | avid cyclist | amateur pianist | CEO & co-founder at DataDiligence.com
