Keeping Up With Data — Week 35 Reading List

Source: https://www.modzy.com/reports/gartner-hype-cycle-for-data-science-machine-learning/

Gartner’s hype cycle for data science and ML, shown above, brings plenty of terms we’ve been hearing for a while and a couple of new ones, too. Gartner often coins or popularises new terms. Some are self-explanatory, like ‘small and wide data’; others, like ‘citizen data science’ or ‘X analytics’, I have to keep googling. I also find the co-existence of ‘MLOps’ and ‘ModelOps’ in the picture slightly confusing. But I guess it says a lot that the ‘innovation trigger’ stage is full of terms, while the ‘plateau of productivity’ is not.

While thought leaders sometimes seem to complicate simple things, data is generally about simplifying complex reality, as the following articles show.

  • Simpson’s Paradox and Interpreting Data: Data is a finite representation of a very complex real world and will never be a perfect reflection of it. Intuition about what’s missing from the data (but should be included) is the art of data science. Simpson’s paradox describes a trend or result that is present when data is split into groups but reverses or disappears when the data is combined. The reason is so-called ‘lurking variables’, which split the data into multiple distributions. They are difficult to find, and the decision to look at the data together or by groups is entirely situational. People sometimes consider data an absolute truth. ‘Data don’t lie’, they say. Well, what if an important assumption is not met? Be careful about drawing conclusions about a complex reality from findings in a simple reflection. (Tom Grigg @ TDS)
  • The Role of AI in HR Decision Making: Is there anything more complex than people? In complex environments like organisations, it’s difficult to imagine a fully autonomous AI making decisions. Luckily, that doesn’t mean HR can’t leverage AI for a wide range of decisions. Instead of automation, we should think of augmentation. Data-augmented decision making combines ‘could’, ‘should’ and ‘would’ questions. The first two can be answered with data: Could we fill a position with existing talent? Should we do it? The third type (Would the person be happy to transfer? Would it be a good fit?), not so much. But that’s the complexity of HR that we need to take into account. (myHRfuture)
  • Pseudo-R²: A Metric for Quantifying Interestingness: For linear outcomes, the statisticians’ common measure of interestingness is ‘variance explained’, often described by R². But what to do with non-linear outcomes (e.g., “yes” or “no”)? For instance, which splits of an overall conversion rate do we consider most interesting? By device? By campaign? By country? By gender? And how can we quantify that? The suggestion is to use McFadden’s pseudo-R², mostly because it balances variation with composition. Pseudo-R² is low when the groups explain no variation (in conversion rates) and also when one of the groups is significantly larger than the others. This matches the intuition that the most interesting split is the one with proportionally sized groups and the largest differences between their conversion rates. (Heap blog)
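
To make the Simpson’s paradox bullet concrete, here is a minimal Python sketch. The counts are made up purely for illustration (they are not from the article), and the variant and device names are hypothetical. Variant A converts better on each device, yet B looks better once the devices are pooled, because A was shown mostly to low-converting mobile traffic.

```python
# Made-up counts, for illustration only: (conversions, visitors)
# per device and per variant. Device plays the 'lurking variable'.
data = {
    "desktop": {"A": (30, 100), "B": (100, 400)},
    "mobile": {"A": (40, 400), "B": (8, 100)},
}


def rate(conversions, visitors):
    """Conversion rate as a fraction."""
    return conversions / visitors


# Compare the two variants within each device.
winner_by_group = {}
for device, variants in data.items():
    rates = {name: rate(*counts) for name, counts in variants.items()}
    winner_by_group[device] = max(rates, key=rates.get)

# Pool the devices and compare again.
pooled = {}
for variant in ("A", "B"):
    conversions = sum(data[device][variant][0] for device in data)
    visitors = sum(data[device][variant][1] for device in data)
    pooled[variant] = rate(conversions, visitors)
overall_winner = max(pooled, key=pooled.get)

print("Winner per device:", winner_by_group)  # A on both devices
print("Pooled rates:", pooled)
print("Winner overall:", overall_winner)      # B once devices are pooled
```

Which of the two views is the ‘right’ one depends on the question being asked, which is exactly the situational judgement the article talks about.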
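
For the pseudo-R² bullet, here is a rough sketch of how such a score could be computed for one split of a conversion rate. It uses the standard McFadden definition, 1 - LL_model / LL_null, and assumes the ‘model’ simply predicts each group’s own conversion rate while the ‘null’ predicts the overall rate; the Heap article may set things up differently. The function names and all counts are invented for illustration.

```python
import math


def _xlogy(x, y):
    """x * log(y), with the convention that 0 * log(0) is 0."""
    return 0.0 if x == 0 else x * math.log(y)


def log_likelihood(conversions, visitors, p):
    """Bernoulli log-likelihood of the observed counts at rate p."""
    return _xlogy(conversions, p) + _xlogy(visitors - conversions, 1 - p)


def mcfadden_pseudo_r2(groups):
    """groups: list of (conversions, visitors), one tuple per segment.

    Assumption: the model predicts each segment's own rate and the
    null model predicts the overall rate.
    """
    total_conv = sum(c for c, _ in groups)
    total_vis = sum(n for _, n in groups)
    if total_conv in (0, total_vis):
        raise ValueError("overall rate must be strictly between 0 and 1")
    ll_null = log_likelihood(total_conv, total_vis, total_conv / total_vis)
    ll_model = sum(log_likelihood(c, n, c / n) for c, n in groups)
    return 1 - ll_model / ll_null


# Made-up splits of 1,000 visitors each.
no_difference = mcfadden_pseudo_r2([(50, 500), (50, 500)])  # 10% vs 10%
balanced = mcfadden_pseudo_r2([(25, 500), (75, 500)])       # 5% vs 15%
lopsided = mcfadden_pseudo_r2([(45, 900), (15, 100)])       # 5% vs 15%

print(no_difference, lopsided, balanced)
```

With these numbers the balanced split scores highest, the lopsided split with the same two rates scores lower, and the split with identical rates scores essentially zero, which is the behaviour described above.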

Apart from reading these (and many more) interesting articles, I’ve also published a piece about the challenges of data adoption, paired with tips on how to overcome them.

Until next week!

Thanks for reading!

Please feel free to share your thoughts or reading tips in the comments.

Follow me on Medium, LinkedIn and Twitter.

Data scientist with corporate, consulting and start-up experience | avid cyclist | amateur pianist | Interim CDO at DataDiligence.com

Adam Votava
