Keeping Up With Data #67

Source: https://towardsdatascience.com/8-booming-data-science-libraries-you-must-watch-out-in-2022-cec2dbb42437

The image above comes from the first article in the list below; it is a collection of screenshots from the documentation of the SHAP library. There is a lot of pressure to understand why machine learning models make the decisions they do. What are the key factors? What drives the propensity score of a given customer? These questions are important not only for validating and understanding the model. They can also provide valuable insight into individual customers and global trends alike.

Asking questions is an important skill. Not only when it comes to data and analytics.

  • 8 Booming Data Science Libraries You Must Watch Out For in 2022: Though I don’t spend my days coding anymore, I still review a lot of code and from time to time even write a line or two. That doesn’t make me any less curious about the new libraries that are making the lives of data scientists easier. I’ve seen the first library on the list, SHAP, used in multiple projects recently. It gives data scientists and (with a little voice-over) business people alike great insight into the key drivers of a model. I haven’t tried UMAP yet, but if it’s better than t-SNE, it’s just a matter of time. The rest of the list is certainly worth checking out too. Maybe you’ll find inspiration for your next project. (Bex T. @ TDS)
  • Good Data Citizenship Doesn’t Work: The omnipresence of data and the growing circle of data users are quickly forming a data society. And just like any society, a data democracy needs good data citizens and strong data leaders. Making data available to the right people at the right time (democratised data) requires making the data trustworthy. But how to do that? How to document the ever-changing data? Who should do it? How to spread the word? Luckily, there is inspiration to be found in other places that focus on making information available: news sites, Wikipedia, Quora, or Google, to name a few. What lessons do Benn and Mark draw? Review more, document less. Let there be mess (asking questions helps uncover important issues faster than any documentation). Give voice to others, not just the data owner. (Benn Stancil @ TDS)
  • The Endless Data Buffet: An analogy between data mesh and brunch. It runs from data as ingredients, data product owners as chefs and data engineers as line cooks, through data products as buffet options and self-service infrastructure as the kitchen, all the way to high-value business analysis as a plate full of delicious brunch food. And just as a restaurant is often a well-run business with rules and processes, the data mesh buffet has rules and processes too. Chefs and kitchen workers use the tools that are right for the job and focus on their specialities. The food is available where the customers expect it, in the quantity and quality they paid for. As with any analogy, there are limits to the one developed here. In real life, the roles of customers and staff are not always clearly defined, so everyone is in a slightly schizophrenic situation. And a critical mass of ‘good data citizens’ and data leaders is needed to make it work. (The Sequel)

Let’s all be good data leaders and compel others to be good data citizens.

In case you missed last week’s issue of Keeping Up With Data

Thanks for reading!

Please feel free to share your thoughts or reading tips in the comments.

Follow me on Medium, LinkedIn and Twitter.


Data scientist with corporate, consulting and start-up experience | avid cyclist | amateur pianist | Interim CDO at DataDiligence.com

Adam Votava
