Keeping Up With Data — Week 31 Reading List

5 minutes for 5 hours’ worth of reading


AI has been used extensively in the fight against Covid. It was the perfect showcase for the latest algorithms and solutions. Yet, despite hundreds of AI tools being built, none of them worked. The main reason is, as is often the case, poor data quality. Data were pulled from multiple sources, patched together like Frankenstein’s monster, and labelled according to radiologists’ assessments (leading to incorporation bias) rather than a definitive result (e.g., a PCR test). For us data scientists, it is yet another reminder that data can make or break our models.

Digital twins, the data quality of the most-cited data sets, and the data-ink ratio are on the menu this week.

Two weeks ago, I wrote about JupyterLite. But there are so many Python notebooks for data scientists. Today, I came across a website covering twenty of the most popular ones. I knew just six of them!

In case you missed last week’s issue of Keeping Up With Data

Thanks for reading!

Please feel free to share your thoughts or reading tips in the comments.

Follow me on Medium, LinkedIn and Twitter.