Keeping Up With Data — Week 28 Reading List

5 minutes for 5 hours’ worth of reading

Adam Votava
3 min read · Jul 16, 2021

As deepfakes become more convincing, detecting whether an image is real can be a challenge. Researchers at Facebook and Michigan State University are now going a step further, working on methods that try to identify which generative model created a deepfake. The model parsing method, outlined in the caption image, estimates the parameters of the model used to generate a deepfake. It is somewhat reassuring to see such a detective method (even involving fingerprints) being developed.

  • Effortless Distributed Training of Ultra-Wide GCNs: Deep learning has boomed in recent years. Most methods have targeted Euclidean data (convolutional networks, transformers), but for many problems it is more intuitive to use graph structures. Think of social networks or chemistry. One deep learning method for graphs is the graph convolutional network (GCN). GCNs are not very efficient and don’t scale easily to large networks. This is where graph independent subnetwork training (GIST) comes in: a framework for breaking a large GCN down into smaller ones, training these in parallel and aggregating their updates into the full model. ( @ tds)
  • Data as a product vs data products. What are the differences? Both terms have been used a lot recently, so what sets them apart? A data product is “a product that facilitates an end goal through the use of data”. A dashboard, restaurant recommendations and a self-driving car are all examples of data products. Data as a product is “the result of applying product thinking into datasets, making sure they have a series of capabilities including discoverability, security, explorability, understandability, trustworthiness, etc.” We’ll be hearing these terms for a while, so make sure you aren’t confused by them! ( @ tds)
  • ‘Wisdom of the crowd’: The myths and realities: It has been over a hundred years since people first noticed the essence of the wisdom of crowds: their average judgement converges on the right solution. There are many examples of a crowd being right, but also plenty of it being horribly wrong. What are the properties of crowds that tend to be accurate? Independence and diversity. You want people who hold very different opinions and who aren’t influenced by one another. And guess which one is more important for accuracy! Diversity, or independence? (BBC)
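
For the curious, the wisdom-of-crowds point about averaging and independence can be played with in a few lines of Python. This is only a toy sketch, not anything from the BBC article: the true value, crowd size and noise levels are made-up numbers, and `shared_noise` is my own stand-in for people influencing one another (everyone in a crowd inherits the same error).

```python
import random
import statistics


def simulate_crowd(true_value, crowd_size, individual_noise, shared_noise, rng):
    """Return (crowd_error, typical_individual_error) for one simulated crowd."""
    # A single error shared by everyone models a loss of independence
    # (assumption of this sketch, not a claim from the article).
    shared_error = rng.gauss(0.0, shared_noise)
    guesses = [
        true_value + shared_error + rng.gauss(0.0, individual_noise)
        for _ in range(crowd_size)
    ]
    crowd_error = abs(statistics.fmean(guesses) - true_value)
    typical_individual_error = statistics.median(abs(g - true_value) for g in guesses)
    return crowd_error, typical_individual_error


def average_errors(shared_noise, trials=200, seed=0):
    """Average both error measures over many simulated crowds."""
    rng = random.Random(seed)
    results = [
        simulate_crowd(1000.0, 500, 100.0, shared_noise, rng) for _ in range(trials)
    ]
    crowd = statistics.fmean(r[0] for r in results)
    individual = statistics.fmean(r[1] for r in results)
    return crowd, individual


# Independent guessers: individual errors largely cancel out in the average.
independent_crowd, independent_individual = average_errors(shared_noise=0.0)
# Influenced guessers: the shared error survives averaging.
influenced_crowd, influenced_individual = average_errors(shared_noise=100.0)

print(f"independent crowd error: {independent_crowd:.1f}")
print(f"typical individual error: {independent_individual:.1f}")
print(f"influenced crowd error:  {influenced_crowd:.1f}")
```

With independent guessers the average lands far closer to the truth than a typical individual does. Once everyone shares the same error, no amount of averaging removes it. The sketch says nothing about diversity, so the question at the end of the item stays open.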
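
The split, train and aggregate pattern described in the GIST item above can also be sketched in plain Python. Treat it as a rough illustration under loud assumptions: it uses an ordinary two-layer weight structure with no graph convolution at all, it assumes the model is split by partitioning hidden units into disjoint groups, and every function name here (`partition_hidden_units`, `extract_subnetwork`, `train_locally`, `write_back`) is hypothetical rather than taken from the GIST code.

```python
import random


def partition_hidden_units(num_hidden, num_workers, rng):
    """Split hidden-unit indices into disjoint groups, one per worker."""
    indices = list(range(num_hidden))
    rng.shuffle(indices)
    return [sorted(indices[w::num_workers]) for w in range(num_workers)]


def extract_subnetwork(w_in, w_out, hidden_idx):
    """Copy the slices of both weight matrices that touch the given hidden units."""
    sub_in = [[row[j] for j in hidden_idx] for row in w_in]  # inputs x selected hidden
    sub_out = [list(w_out[j]) for j in hidden_idx]  # selected hidden x outputs
    return sub_in, sub_out


def train_locally(sub_in, sub_out, worker_id):
    """Placeholder for real local training: nudges every weight by a fixed amount."""
    delta = 0.01 * (worker_id + 1)
    new_in = [[w + delta for w in row] for row in sub_in]
    new_out = [[w + delta for w in row] for row in sub_out]
    return new_in, new_out


def write_back(w_in, w_out, hidden_idx, sub_in, sub_out):
    """Aggregate by copying a worker's slice back into the full model."""
    for r, row in enumerate(sub_in):
        for c, j in enumerate(hidden_idx):
            w_in[r][j] = row[c]
    for c, j in enumerate(hidden_idx):
        w_out[j] = list(sub_out[c])


rng = random.Random(0)
num_inputs, num_hidden, num_outputs, num_workers = 4, 6, 3, 2
w_in = [[0.0] * num_hidden for _ in range(num_inputs)]
w_out = [[0.0] * num_outputs for _ in range(num_hidden)]

groups = partition_hidden_units(num_hidden, num_workers, rng)
# In a real setup each iteration would run on its own worker, in parallel.
for worker_id, hidden_idx in enumerate(groups):
    sub_in, sub_out = extract_subnetwork(w_in, w_out, hidden_idx)
    sub_in, sub_out = train_locally(sub_in, sub_out, worker_id)
    write_back(w_in, w_out, hidden_idx, sub_in, sub_out)
```

Because the groups are disjoint, each worker owns its slice of the weights and writing the slices back never causes a conflict. Real training would repeat this cycle many times, drawing a fresh partition each round.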

It’s been raining in Switzerland the whole week, which has raised lakes and rivers to dangerous levels, and we are already seeing a lot of ‘unavoidable’ flooding in the lower areas. Let’s hope the weather turns better soon and we’ll be able to enjoy a summer (almost) unaffected by Covid.

Thanks for reading!

Please feel free to share your thoughts or reading tips in the comments.

Follow me on Medium, LinkedIn and Twitter.



Adam Votava

Data scientist | avid cyclist | amateur pianist (I'm sharing my personal opinion and experience, which should not be considered professional advice)