I just finished Stuart Ritchie's jaw-dropping book "Science Fictions", which exposes the deleterious effects of fraud, bias, negligence and hype on the constitution of scientific knowledge. Of course, this is a deplorable but all-too-familiar observation; the book, however, brings a new level of detail. (1/3)
A particularly enlightening insight is the estimation of the prevalence of fraud (e.g. how often do biologists fake the figures in their papers?), publication bias, p-hacking, and even numerical errors in published papers... There are public records (such as Retraction Watch), very clever tools (such as statcheck http://statcheck.io/), and tests (e.g. the GRIM test https://en.wikipedia.org/wiki/GRIM_test) (2/3)
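To give an idea of how simple the GRIM test is, here is a minimal sketch in Python. It assumes the underlying data are integers (e.g. single-item Likert responses), so the sum of the sample must be a whole number; the function name `grim_consistent` is made up for illustration, and the check accepts either rounding direction at the half-way point. It is not a substitute for the published tools.

```python
from decimal import Decimal, ROUND_FLOOR

def grim_consistent(reported_mean: str, n: int) -> bool:
    """Return True if some integer sum over n values rounds to reported_mean.

    reported_mean is passed as a string so that the number of reported
    decimals (e.g. "3.50" has two) is preserved.
    """
    mean = Decimal(reported_mean)
    decimals = -mean.as_tuple().exponent          # digits after the point
    half_unit = Decimal(1).scaleb(-decimals) / 2  # e.g. 0.005 for 2 decimals

    # The true sum must be an integer next to mean * n.
    approx_sum = mean * n
    lower = int(approx_sum.to_integral_value(rounding=ROUND_FLOOR))

    for candidate_sum in (lower, lower + 1):
        # Would this integer sum give a mean that rounds to the reported one?
        if abs(Decimal(candidate_sum) / n - mean) <= half_unit:
            return True
    return False

print(grim_consistent("5.19", 28))  # no integer sum over 28 values gives 5.19
print(grim_consistent("5.18", 28))  # 145 / 28 = 5.1786, which rounds to 5.18
```

With 28 participants, the nearest achievable means are 145/28 ≈ 5.18 and 146/28 ≈ 5.21, so a reported mean of 5.19 cannot come from integer data of that sample size.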
@leovarnet I'm more pessimistic. Do you know Elisabeth Bik's work on images, especially in the biosciences? I'm worried about AI generating fakes. Real scientists, a minority of those holding the title, should find out themselves and go on.
@Waldemar Indeed, this is worrisome. But two things I discovered from this book are 1) how difficult it is to fake a dataset convincingly (data pulled out of thin air don't have the properties we'd expect of data collected in the real world, and this kind of fraud can be revealed by data forensics) and 2) that in practice, fraudsters turn out to be quite "careless" when they fake their data/figures... I mean, in most of Bik's cases the authors did not bother doing better than copy-pasting, with at most some splicing/resizing... Far from a deep-fake :)
@Waldemar you're absolutely right, we can safely assume that other fraudsters use or will use more advanced "faking approaches". In fact, scientific fraud is much more widespread than most scientists want to believe, but at the same time there are unsuspected ways to fight against it. In a 2014 (now-retracted) Science paper, Michael LaCour faked his results by reusing the data from another survey and adding some jiggling. And yet two other political scientists were able to expose the fraud by showing that the dataset had some weird statistical anomalies.
(But of course the best solutions are systemic ones, like taking the publication pressure off scientists' shoulders and encouraging replication studies)