
'None of these software developers has provided information on how effective their algorithm is relative to visual screening by someone with a trained eye for detecting image manipulation. I have written about the need for transparency about the effectiveness of these algorithms, so that users are informed about their capabilities and limitations.'

retractionwatch.com/2024/08/12

@cyrilpedia One aspect mentioned is the need to archive research data at the institution. Even though this is official policy at our institute, I find it frustratingly hard to chase team members and get them to archive their raw data post-publication (which, since we are a theory group, mostly means programs and calculations). Most have to be reminded many, many times.

Any thoughts on this? What are your experiences?

@FMarquardtGroup I think this is a general problem - there are people tinkering with workflows that go from e-lab books to figure source data, which I think are the future. The problem is the transitional period, where the backlog has to be properly archived, but hopefully moving forward this will be a lot less painful. As an editor, I have had to send back papers because the authors could not provide source data requested by reviewers. One open problem is what to do with data that takes up a lot of space, like live imaging and so forth. As with most things in science, I think the quickest way to drive change is to have funders request and monitor data deposition.

@cyrilpedia I recently reviewed a paper by a very well-known AI company at a very well-known high-impact publisher, and they refused to reveal any source code, even though everyone else has to and there is specifically a question about this when you submit. They just wrote that they would supply 'the code or (!) pseudocode' 'on request', but that is a lame promise (and obviously even the referees were not shown the code).

@FMarquardtGroup It's absurd how far some of these things get - like the Surgisphere papers in NEJM & Lancet. Journals need to do a better job. But going back to your original post in this thread, so do institutions.

@cyrilpedia Frankly, even when the data is right there in the form of Jupyter notebooks etc., it is still hard to get people to invest half a day to do it... 🧐
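
For what that half day typically involves, here is a minimal sketch of the kind of one-command archiving step being described, assuming a hypothetical project directory of notebooks and scripts and a hypothetical institutional archive share (both paths are illustrative, not anyone's actual setup):

```python
# Sketch only: zip a project directory and record a SHA-256 checksum,
# so the archived copy can be verified later. Paths are hypothetical.
import hashlib
import zipfile
from datetime import date
from pathlib import Path

PROJECT = Path("~/projects/my-paper").expanduser()  # hypothetical project dir
ARCHIVE_DIR = Path("/institute/archive")            # hypothetical archive share

def archive_project(project: Path, archive_dir: Path) -> Path:
    """Zip the project (notebooks, code, small data) and store a checksum."""
    out = archive_dir / f"{project.name}-{date.today().isoformat()}.zip"
    with zipfile.ZipFile(out, "w", zipfile.ZIP_DEFLATED) as zf:
        for f in sorted(project.rglob("*")):
            if f.is_file() and ".git" not in f.parts:
                zf.write(f, f.relative_to(project))
    # sha256sum-style sidecar file for later integrity checks.
    digest = hashlib.sha256(out.read_bytes()).hexdigest()
    (archive_dir / (out.name + ".sha256")).write_text(f"{digest}  {out.name}\n")
    return out

if __name__ == "__main__":
    print(archive_project(PROJECT, ARCHIVE_DIR))
```

Large raw data, like the live imaging mentioned earlier in the thread, would still need a separate store; this only covers the code-and-notebooks case.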

@FMarquardtGroup @cyrilpedia
HEP and Astro seem to have embedded data sharing in their scientific cultures, presumably out of necessity, and they devote resources to developing and maintaining infrastructure to facilitate it. Smaller, independent groups lack the hierarchical organization of HEP and Astro—which I like!—but it can make it hard to develop consensus on what to share and how. The Turing Way has some good resources.

@turingway

book.the-turing-way.org/index.

@FMarquardtGroup @cyrilpedia @turingway
With time I’ve been able to train people in my group to put everything in git repos and develop more-or-less reproducible analysis pipelines. Even with this, it’s not obvious that anyone outside would be able to follow what we did, since that can require additional documentation. It’s a work in progress but it’s getting easier.
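
As an illustration of the single-entry-point pattern such pipelines tend to converge on - one script that regenerates every result from the raw data and records the environment it ran in - here is a minimal sketch; the file names and analysis steps are hypothetical:

```python
# Sketch only: a repo-level driver that rebuilds all figures from raw data
# and records provenance, so an outside reader can rerun everything.
import json
import subprocess
import sys
from pathlib import Path

OUT = Path("results")  # regenerated output; never hand-edited

def record_environment() -> None:
    """Save the package list and git commit next to the results."""
    OUT.mkdir(exist_ok=True)
    freeze = subprocess.run([sys.executable, "-m", "pip", "freeze"],
                            capture_output=True, text=True, check=True)
    (OUT / "environment.txt").write_text(freeze.stdout)
    commit = subprocess.run(["git", "rev-parse", "HEAD"],
                            capture_output=True, text=True, check=True)
    (OUT / "provenance.json").write_text(
        json.dumps({"commit": commit.stdout.strip()}, indent=2))

def main() -> None:
    record_environment()
    # Hypothetical analysis steps; each reads data/raw/ and writes to results/.
    for step in ["analysis/fig1.py", "analysis/fig2.py"]:
        subprocess.run([sys.executable, step], check=True)

if __name__ == "__main__":
    main()
```

The documentation gap the post mentions is real, though: even with a driver like this, a README explaining what each step produces is what actually makes the repo followable for outsiders.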

@jsdodge @FMarquardtGroup @cyrilpedia @turingway though with astro it also depends on which data. Big observatories and organizations, yes, but SO much of the extremely exciting data from smaller telescopes is not public, and there are no good archives for it...
