
I've been saying this for decades. There has been a bit of progress, and some setbacks too:

"How bibliometrics and school rankings reward unreliable science" by
#IvanOransky et al.

bmj.com/content/382/bmj.p1887

@tomek

Darn, and yet I could swear I used to pay that way. In theory, it could also have broken down…

@tomek isn't the screen also a contactless "reader"?

I've always found the poor overall quality of research produced by honest actors to be a bigger problem than outright academic fraud. Somehow the latter never seems interesting or surprising to me, whereas the former points to serious systemic problems in scientific training. How do we reinstitute rigorous methodological training, genuine curiosity, deep theoretical thinking, programmatic and systematic effort, and careful execution in scientific practice? That seems to be the harder problem to solve.

Could this be the paradigm shift all of #OpenScience has been waiting for?

Council of the EU adopts new principles:
"interoperable, not-for-profit infrastructures for publishing
based on open source software and open standards"
data.consilium.europa.eu/doc/d

and now ten major research organizations support the proposal:
coalition-s.org/wp-content/upl

What they propose is nearly identical to our proposal:
doi.org/10.5281/zenodo.5526634

Does this now get the ball rolling, or is it just words on paper?

petersuber  
This is big. No #embargoes. No #APCs. "The #EU is ready to agree that immediate #OpenAccess to papers reporting publicly funded research should be...

When you look out to cosmic distances, it's difficult to have any sense of 3D shapes. Take this bright galaxy, M87: Is it shaped like a ball, an egg, a pancake?
Turns out, there is now a way to tell! (1/2)
#perspective #space

@talyarkoni

Also, the first general AI program is 66 years old (the General Problem Solver) ;)

@lakens
Assuming a normal distribution under H0 (simply because of the CLT) can be perfectly valid, and so can t-tests of H0. But at the same time, an equivalence test might not be!
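A minimal sketch of that point (Python with numpy/scipy; the skewed distribution, sample size, and nominal level are my own illustrative choices): under H0, the CLT makes the sample mean approximately normal, so the t-test's type I error rate stays close to nominal even for clearly non-normal data.

```python
# Type I error of a one-sample t-test on skewed (exponential) data.
# H0 is true (true mean = 1.0); thanks to the CLT the empirical error
# rate should land close to the nominal 5% despite the skew.
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
n, reps, alpha = 50, 20_000, 0.05

rejections = 0
for _ in range(reps):
    x = rng.exponential(scale=1.0, size=n)   # skewed data, true mean 1.0
    _, p = stats.ttest_1samp(x, popmean=1.0)
    rejections += p < alpha

print(f"Empirical type I error: {rejections / reps:.3f} (nominal {alpha})")
```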

@lakens
Yeah, I know your article about it. The Behrens-Fisher problem has been heavily discussed for years :)
But both tests assume a normal distribution of the means, and they share the same problem of ignoring heavy-tailed distributions / violations of normality / skewed distributions / heteroscedasticity / mediation / moderation. Whatever you call it, it is a situation where the sample is too small for the CLT to take effect.

Let me repeat: equivalence testing cannot support the conclusion that an effect is small when the effect is relatively rare compared to the sample size, regardless of the significance of the results.
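A minimal sketch of what I mean (Python with numpy/scipy; the ±0.3 equivalence bounds, the 5% responder rate, and the +2.0 shift are all hypothetical choices of mine): a rare but strong effect keeps the mean inside the bounds, so a Schuirmann-style TOST declares the effect negligible.

```python
# Rare but strong effect: 5% of subjects shift by +2.0, the rest by 0.
# The mean effect (~0.1) sits inside the hypothetical equivalence bounds,
# so TOST concludes "negligible effect" while a subgroup is hit hard.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
n, low, high = 500, -0.3, 0.3            # hypothetical equivalence bounds

noise = rng.normal(0.0, 1.0, size=n)
affected = rng.random(n) < 0.05          # rare responder subgroup
x = noise + np.where(affected, 2.0, 0.0)

# Schuirmann's TOST: two one-sided one-sample t-tests against the bounds.
_, p_lower = stats.ttest_1samp(x, popmean=low, alternative="greater")
_, p_upper = stats.ttest_1samp(x, popmean=high, alternative="less")
p_tost = max(p_lower, p_upper)

print(f"TOST p = {p_tost:.4g} -> 'equivalent' at alpha = 0.05: {p_tost < 0.05}")
```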

@lakens @JorisMeys

Estimating the effect size and CI via Welch's t-test assumes a normal distribution of the effect :) I mentioned that :)

@JorisMeys
But we never know if the sample is big enough to detect a rare (but strong) effect. Equivalence testing is an easy way to underestimate the required sample size (which is probably why it is so popular in pharmaceutical studies).

@lakens

1) Of course, you can assume any distribution. (And that procedure is called the “Neyman-Pearson theory of statistical testing”.)
‘Equivalence testing’ is a procedure almost always tied to the t-test, as in your textbook (photo 1) or the TOST procedure (Schuirmann, 1987).

2) “Violations of normality mostly have very little impact on error rates”: on the contrary, violations of normality have the biggest impact on the estimation of variance, and therefore also on error rates and effect estimation. (That is why heteroscedasticity is so important.)

3) It will be easy to show how ‘equivalence tests’ can go very wrong when their assumptions ignore the non-normality of the effect (by using a t-test).
I think I can run some simulations after 22:00 GMT. For now, I can show what happens to the p-distribution when the effect is (very) non-normal (photo 2: no effect, non-normal distribution when H1 is true; photos 3 & 4: valid use of the t-test, effect big but moderated). A rough sketch of such a simulation follows after the reference below.

Schuirmann, D. J. (1987). A comparison of the two one-sided tests procedure and the power approach for assessing the equivalence of average bioavailability. Journal of Pharmacokinetics and Biopharmaceutics, 15, 657-680.
link.springer.com/article/10.1
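Since the photos aren't reproduced here, a rough stand-in for the promised simulation (Python with numpy/scipy; the sample size, responder rate, and shift are my own illustrative parameters): the p-value distribution of a Welch t-test when the effect exists only in a moderated subgroup, so the effect distribution is far from normal.

```python
# p-value distribution of a Welch two-sample t-test when the effect is
# moderated: only 10% of the treated group responds (with a +1.5 shift).
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)
n, reps = 50, 10_000

pvals = np.empty(reps)
for i in range(reps):
    control = rng.normal(0.0, 1.0, size=n)
    treated = rng.normal(0.0, 1.0, size=n)
    responders = rng.random(n) < 0.10            # moderated effect
    treated += np.where(responders, 1.5, 0.0)
    _, pvals[i] = stats.ttest_ind(treated, control, equal_var=False)

# Under a well-specified H1 this histogram would pile up near 0; here the
# rare, strong effect leaves it nearly flat at this sample size.
hist, _ = np.histogram(pvals, bins=10, range=(0.0, 1.0))
print("p-value counts per decile:", hist)
print(f"Power at alpha = 0.05: {np.mean(pvals < 0.05):.3f}")
```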

@lakens

The solution to this problem has been known for 90 years:
1) specify your model
2) test your model against probable alternatives

@lakens
Oh, my god, NO!
If we make strong assumptions about the effect, normality (Welch's) or variance homogeneity (Student's t-test), as we do in equivalence testing, we can only conclude that that particular model is unlikely.

In other words, if the real effect is moderated or mediated, this procedure fails. Frequentist statistics is very sensitive to model misspecification; that is a problem, not an advantage. We can't conclude H0 just because the data are unlikely under one specific H1.

