**Jack Kennedy 🐀** @_jcken@qoto.org · 2022-12-13T16:35:00Z

did some actual statistics (!!) with #TidyTuesday this week. It doesn't seem to be very well known, but Pukelsheim's 3 sigma rule allows you to perform robust inferences with very few assumptions. Here it detects the impact of covid lockdowns and whatever else happened in 2021. The model is an AR(1) model which uses data up to time t to predict what happens at t+1. No predictions made for first 12 time points
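As a rough illustration of the setup described above (not the author's code, which lives in the linked repository), the sketch below fits an AR(1) model by least squares to the data up to time t, predicts t+1, and flags forecast errors larger than three residual standard deviations. The function name `ar1_three_sigma_flags`, the expanding-window fit, the residual-based sigma, and the synthetic series are all assumptions made for the example.

```python
import numpy as np

def ar1_three_sigma_flags(y, burn_in=12, k=3.0):
    """Flag points whose one-step-ahead AR(1) forecast error exceeds k sigma.

    For each t >= burn_in, the model y[s] = c + phi * y[s-1] + e[s] is
    fitted by least squares to y[:t] only, then used to predict y[t].
    """
    y = np.asarray(y, dtype=float)
    flags = np.zeros(len(y), dtype=bool)
    for t in range(burn_in, len(y)):
        past = y[:t]
        X = np.column_stack([np.ones(t - 1), past[:-1]])  # intercept, lag 1
        coef, *_ = np.linalg.lstsq(X, past[1:], rcond=None)
        resid = past[1:] - X @ coef
        sigma = resid.std(ddof=2)  # two fitted parameters
        pred = coef[0] + coef[1] * past[-1]
        flags[t] = abs(y[t] - pred) > k * sigma
    return flags

# Synthetic series (hypothetical data): a stable AR(1) with one abrupt shock.
rng = np.random.default_rng(0)
y = np.zeros(60)
for t in range(1, 60):
    y[t] = 0.5 * y[t - 1] + rng.normal()
y[40] += 15.0  # shock far larger than the unit noise scale

flags = ar1_three_sigma_flags(y)
print(np.flatnonzero(flags))  # indices of flagged time points
```

No flags are produced for the first `burn_in` points, mirroring the "no predictions for the first 12 time points" choice in the post.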

Code: https://github.com/jcken95/tidytuesday/tree/main/2022/2022-12-13

Paper on 3 sigma rule (paywall?): https://jstor.org/stable/2684253
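For readers who cannot access the paper: the three sigma rule follows from the Vysochanskij-Petunin inequality, which bounds the tail probability of any unimodal distribution with finite variance. The short sketch below compares that bound with Chebyshev's inequality and with the exact normal tail; the function names are made up for this example.

```python
import math

def chebyshev_bound(k):
    """P(|X - mu| >= k*sigma) <= 1/k^2 for any finite-variance distribution."""
    return min(1.0, 1.0 / k**2)

def vysochanskij_petunin_bound(k):
    """P(|X - mu| >= k*sigma) <= 4/(9 k^2) for unimodal X, for k > sqrt(8/3)."""
    if k <= math.sqrt(8.0 / 3.0):
        raise ValueError("bound requires k > sqrt(8/3)")
    return 4.0 / (9.0 * k**2)

def normal_tail(k):
    """Exact two-sided tail probability for a normal distribution."""
    return math.erfc(k / math.sqrt(2.0))

k = 3.0
print(chebyshev_bound(k))             # 1/9, about 0.111
print(vysochanskij_petunin_bound(k))  # 4/81, about 0.049
print(normal_tail(k))                 # about 0.0027
```

So for a unimodal error distribution, a deviation beyond three sigma has probability below 5% without assuming normality.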

#rstats #r4ds

**David Colquhoun on Bluesky** @david_colquhoun@mstdn.social · Dec 13, 2022, 16:41

@_jcken
Three sigma rule should be quite safe unless prior odds on H1 are low
https://royalsocietypublishing.org/doi/10.1098/rsos.140216
and
https://www.tandfonline.com/doi/full/10.1080/00031305.2018.1529622

#statistics
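A minimal sketch of why the prior odds matter, using the standard "p less than" false positive risk calculation. The inputs are assumptions for illustration only: power of 0.8, and a tail probability of 0.0027 for three sigma, which holds for a normal distribution (the distribution-free bound is closer to 0.05).

```python
def false_positive_risk(alpha, power, prior_h1):
    """Fraction of threshold-crossing results that are false positives.

    alpha    : probability of crossing the threshold when H0 is true
    power    : probability of crossing the threshold when H1 is true
    prior_h1 : prior probability that H1 is true
    """
    false_pos = alpha * (1.0 - prior_h1)
    true_pos = power * prior_h1
    return false_pos / (false_pos + true_pos)

# Illustrative inputs (assumed, not taken from the thread).
for prior in (0.5, 0.1, 0.01):
    at_005 = false_positive_risk(0.05, 0.8, prior)
    at_3sigma = false_positive_risk(0.0027, 0.8, prior)
    print(f"prior={prior}: p<0.05 -> {at_005:.3f}, 3 sigma -> {at_3sigma:.3f}")
```

Under these assumptions the three sigma threshold keeps the risk small at even prior odds, but it climbs to roughly a quarter when only 1% of tested hypotheses are true.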

**Daniel Lakeland** @dlakelan@mastodon.sdf.org · Dec 13, 2022, 17:40

@david_colquhoun @_jcken

If we think of a "discovery" as "a low p value for the hypothesis of a continuous parameter equal to 0" then there is no such thing as a false discovery. At least not outside a few tiny narrow areas like say particle physics.

Of course, there are binary or discrete parameters: has COVID vs doesn't have COVID, for example, or has 0, 1, or 2 copies of a gene. In those cases the result is probably more relevant.

**Daniel Lakeland** @dlakelan@mastodon.sdf.org · Dec 13, 2022, 17:41

@david_colquhoun @_jcken

Also I like how you call out in your paper: "Of course the number will be right only if all the assumptions made by the test were true. Note that the assumptions include the proviso that subjects were assigned randomly to one or the other of the two groups that are being compared. This assumption alone means that significance tests are invalid in a large proportion of cases in which they are used." This is absolutely true but usually ignored!

**David Colquhoun on Bluesky** @david_colquhoun@mstdn.social · Dec 13, 2022, 17:52

@dlakelan @_jcken
If, by that comment, you mean that a point null is always false, I disagree (as most experimenters would).

**Daniel Lakeland** @dlakelan@mastodon.sdf.org · Dec 13, 2022, 17:47

@david_colquhoun @_jcken

I take issue with this though in reality: "For example, if the tests were on a series of homeopathic ‘remedies’, none would have a real difference because the treatment pills would be identical with the placebo pills."

Of course if you manufacture thousands of placebo pills and you split them randomly into two groups there would be some variation from pill to pill, and even in the average between the two groups, small though it would be.

**Daniel Lakeland** @dlakelan@mastodon.sdf.org · Dec 13, 2022, 17:49

@david_colquhoun @_jcken

Small though it would be, there would be some difference between the two groups' outcomes. The key is that such differences would be, practically speaking, about the size of the fluctuating difference between people due to their different diets, different clothing, different commute times, different exposure to loud noises... etc. Virtually nothing is zero in reality. Testing to see if it is zero is really testing to see if the difference is too small to bother measuring.

**David Colquhoun on Bluesky** @david_colquhoun@mstdn.social · Dec 13, 2022, 17:59

@dlakelan @_jcken
You are not asserting that the point null is true (for NHST, or LR approaches); you are asking what would happen if it were true (can't do statistics without subjunctives!). If your observations are as probable under H0 as under H1, the evidence for H1 is weak, and it makes no difference to that conclusion if there is a tiny effect.

**Daniel Lakeland** @dlakelan@mastodon.sdf.org · Dec 13, 2022, 18:35

@david_colquhoun @_jcken

I just wish that experimenters understood that! They say "there is a true effect p = 0.032". What they should say is "a largely useless default model doesn't explain the data very well p = 0.032".

**David Colquhoun on Bluesky** @david_colquhoun@mstdn.social · Dec 13, 2022, 18:02

@dlakelan @_jcken
See Delampady and Berger 1987 for formal treatment

**David Colquhoun on Bluesky** @david_colquhoun@mstdn.social · Dec 13, 2022, 18:12

@dlakelan @_jcken
As Stephen Senn has pointed out, there are some sorts of experiments where a point null would not be sensible, but in very many it is, IMO. Whenever you compare an intervention with a control, it makes perfect sense.

**Daniel Lakeland** @dlakelan@mastodon.sdf.org · Dec 13, 2022, 18:55

@david_colquhoun @_jcken

The question to ask is "how big was the effect of the intervention?" and the answer should be something like "the intervention effect was unlikely to be larger than ..."

"effect is unlikely to be..." means quantifying the probability that the effect is a certain size, which means Bayesian probability.
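A minimal sketch of what such a statement could look like in the simplest conjugate case: a normal prior centred on zero combined with a normal likelihood for the estimated effect. The function name, the prior standard deviation, and all the numbers are hypothetical choices for illustration, and the prior in particular is an assumption a real analysis would need to justify.

```python
from statistics import NormalDist

def posterior_effect(estimate, std_err, prior_sd):
    """Posterior for an effect with a N(0, prior_sd^2) prior and normal likelihood."""
    precision = 1.0 / prior_sd**2 + 1.0 / std_err**2
    mean = (estimate / std_err**2) / precision
    return NormalDist(mean, precision ** -0.5)

# Hypothetical numbers: observed effect 0.30, standard error 0.15, prior sd 1.0.
post = posterior_effect(0.30, 0.15, 1.0)
upper = post.inv_cdf(0.95)     # effect lies below this with probability 0.95
p_large = 1.0 - post.cdf(0.5)  # probability that the effect exceeds 0.5
print(f"95% upper bound: {upper:.2f}, P(effect > 0.5) = {p_large:.3f}")
```

The output reads directly as "the effect is unlikely to be larger than `upper`", which is the form of answer proposed above.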

**David Colquhoun on Bluesky** @david_colquhoun@mstdn.social · Dec 13, 2022, 19:17

@dlakelan @_jcken
Most people want to know whether the observed effect could plausibly have arisen by chance. I agree that that's a Bayesian question, but even a likelihood ratio, without Bayes at all, shows the problem with the p value approach.
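One simplified way to see the likelihood ratio point, assuming a single normally distributed z statistic (the papers linked earlier in the thread work with t tests and a specified power, so this is only a sketch). The alternative most favoured by the data is the one centred on the observed z, which gives an upper limit of exp(z²/2) on the likelihood ratio against the point null.

```python
import math
from statistics import NormalDist

def max_likelihood_ratio(z):
    """Largest possible LR for H1 vs H0: mu = 0, given one normal z statistic.

    The best-supported alternative is mu = z, so the ratio is
    phi(0) / phi(z) = exp(z^2 / 2).
    """
    return math.exp(z * z / 2.0)

std = NormalDist()
for p in (0.05, 0.01, 0.0027):
    z = std.inv_cdf(1.0 - p / 2.0)  # two-sided p value -> |z|
    print(f"p={p}: z={z:.2f}, max LR={max_likelihood_ratio(z):.1f}")
```

Even in this best case for H1, p = 0.05 corresponds to a likelihood ratio of under 7 to 1, far weaker evidence than "1 in 20" suggests, while three sigma gives roughly 90 to 1.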

**David Colquhoun on Bluesky** @david_colquhoun@mstdn.social · Dec 13, 2022, 19:19

@dlakelan @_jcken
The reason that I like the LR approach is that I don't think that experimenters will ever accept made-up prior distributions

**David Colquhoun on Bluesky** @david_colquhoun@mstdn.social · Dec 13, 2022, 17:54

@dlakelan @_jcken

That objection seems to be the ultimate pedantry!

**Daniel Lakeland** @dlakelan@mastodon.sdf.org · Dec 13, 2022, 18:25

@david_colquhoun @_jcken

The point is basically that the more you pay, the more very small true positives you will find. Since we have biology experiments in which people gain hundreds of millions of data points in sequence experiments, the point that there is no such thing as a point null is relevant to the real world.
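A small numerical sketch of this point, assuming a two-sample z test with known, equal variances and an observed difference exactly equal to a fixed tiny true difference. The effect size of 0.001 standard deviations and the sample sizes are made-up values.

```python
import math

def two_sample_z_pvalue(delta_sd, n_per_group):
    """Two-sided p value when the observed mean difference equals delta_sd
    (in units of the per-observation standard deviation)."""
    z = delta_sd * math.sqrt(n_per_group / 2.0)
    return math.erfc(abs(z) / math.sqrt(2.0))

delta = 0.001  # a difference of one thousandth of a standard deviation
for n in (10**4, 10**6, 10**8):
    print(f"n per group = {n:>9}: p = {two_sample_z_pvalue(delta, n):.3g}")
```

The same negligible difference is invisible at ten thousand observations per group and overwhelmingly "significant" at a hundred million, so the p value mostly reflects how much data was bought.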

**Daniel Lakeland** @dlakelan@mastodon.sdf.org · Dec 13, 2022, 18:27

@david_colquhoun @_jcken

Raghu put it better than I can in a Mastodon post though: https://eighteenthelephant.com/2021/11/29/pushed-around-by-stars/

**Daniel Lakeland** @dlakelan@mastodon.sdf.org · Dec 13, 2022, 18:44

@david_colquhoun @_jcken

I can't emphasize enough how great Raghu's blog post is. He calculates the time it takes a 1 gram mass 1 light year away to forever perturb the molecular trajectories of a gas in a balloon here on earth... it's microseconds. The upshot is that children playing basketball down the street affect the rate of macular degeneration in hospital patients etc... literally everything affects everything else; the only question is how big the effect is and whether we care about it.
