Did some actual statistics (!!) with #TidyTuesday this week. It doesn't seem to be very well known, but Pukelsheim's 3 sigma rule lets you make robust inferences with very few assumptions. Here it detects the impact of covid lockdowns and whatever else happened in 2021. The model is an AR(1) model that uses data up to time t to predict what happens at t+1. No predictions are made for the first 12 time points.
Code: https://github.com/jcken95/tidytuesday/tree/main/2022/2022-12-13
Paper on 3 sigma rule (paywall?): https://jstor.org/stable/2684253
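For anyone who wants the gist without opening the repo: a minimal sketch of the idea in Python, not the code linked above. It assumes a least-squares AR(1) fit on an expanding window and a plain "miss by more than 3 residual standard deviations" flag; the function name `ar1_three_sigma`, the simulated series, and the shock at index 40 are all made up for illustration. The rule rests on the Vysochanskij-Petunin bound: for a unimodal distribution, the probability of landing more than 3 sigma from the mean is at most 4/81, just under 5%. That bound is for a known mean and sigma, so with estimated values it is only approximate.

```python
import math
import random


def ar1_three_sigma(y, burn_in=12, k=3.0):
    """One-step-ahead AR(1) forecasts on an expanding window.

    For each t >= burn_in, fit y[s] = c + phi * y[s-1] + e[s] by least squares
    using y[:t] only, predict y[t], and flag it if it misses by more than
    k residual standard deviations.
    """
    results = []
    for t in range(burn_in, len(y)):
        prev, curr = y[: t - 1], y[1:t]  # lagged pairs built from the past only
        n = len(prev)
        mean_prev = sum(prev) / n
        mean_curr = sum(curr) / n
        sxx = sum((a - mean_prev) ** 2 for a in prev)
        sxy = sum((a - mean_prev) * (b - mean_curr) for a, b in zip(prev, curr))
        phi = sxy / sxx if sxx > 0 else 0.0
        c = mean_curr - phi * mean_prev
        resid = [b - (c + phi * a) for a, b in zip(prev, curr)]
        sigma = math.sqrt(sum(r * r for r in resid) / (n - 2))
        pred = c + phi * y[t - 1]
        results.append((t, pred, sigma, abs(y[t] - pred) > k * sigma))
    return results


# Simulated data (an assumption, not the TidyTuesday data): AR(1) noise
# with one large additive shock at index 40.
random.seed(1)
series = [0.0]
for _ in range(59):
    series.append(0.6 * series[-1] + random.gauss(0.0, 1.0))
series[40] += 20.0

flagged = [t for t, _, _, is_out in ar1_three_sigma(series) if is_out]
print("flagged time points:", flagged)
```

With `burn_in=12` the first 12 points get no prediction, matching the description above. Points right after a shock may also get flagged, since the shock feeds into the next forecast.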
If we think of a "discovery" as "a low p value for the hypothesis of a continuous parameter equal to 0" then there is no such thing as a false discovery. At least not outside a few tiny narrow areas like say particle physics.
Of course, there are binary or discrete parameters: has COVID vs doesn't have COVID for example, or has 0, 1, or 2 copies of a gene. In those cases the result is probably more relevant.
Also I like how you call out in your paper: "Of course the number will be right only if all the assumptions made by the test were true. Note that the assumptions include the proviso that subjects were assigned randomly to one or the other of the two groups that are being compared. This assumption alone means that significance tests are invalid in a large proportion of cases in which they are used." That is absolutely true but usually ignored!
I take issue with this though in reality: "For example, if the tests were on a series of homeopathic ‘remedies’, none would have a real difference because the treatment pills would be identical with the placebo pills."
Of course if you manufacture thousands of placebo pills and you split them randomly into two groups there would be some variation from pill to pill, and even in the average between the two groups, small though it would be.
Even so, there would be some difference between the two groups' outcomes. The key is that such differences would be, practically speaking, about the size of the fluctuating differences between people due to their different diets, different clothing, different commute times, different exposure to loud noises... etc. Virtually nothing is zero in reality. Testing whether an effect is zero is really testing whether the difference is too small to bother measuring.
@dlakelan @_jcken
You are not asserting that the point null is true (for NHST, or LR approaches); you are asking what would happen if it were true (can't do statistics without subjunctives!). If your observations are as probable under H0 as under H1, the evidence for H1 is weak. It makes no difference to that conclusion if there is a tiny effect.
I just wish that experimenters understood that! They say "there is a true effect p = 0.032". What they should say is "a largely useless default model doesn't explain the data very well p = 0.032"
The question to ask is "how big was the effect of the intervention?" and the answer should be something like "the intervention effect was unlikely to be larger than ..."
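A statement of that form needs a probability distribution over the effect size. One rough way to get one, sketched here under assumptions that are mine and not from the thread: a normal prior on the effect, a normal estimate with known standard error, and the usual conjugate update. The function names and the numbers in the example are hypothetical.

```python
import math


def posterior_normal(estimate, std_error, prior_mean=0.0, prior_sd=1.0):
    """Conjugate normal-normal update: returns posterior mean and sd of the effect."""
    prior_prec = 1.0 / prior_sd ** 2
    data_prec = 1.0 / std_error ** 2
    post_var = 1.0 / (prior_prec + data_prec)
    post_mean = post_var * (prior_prec * prior_mean + data_prec * estimate)
    return post_mean, math.sqrt(post_var)


def prob_effect_exceeds(threshold, post_mean, post_sd):
    """Posterior probability that the effect is larger than `threshold`."""
    z = (threshold - post_mean) / post_sd
    return 0.5 * math.erfc(z / math.sqrt(2.0))


# Hypothetical numbers: observed effect 0.3 with standard error 0.2.
m, s = posterior_normal(estimate=0.3, std_error=0.2, prior_mean=0.0, prior_sd=1.0)
print(f"P(effect > 1.0) = {prob_effect_exceeds(1.0, m, s):.2e}")
```

The answer depends on the prior, which is rather the point: "unlikely to be larger than ..." only makes sense once you say what you believed beforehand.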
"effect is unlikely to be..." means quantifying the probability that the effect size takes a certain value, which means Bayesian probability.
The more you pay, the more very small true positives you will find; that is basically the point. Since we have biology experiments in which people gather hundreds of millions of data points in sequencing experiments, the point that there is no such thing as a point null is relevant to the real world.
Raghu put it better than I can in a Mastodon post though: https://eighteenthelephant.com/2021/11/29/pushed-around-by-stars/
I can't emphasize enough how great Raghu's blog post is. He calculates the time it takes a 1 gram mass 1 light year away to forever perturb the molecular trajectories of a gas in a balloon here on earth... it's microseconds. The upshot is that children playing basketball down the street affect the rate of macular degeneration in hospital patients etc... literally everything affects everything else, the only question is how big the effect is and whether we care about it.
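To put rough numbers on that, here is a small sketch assuming a two-sample z-test with known sigma where the observed difference happens to equal a tiny true difference. The function name and the effect size of 0.001 sigma are made up for illustration.

```python
import math


def two_sample_z_and_p(delta, sigma, n_per_group):
    """z statistic and two-sided normal p-value for a difference in means."""
    se = sigma * math.sqrt(2.0 / n_per_group)
    z = delta / se
    p = math.erfc(abs(z) / math.sqrt(2.0))  # equals 2 * (1 - Phi(|z|))
    return z, p


# A difference of one thousandth of a standard deviation, at growing sample sizes.
for n in (100, 10_000, 1_000_000, 100_000_000):
    z, p = two_sample_z_and_p(delta=0.001, sigma=1.0, n_per_group=n)
    print(f"n per group = {n:>11,}  z = {z:6.3f}  p = {p:.3g}")
```

The effect never changes; only the amount of data does. At small n the test sees nothing, and at a hundred million per group the same negligible difference comes out wildly "significant".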
@_jcken
Three sigma rule should be quite safe unless prior odds on H1 are low
https://royalsocietypublishing.org/doi/10.1098/rsos.140216
and
https://www.tandfonline.com/doi/full/10.1080/00031305.2018.1529622
#statistics
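A back-of-envelope version of the prior odds point, as a sketch only: it applies Bayes' rule to "significant at level alpha" (the simple "p less than" form), with an assumed power of 0.8. It is not the fuller "p equals" calculation, and the function name is hypothetical. The 0.0027 figure is the two-sided 3 sigma tail area under a normal assumption.

```python
def false_positive_risk(alpha, power, prior_prob_h1):
    """Fraction of 'significant' results that are false positives."""
    false_pos = alpha * (1.0 - prior_prob_h1)  # true nulls that pass the threshold
    true_pos = power * prior_prob_h1           # real effects that pass the threshold
    return false_pos / (false_pos + true_pos)


for alpha in (0.05, 0.0027):
    for prior in (0.5, 0.1, 0.01):
        fpr = false_positive_risk(alpha, power=0.8, prior_prob_h1=prior)
        print(f"alpha = {alpha:<6}  P(H1) = {prior:<4}  false positive risk = {fpr:.3f}")
```

With alpha = 0.05 and P(H1) = 0.1 this gives 0.36. The stricter threshold keeps the risk low until the prior probability of a real effect gets small.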