Over the last two weeks I have been asked many times what people should do if the 'best performing' model, as selected by AIC, VIF analysis, or forward selection, doesn't include the effects that represent their hypothesis. My hot take has been that model selection is almost always a bad idea.
It's so pervasive that I'm thinking I need to write a paper entitled "Why model selection is (almost) always a complete waste of time". Maybe @bobohara.bsky.social would be up on the soapbox with me here.
A statistical model is supposed to be an expression of your a priori beliefs of the system. If you have hypotheses then that, by definition, means you have a priori beliefs about the system. The way you test your hypotheses is by building a model that contains a mechanism to test those beliefs.
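To see why AIC-style selection can clash with a hypothesis, here's a minimal sketch (entirely my own illustration, with simulated data and made-up effect sizes): a real but weak hypothesised effect only "earns its place" under AIC if the improvement in fit beats the penalty of 2 per extra parameter, so selection can drop the very term the study was designed to test.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 50
x1 = rng.normal(size=n)  # nuisance predictor with a strong effect
x2 = rng.normal(size=n)  # the hypothesised predictor, with a weak true effect
y = 1.0 + 0.8 * x1 + 0.15 * x2 + rng.normal(size=n)

def aic(X, y):
    """Gaussian AIC (up to a constant) for an OLS fit: n*log(RSS/n) + 2*(k+1)."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    rss = np.sum((y - X @ beta) ** 2)
    n_obs, k = X.shape
    return n_obs * np.log(rss / n_obs) + 2 * (k + 1)

ones = np.ones(n)
full = np.column_stack([ones, x1, x2])  # contains the hypothesised term
reduced = np.column_stack([ones, x1])   # the term selection may prefer to drop

aic_full, aic_reduced = aic(full, y), aic(reduced, y)
# If the fit gain from x2 doesn't beat the +2 penalty, AIC favours the
# reduced model even though the hypothesis is about x2's coefficient.
```

The point of the sketch is not which model "wins" on any particular draw; it's that the pre-specified model containing the hypothesised term is the one that actually answers the question, and the fitted coefficient and its uncertainty from that model are the evidence, regardless of what AIC prefers.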