Stats Whizz(es)???


I know (well, I have been told) that it is wrong to delete non-significant independent variables from a regression analysis, and I have a basic idea of why (e.g. it leads to exaggeration of the remaining effects). But I am writing up my modelling process in minute detail for my dissertation, and I was wondering whether anyone has a full explanation of why we should not delete non-significant variables, or perhaps some references?

Thanks
Jim ,-)

sneaks

I would say it depends on what those variables are. If they are part of your hypothesis, or you are controlling for them because previous literature says they are important or that there will be a relationship, then it is important to leave them in.

However, if you are adding in a load of control variables (e.g. age, gender) that the literature has not linked to the outcome, or that you have no real reason to include (apart from showing off that you can do hierarchical regression, as my students seem to think is necessary), then I'd take them out.

Of course, it's really important to check for multicollinearity, to make sure they are truly non-significant and that you're not missing a potentially significant effect because of suppression.
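
As a rough illustration of that multicollinearity check, here is a minimal sketch in plain Python. It assumes only two predictors, in which case the variance inflation factor (VIF) reduces to 1 / (1 - r^2), where r is the Pearson correlation between them. The function names and the example numbers are made up for illustration and are not from any particular stats package; with more than two predictors you would regress each predictor on all the others instead.

```python
import math


def pearson_r(x, y):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def vif_two_predictors(x1, x2):
    """VIF for a model with exactly two predictors: 1 / (1 - r^2).

    Hypothetical helper for illustration only. Returns infinity when the
    two predictors are perfectly correlated.
    """
    r_squared = pearson_r(x1, x2) ** 2
    if r_squared >= 1.0:
        return math.inf
    return 1.0 / (1.0 - r_squared)


# Made-up data: x2 is almost exactly 2 * x1, so the two are nearly collinear.
x1 = [1, 2, 3, 4, 5, 6]
x2 = [2.1, 3.9, 6.2, 7.8, 10.1, 11.9]
print("VIF, nearly collinear pair:", vif_two_predictors(x1, x2))

# Made-up data: b is a squared, so the linear correlation is zero.
a = [-2, -1, 0, 1, 2]
b = [4, 1, 0, 1, 4]
print("VIF, uncorrelated pair:", vif_two_predictors(a, b))
```

A commonly quoted rule of thumb treats a VIF above roughly 5 or 10 as a warning sign. In that situation a predictor can look non-significant only because it shares most of its variance with another predictor, so it is worth checking before deciding to drop it.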
