"how to treat outliers" in statistical analysis

Z

======= Date Modified 11 Aug 2011 08:32:51 =======
I have just got my thesis data in and started to clean it up, but I have found outliers in many of the variables. Left untreated, the data look slightly odd and show some strange patterns, which I suspect may be due to the outliers. I am thinking of using the 'truncation' method, in which extreme scores are recoded to the highest or lowest reasonable scores. However, I am not entirely sure about it, and I wonder how I would report it in a manuscript. It would be really great to see some comments on this. Thanks!

K

Hi
First, I would make sure that you define your outliers objectively. There is a simple rule of thumb for identifying possible outliers: anything below Q1 - 1.5*(Q3 - Q1) or above Q3 + 1.5*(Q3 - Q1), where Q1 is the 25th percentile and Q3 the 75th, so that Q3 - Q1 is the interquartile range.
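As a rough illustration of that rule, here is a minimal Python sketch. The function names and the `scores` data are made up for the example, and the "inclusive" quartile method is an assumption: different packages compute quartiles slightly differently, so the fences may not match SPSS exactly.

```python
import statistics

def iqr_fences(values, k=1.5):
    """Return (lower, upper) fences: Q1 - k*(Q3 - Q1) and Q3 + k*(Q3 - Q1)."""
    # Assumption: "inclusive" quartiles; other quartile definitions give
    # slightly different fences.
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr

def flag_outliers(values, k=1.5):
    """Return the values that fall outside the fences."""
    lower, upper = iqr_fences(values, k)
    return [v for v in values if v < lower or v > upper]

# Made-up scores for illustration only.
scores = [12, 14, 15, 15, 16, 17, 18, 19, 21, 48]
print(iqr_fences(scores))     # (9.375, 24.375)
print(flag_outliers(scores))  # [48]
```

Values flagged this way are only candidates for a closer look, not automatically errors.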

Then, I would just re-run the analyses with the identified outliers removed and refer to it as a 'sensitivity analysis' to see whether it affects your conclusions.
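A sensitivity analysis of this kind might look something like the sketch below, which compares simple summaries with and without the flagged cases. The helper names and data are hypothetical, and in practice you would re-run your actual models rather than just means and medians.

```python
import statistics

def iqr_outlier_mask(values, k=1.5):
    """True for each value outside Q1 - k*IQR .. Q3 + k*IQR."""
    # Assumption: "inclusive" quartiles, as one of several common definitions.
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [v < lower or v > upper for v in values]

def sensitivity_summary(values, k=1.5):
    """Summarise the data with and without the flagged outliers."""
    mask = iqr_outlier_mask(values, k)
    kept = [v for v, is_outlier in zip(values, mask) if not is_outlier]
    return {
        "n_all": len(values),
        "n_removed": sum(mask),
        "mean_all": statistics.mean(values),
        "mean_without_outliers": statistics.mean(kept),
        "median_all": statistics.median(values),
        "median_without_outliers": statistics.median(kept),
    }

# Made-up scores for illustration only.
scores = [12, 14, 15, 15, 16, 17, 18, 19, 21, 48]
summary = sensitivity_summary(scores)
for name, value in summary.items():
    print(name, value)
```

If the two sets of results lead to the same conclusion, the outliers are not driving it; if they differ, both are worth reporting.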

Good luck.

Z

@Kbara, thanks for your reply. I like the term "sensitivity analysis". I used the SPSS Explore function to identify the outliers, but instead of deleting the data, I replaced the outliers with the closest scores within the normal range, one of the most sensible methods I have read about in several statistics books. I do have quite a few outliers for almost all my variables, though. The change makes the data look more 'normal', but I have never seen anyone discuss it in a paper. I am wondering whether there is something wrong with my measures, or whether it is just the nature of social science data. I guess I am worried about how to defend what I am doing to the data. I would really appreciate more comments on this matter. Thanks.
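For anyone wanting to reproduce this outside SPSS, the recoding described above could be sketched roughly as follows. This is only one reading of "closest to normal range": each flagged value is replaced by the most extreme observed value that is not itself flagged. The function name, the 1.5*IQR fences, and the data are assumptions for illustration.

```python
import statistics

def recode_to_nearest_inlier(values, k=1.5):
    """Replace values outside the IQR fences with the nearest non-outlying
    observed value. Returns (recoded_values, number_of_values_changed)."""
    # Assumption: "inclusive" quartiles and 1.5*IQR fences define "normal range".
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    inliers = [v for v in values if lower <= v <= upper]
    low, high = min(inliers), max(inliers)
    # Clamping to [low, high] leaves inliers untouched and pulls outliers in.
    recoded = [min(max(v, low), high) for v in values]
    n_changed = sum(1 for old, new in zip(values, recoded) if old != new)
    return recoded, n_changed

# Made-up scores for illustration only.
scores = [12, 14, 15, 15, 16, 17, 18, 19, 21, 48]
recoded, n_changed = recode_to_nearest_inlier(scores)
print(recoded)    # [12, 14, 15, 15, 16, 17, 18, 19, 21, 21]
print(n_changed)  # 1
```

Returning the count of changed values makes it easy to state, per variable, how many scores were recoded.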

H

Before you alter the values, check whether any of them could be coding/typing errors.

Make sure you document all the changes you make and record this when you write up.
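One way to keep such a record is to log every change at the moment it is made. The sketch below is only an illustration: the field names (`case_id`, `age`), the correction itself, and the helper functions are all hypothetical.

```python
import csv
import io

LOG_FIELDS = ["case_id", "variable", "old", "new", "reason"]

def apply_and_log(records, variable, recode, reason, log):
    """Recode one variable in place and append a log row for every change."""
    for record in records:
        old = record[variable]
        new = recode(old)
        if new != old:
            log.append({"case_id": record["case_id"], "variable": variable,
                        "old": old, "new": new, "reason": reason})
            record[variable] = new

def log_to_csv(log):
    """Render the change log as CSV text, e.g. for an appendix or audit file."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=LOG_FIELDS)
    writer.writeheader()
    writer.writerows(log)
    return buffer.getvalue()

# Made-up records: an age of 230 is assumed here to be a typing error for 23.
records = [{"case_id": 1, "age": 21}, {"case_id": 2, "age": 230}]
change_log = []
apply_and_log(records, "age", lambda v: 23 if v == 230 else v,
              "typing error, checked against questionnaire", change_log)
print(log_to_csv(change_log))
```

The resulting table gives you the counts and reasons you need when describing the data cleaning in the write-up.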
