NavList:
A Community Devoted to the Preservation and Practice of Celestial Navigation and Other Methods of Traditional Wayfinding
Re: Rejecting outliers: was: Kurtosis.
From: George Huxtable
Date: 2011 Jan 2, 17:27 -0000
Marcel Tschudin wrote-

| For any symmetrical distribution mean, median and mode are identical.
| If you have good reason to believe that the measured data are expected
| to be symmetrically distributed, the median can be used. A
| considerable difference between mean and median indicates either the
| existence of outliers or the possible existence of a skew
| distribution. In "normal" datasets there is not a great difference
| between the mean and the median. In small datasets an outlier
| contributes too much. One outlier within e.g. 6 data contributes 17%
| (if not weighted) whereas the outlier may in reality have a much lower
| probability. The median thus helps to "correct" the influence of
| outliers in small datasets.

It is true that in a symmetrical distribution, the mean and the median are the same, at the centre of symmetry. But we're not discussing the value of the answer, but the SCATTER in that value.

And that has sent me to an oldish textbook, M J Moroney's "Facts from figures" (1951). It reminds me that, in a Gaussian distribution, the standard error of the mean is sigma / root-n, whereas the standard error of the median is 1.25 x sigma / root-n, where sigma is the standard deviation of the individual observations, and n is the number of observations.

So, with a Gaussian distribution, if you take the median instead of the mean, then you will need to take 1.25-squared (about 1.56) times as many observations to get as good an answer.

But really, you'll need someone who knows much more about statistics than I do, to explain further.

George.

contact George Huxtable, at george{at}hux.me.uk
or at +44 1865 820222 (from UK, 01865 820222)
or at 1 Sandy Lane, Southmoor, Abingdon, Oxon OX13 5HX, UK.
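
For anyone who wants to check the 1.25 factor quoted above from Moroney without going back to the textbook, here is a rough Monte Carlo sketch in Python. It is not part of the original message. The function name, the number of trials, the sample size and the seed are all illustrative choices of mine. It draws many Gaussian samples of n observations each, takes the mean and the median of every sample, and compares the scatter of the two. The figure of 1.25 is the large-sample value, root(pi/2), so small n may give a somewhat different ratio.

```python
import math
import random
import statistics


def scatter_of_mean_and_median(n, trials=4000, sigma=1.0, seed=1):
    """Estimate the standard error of the mean and of the median
    by simulation, for samples of n Gaussian observations.

    All parameter defaults are arbitrary illustrative choices.
    """
    rng = random.Random(seed)  # fixed seed so runs are repeatable
    means = []
    medians = []
    for _ in range(trials):
        # One simulated set of n observations with true value 0
        sample = [rng.gauss(0.0, sigma) for _ in range(n)]
        means.append(statistics.fmean(sample))
        medians.append(statistics.median(sample))
    # The scatter of each estimator is the standard deviation
    # of its values across all the simulated samples.
    return statistics.pstdev(means), statistics.pstdev(medians)


if __name__ == "__main__":
    n = 101
    se_mean, se_median = scatter_of_mean_and_median(n)
    print("sigma / root-n         :", 1.0 / math.sqrt(n))
    print("simulated SE of mean   :", se_mean)
    print("simulated SE of median :", se_median)
    print("ratio median / mean    :", se_median / se_mean)
    print("root(pi/2)             :", math.sqrt(math.pi / 2))
```

With a large odd n the printed ratio should come out close to 1.25, and squaring it gives the factor by which the number of observations must be increased when using the median. Note that this only covers the Gaussian case George describes; with real outliers present, the comparison can go the other way.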