NavList:
A Community Devoted to the Preservation and Practice of Celestial Navigation and Other Methods of Traditional Wayfinding
Re: Rejecting outliers: was: Kurtosis.
From: Geoffrey Kolbe
Date: 2010 Dec 31, 22:21 +0000
George wrote:

>The threadname is changed once again, from "kurtosis" (a mathematician's word far beyond the vocabulary of navigators, which displays Frank's erudition) to the more familiar "Rejecting outliers", which is what the discussion seems to be really about.
>
>I was trying to discover exactly what Peter Fogg himself was actually claiming his procedure could accomplish.

and

>And I used the word "magic" to describe that procedure, because nowhere, that I can recall, has Peter Fogg explained, in numerical terms that we might agree on (or otherwise) what his criteria are for accepting some observations and rejecting others.

I have to say that I share George's disquiet about the notion of rejecting outliers simply because they do not seem to fit with the other data. Perhaps it is because, like George, I have a background as an experimental physicist, and to an experimental physicist the notion of rejecting some data simply because it does not sit neatly with the rest of the data is anathema. Experimental data is usually messy, and experience shows that a lot can be learned from considering the possible causes of outliers. Simply ignoring outliers as "bad data", without which the data set would look a lot prettier and be a lot more impressive in the publication, can come back to haunt one in the end when someone (usually oneself) repeats the experiment....

Frank said that the rejection of outliers was quite acceptable and directed me to look (for example) at Chauvenet. I promised I would and I did. (Volume two, page 558, "Criterion for the rejection of doubtful observations") It seems to be a Chi Squared test based on two or more purely random, Gaussian, distributions. Chi Squared tests are useful if the data cannot be repeated - such as for observations of a rare astronomical phenomenon or a space-borne experiment - and you are trying to wring the last bit of precision from the data. But applying such a statistical sledgehammer to a set of five or six sextant altitude sightings is - I respectfully submit - hardly worthwhile. The navigator's time would be better spent taking another round of sights to force better precision on the mean than applying a statistical eraser to doubtful data.

Even if taking more sights is not practical, outliers should not be discarded unless a good reason presents itself as to why they should be discarded. The consequence may be a rather more open cocked hat or a fix of somewhat looser precision than one would like. But better that than discarding "bad data" and risking a false sense of security from the resulting tight fix.

Geoffrey Kolbe
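
For readers who have not met the criterion mentioned above, here is a minimal sketch of the rule as it is usually stated in modern textbooks, which may differ in detail from the text on Chauvenet's page 558: a reading is flagged when, under a Gaussian model fitted to the sample, fewer than half an observation would be expected to lie that far from the mean. The function names and the six sight errors (in arcminutes) are invented for illustration and are not taken from any real round of sights. The second function shows the alternative of more sights: the standard error of the mean falls as one over the square root of the number of sights.

```python
import math
import statistics


def chauvenet_flags(values):
    """Return True for each value the textbook Chauvenet rule would reject.

    Assumes the readings are Gaussian about their mean; a value is flagged
    when n * P(|deviation| at least this large) is below 0.5.
    """
    n = len(values)
    mean = statistics.mean(values)
    s = statistics.stdev(values)  # sample standard deviation (n - 1)
    flags = []
    for v in values:
        z = abs(v - mean) / s
        p_two_tailed = math.erfc(z / math.sqrt(2))
        flags.append(n * p_two_tailed < 0.5)
    return flags


def standard_error(values):
    """Standard error of the mean: sample standard deviation / sqrt(n)."""
    return statistics.stdev(values) / math.sqrt(len(values))


# Hypothetical sight errors in arcminutes (illustrative only).
sights = [0.3, -0.5, 0.1, 0.6, -0.2, 3.5]

for value, rejected in zip(sights, chauvenet_flags(sights)):
    print(f"{value:+.1f}'  {'flagged' if rejected else 'kept'}")

print(f"standard error of the mean: {standard_error(sights):.2f}'")
```

With only six readings the rule flags anything beyond roughly 1.7 sample standard deviations, and the suspect reading itself inflates that standard deviation, so the outcome on so small a sample depends heavily on the very point being judged.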