# NavList:

## A Community Devoted to the Preservation and Practice of Celestial Navigation and Other Methods of Traditional Wayfinding

**Re: Celnav article in Ocean Navigator Magazine**

**From:** Frank Reed

**Date:** 2016 May 11, 15:20 -0700

Stan, you quoted yourself from a conversation in April in which you said:

"Someone would have to do a statistical study to see if using the median works any better than using the mean, especially after outliers are removed"

Using the median is a good method for eliminating outliers, but a disciplined averaging method that rejects outliers based on their deviation (which we have discussed before) would probably yield slightly better results. Some numerical experiments I have run (see PS) support this. I would not be surprised, however, if the difference is insignificant in practice, and using the median may be easier to explain to navigation students. That would be a big plus in its favor.
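For anyone who wants to try the comparison, here is a minimal Python sketch of the three estimators. The rejection rule is an assumption made for illustration only (drop any value farther than a fixed `limit` from the median, then average the rest); it is not necessarily the method discussed earlier on the list, and the five intercepts are made-up values.

```python
from statistics import mean, median

def clipped_mean(values, limit):
    """Average the values lying within `limit` of the median.

    Assumed rejection rule, for illustration only. Falls back to the
    median itself if every value is rejected.
    """
    centre = median(values)
    kept = [v for v in values if abs(v - centre) <= limit]
    return mean(kept) if kept else centre

# Hypothetical intercepts in minutes of arc; the fourth is a "wild" one.
intercepts = [0.4, -0.7, 0.1, 6.3, -0.3]

print("plain mean   :", round(mean(intercepts), 3))
print("median       :", round(median(intercepts), 3))
print("clipped mean :", round(clipped_mean(intercepts, limit=3.0), 3))
```

The plain mean is dragged toward the outlier, while the median and the clipped mean are not. Which of the latter two is better on average is exactly what a statistical study would have to settle.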

You also wrote:

"Assuming a run of five sights, using the mean you would have to do five time averages, five altitude averages, and one sight reduction. Using the median you would have to do five sight reductions. Looks like a wash to me."

Well, not really. I would say you would have to do 0.5 sight reductions (equivalent work) for the median method. We don't need to clear each and every sight by the full methodology. Assuming that we take five or even fifteen sights once a minute from the same location, or from a moving vessel with a simple correction for motion, the computed altitudes will change with near-perfect linearity. That means we only need to calculate the rate of change of the altitude; the intercepts are then just the deviations from a line with that slope. Clearing a bunch of sights taken over a short period of time to get intercepts is (almost) exactly the same thing as plotting them against a straight line based on the expected rate of altitude change. So this is computationally "cheap" even if we don't do it on a computer or similar device.
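As a rough sketch of that idea in Python: given one computed altitude at a reference time and the expected rate of change, every intercept is just the observed altitude minus the value on the line. The function name, the units (minutes of arc, minutes of time), and the sample numbers are all assumptions chosen for illustration.

```python
from statistics import median

def intercepts_from_rate(times_min, observed_arcmin, hc_ref_arcmin, t_ref_min, rate):
    """Intercepts as deviations from a straight line of known slope.

    hc_ref_arcmin : computed altitude at the reference time (one full reduction)
    rate          : expected altitude change, arcminutes per minute of time
    """
    return [
        ho - (hc_ref_arcmin + rate * (t - t_ref_min))
        for t, ho in zip(times_min, observed_arcmin)
    ]

# Hypothetical run: five sights one minute apart, body rising 10' per minute.
times = [0.0, 1.0, 2.0, 3.0, 4.0]
observed = [1800.5, 1810.2, 1820.9, 1829.8, 1845.0]  # arcminutes; last one is wild

result = intercepts_from_rate(times, observed, hc_ref_arcmin=1800.0,
                              t_ref_min=0.0, rate=10.0)
print([round(a, 1) for a in result])   # [0.5, 0.2, 0.9, -0.2, 5.0]
print(round(median(result), 1))        # 0.5
```

Only one full sight reduction (for `hc_ref_arcmin`) plus the rate is needed; the rest is subtraction.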

An advantage of running a line through the sights, preferably one with a pre-computed slope, is that you are no longer limited by the actual sights that you have taken. Imagine looking at your watch and realizing that it's about three minutes before 19:00 UT. You start taking sights and manage five within five or six minutes. You plot them out, run a line through them, and then you read off the altitude from the line that you would have observed exactly at 19:00:00 UT. And because there is then no interpolation, there shall be much rejoicing... In fact, that makes life so much easier that I think I may have to incorporate it into my Modern Celestial methodology. If we could always arrange to take sights such that we can read off the altitude for the nearest six minutes of time, which makes all time values end on some multiple of 0.1 hours, then the computation is simpler.
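A minimal Python sketch of reading the altitude off the line, assuming the slope is already known: each sight is slid along the slope to the target time and the results are averaged, which is the least-squares line when the slope is held fixed. Times are minutes relative to 19:00:00 UT, and all the numbers are hypothetical.

```python
from statistics import mean

def altitude_at(target_min, times_min, altitudes_arcmin, rate):
    """Altitude on the fixed-slope line at `target_min`.

    Each sight is shifted along the known slope to the target time;
    the mean of the shifted values is the line's value there.
    """
    shifted = [h - rate * (t - target_min)
               for t, h in zip(times_min, altitudes_arcmin)]
    return mean(shifted)

# Hypothetical sights: minutes relative to 19:00:00 UT, altitudes in arcminutes.
times = [-3.0, -1.75, -0.5, 1.0, 2.25]
alts = [2439.3, 2428.6, 2419.1, 2407.5, 2396.8]
rate = -8.0  # assumed pre-computed slope, arcminutes per minute (body descending)

h_1900 = altitude_at(0.0, times, alts, rate)
degrees, minutes = divmod(h_1900, 60.0)
print(f"Altitude at 19:00:00 UT: {int(degrees)}° {minutes:.1f}'")
```

Swapping `mean` for `median` in the last line of the function would give the outlier-resistant variant.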

Frank Reed

PS: For those interested in numerical experiments, you can generate sights with outliers by using a mixed normal distribution. With probability p, typically rather close to 1, you would choose random variables from a normal distribution with some relatively "good" standard deviation (e.g. 1.0 minutes of arc in the altitudes) and mean equal to zero. And with some rather small probability q, such that p+q=1, you would pull your random variables from a distribution with a rather "poor" standard deviation. For example, set p=0.9 and q=0.1 (necessarily) with σ₁=1.0 and σ₂=3.0. The low probability "q" portion will yield outliers that correspond rather well to the sort of thing we see in manual celestial navigation. You can generate normally distributed random variables in modern spreadsheets using the NORMINV() function (use NORMINV(RAND(),0,3.0) to generate random data with a mean of 0 and standard deviation of 3.0) or if you prefer to "roll your own" you can use the standard Box-Muller algorithm. Using a mixed distribution like this, you can simulate celestial sights in a fashion that closely resembles real data, complete with "wild" outliers that occur with far greater frequency than found in a common normal distribution. This produces that "kurtosis" that I have written about quite a few times before...
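Here is one way the recipe might look in Python, as a sketch rather than the code behind the experiments mentioned above. It uses the basic Box-Muller transform and the example parameters (q=0.1, standard deviations of 1.0 and 3.0), then estimates the RMS error of the mean and of the median over many simulated five-sight runs. The function names, the seed, and the number of runs are arbitrary choices.

```python
import math
import random
from statistics import median

def box_muller(rng):
    """One standard normal deviate from two uniform deviates (basic Box-Muller)."""
    u1 = 1.0 - rng.random()  # lies in (0, 1], so log(u1) is always defined
    u2 = rng.random()
    return math.sqrt(-2.0 * math.log(u1)) * math.cos(2.0 * math.pi * u2)

def mixed_normal(rng, q=0.1, sigma_good=1.0, sigma_poor=3.0):
    """Zero-mean error: 'poor' sigma with probability q, 'good' sigma otherwise."""
    sigma = sigma_poor if rng.random() < q else sigma_good
    return sigma * box_muller(rng)

def average(values):
    return sum(values) / len(values)

def rms_error(estimator, rng, sights_per_run=5, runs=20000):
    """RMS of the estimator's result over many simulated runs (true value is 0)."""
    total = 0.0
    for _ in range(runs):
        run = [mixed_normal(rng) for _ in range(sights_per_run)]
        total += estimator(run) ** 2
    return math.sqrt(total / runs)

rng = random.Random(2016)  # fixed seed so a run can be repeated
print("RMS error of mean  :", round(rms_error(average, rng), 3))
print("RMS error of median:", round(rms_error(median, rng), 3))
```

Any other estimator that takes a list of errors and returns a single number, such as an average with outlier rejection, can be passed to `rms_error` in the same way.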