Welcome to the NavList Message Boards.

NavList:

A Community Devoted to the Preservation and Practice of Celestial Navigation and Other Methods of Traditional Wayfinding

    Re: Rejecting outliers
    From: Peter Hakel
    Date: 2011 Jan 1, 11:37 -0800
    Fred Hebard wrote:

    All of this discussion could be informed immensely by some data and associated analyses.  Data talk.

    =============================================================================

    In the attached files I modified my earlier example as follows:

    file source_data.png:
    The six altitudes (column C, with #5 off from the linear trend by only 1 degree this time) are each perturbed by a random number between -0.5 and 0.5 (column B).  The sum, column D (= C + B), is entered as input into column B of average1.xls.

    file average1.xls:
    "Resolution" in F16 was changed to 0.5 degrees to roughly coincide with the spread of the random scatter in the data.

    The averaged value at UT=12:40:00 (when the "bad" altitude of 14.978 was recorded) is calculated to be 14.171, which is better than the H0=14.318 (file altitudes.png; cells J24, J25 in average1.xls) produced by the initial non-weighted least-squares fit.  In file weights.png we can see that the "bad" data point is not completely removed from consideration, but its influence on the final fit is reduced by a factor of 1.536 / 4 relative to the other five "good" data points.  The number "4" you see in column BL is 1 / "Resolution"-squared.

    The difference | H0-H_fit | = 14.318 - 14.171 = 0.147 could serve as a ballpark indicator of how much uncertainty is associated with this result.
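For anyone who would rather see the idea outside of a spreadsheet, here is a minimal Python sketch of an iteratively reweighted straight-line fit. It is not a transcription of average1.xls. The weighting rule, 1 / max(residual^2, Resolution^2), is only my assumption: it is consistent with the numbers quoted above (1 / (14.978 - 14.171)^2 is about 1.536, and 1 / 0.5^2 = 4), but the spreadsheet may do something different. The function names and the six sample points are made up for illustration; they are not the data in source_data.png.

```python
def weighted_line_fit(t, h, w):
    """Weighted least-squares line h = a + b*t; returns (a, b)."""
    sw = sum(w)
    t_mean = sum(wi * ti for wi, ti in zip(w, t)) / sw
    h_mean = sum(wi * hi for wi, hi in zip(w, h)) / sw
    s_tt = sum(wi * (ti - t_mean) ** 2 for wi, ti in zip(w, t))
    s_th = sum(wi * (ti - t_mean) * (hi - h_mean)
               for wi, ti, hi in zip(w, t, h))
    b = s_th / s_tt
    return h_mean - b * t_mean, b


def robust_line_fit(t, h, resolution, iterations=50):
    """Return (initial_fit, final_fit, weights).

    ASSUMED weighting rule (not confirmed by the spreadsheet):
    weight = 1 / max(residual**2, resolution**2).
    """
    weights = [1.0] * len(t)
    initial = weighted_line_fit(t, h, weights)  # plain, non-weighted fit
    a, b = initial
    for _ in range(iterations):
        residuals = [hi - (a + b * ti) for ti, hi in zip(t, h)]
        weights = [1.0 / max(r * r, resolution ** 2) for r in residuals]
        a, b = weighted_line_fit(t, h, weights)
    return initial, (a, b), weights


# Hypothetical data: a perfect line h = 10 + t, with point #5 pushed up by 1.
times = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
alts = [10.0, 11.0, 12.0, 13.0, 15.0, 15.0]

(a0, b0), (a1, b1), weights = robust_line_fit(times, alts, resolution=0.5)

t_bad = times[4]
h0 = a0 + b0 * t_bad      # non-weighted fit evaluated at the "bad" time
h_fit = a1 + b1 * t_bad   # reweighted fit evaluated at the same time
print("H0 =", h0, " H_fit =", h_fit, " |H0 - H_fit| =", abs(h0 - h_fit))
print("weights:", weights)
```

With these made-up points, the five "good" observations keep the full weight of 1 / Resolution^2 while the displaced one is down-weighted but not discarded, and | H0 - H_fit | comes out as a single number that can be read as the ballpark trust indicator.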

    Thus, an outlier is identified and not allowed to completely skew the final result (Peter Fogg's concern).  However, unless it is really crazy like my earlier 66, it is not completely removed from the data set, either (Geoffrey Kolbe's concern).  The calculated weights express how important this procedure considers each data point to be (George Huxtable's concern).  I propose the | H0-H_fit | quantity as a guide to the extent to which the final result can be trusted, which is every navigator's concern.  Sure, this does require a computer, which may not work when needed; but that is true of all machines to some extent, including chronometers and sextants.

    Happy New Year to all!   :-)


    Peter Hakel

    File: 115099.average1.xls
       