Welcome to the NavList Message Boards.


A Community Devoted to the Preservation and Practice of Celestial Navigation and Other Methods of Traditional Wayfinding

Compose Your Message

Add Images & Files
    Re: Rejecting outliers
    From: George Huxtable
    Date: 2011 Jan 2, 23:12 -0000

    I'm failing to understand some aspects of Peter Hakel's approach, though it
    least it seems numerical, definable, and repeatable.
    First, though, I should say that problems arise when viewing his Excel
    output. It looks as though some columns need to be widened to see their
    contents, but it seems that the file is deliberately crippled to prevent me
    taking many of the actions I am familiar with in Excel. But I should add
    that mine is an old 2000 version, which may be part of the problem. The
    alternative .png version opens with "paint" but only allows me to see the
    top-left area of the sheet. Is that all I need?
    Now to more substantive matters-
    I don't understand how the various weighting factors have been derived, and
    why many of them are exactly 1.0000, when others are much less.
    Presumably, they depend on the divergence of each data point from some
    calculated line, which is then readjusted by iteration, but I have failed
    to follow how that initial straight-line norm was assessed, or what
    algorithm was used to obtain the weights. Answers in words rather than in
    statistical symbols would be most helpful to my simple mind.
    You seem to have ended up with a best-fit slope of about 24' in a 5-minute
    period, as I did when allowing Excel to make a best-fit, when giving it
    freedom to alter the slope as it thought fit. But the slope can be
    pre-assessed with sufficient accuracy from known information, and unless
    there is some error in the information given, such as an (unlikely) star
    mis-identification, we can be sure that the actual slope is nearer to 32',
    and the apparent lower figure is no more than a product of scatter in the
    data. This is a point that Peter Fogg keeps reiterating, perhaps the only
    valid point in all he has written on this topic.
    As a result, we could, if we wished, subtract off that line of known
    constant slope from all the data, and end up with a set of numbers, all of
    which should be roughly equal, simply scattering around some mean value
    that we wish to discover. Then the statistical task of weeding outliers
    becomes somewhat simpler.
    If you apply your procedure to an artificially-generated  data-set,
    scattering in a known Gaussian manner about a known mean (which could be
    zero), and known to contain no non-Gaussian outliers, what is the resulting
    scatter in the answer? How does it compare with the predicted scatter from
    simple averaging? I suspect (predict) that it can only be worse, though
    perhaps not by much.
    This is the way I picture it. If there is any lopsidedness in the
    distribution, then each added observation that is in the direction of the
    imbalance will be given greatest weight, whereas any observation that would
    act to rebalance it, on the other side, will be attenuated, being further
    from the trend. So there will be some effect, however small, that acts to
    enhance any unbalance, though probably not to the extent of causing
    instability. Does that argument make sense to you? It could be checked out
    by some Monte Carlo procedure.
    I presume that the proposed procedure is entirely empirical, and has no
    theoretical backing, though there may not be anything wrong with that, if
    it works.
    contact George Huxtable, at  george@hux.me.uk
    or at +44 1865 820222 (from UK, 01865 820222)
    or at 1 Sandy Lane, Southmoor, Abingdon, Oxon OX13 5HX, UK.

    Browse Files

    Drop Files


    What is NavList?

    Join NavList

    (please, no nicknames or handles)
    Do you want to receive all group messages by email?
    Yes No

    You can also join by posting. Your first on-topic post automatically makes you a member.

    Posting Code

    Enter the email address associated with your NavList messages. Your posting code will be emailed to you immediately.

    Email Settings

    Posting Code:

    Custom Index

    Start date: (yyyymm dd)
    End date: (yyyymm dd)

    Visit this site
    Visit this site
    Visit this site
    Visit this site
    Visit this site
    Visit this site