How to handle statistical calculations? Outliers, max, min, etc.

    Question

  • Hi all,

    I'm new to statistical analysis and am trying to analyze a small group of numbers, but I'm having trouble working out how to handle certain calculations. My biggest problem is identifying the outlier in a small list of doubles. I have looked far and wide for a class or function that handles outliers, but nothing seems readily available outside the higher-end packages. I found a formula that helps identify the outlier, but I'm not sure how to handle the data.

    What I have so far:

    protected void doGrubbsTest(double invitroMean, double invitroStDev, List<invitro.invitroDetail> thisSample)
    {
        double grubbsOutlier0 = ((invitroMean - thisSample[0].uNDF) / invitroStDev);
        double grubbsOutlier1 = ((invitroMean - thisSample[1].uNDF) / invitroStDev);
        double grubbsOutlier2 = ((invitroMean - thisSample[2].uNDF) / invitroStDev);
        double grubbsOutlier3 = ((invitroMean - thisSample[3].uNDF) / invitroStDev);

        List<double> grubbsList = new List<double>();
        grubbsList.Add(grubbsOutlier0);
        grubbsList.Add(grubbsOutlier1);
        grubbsList.Add(grubbsOutlier2);
        grubbsList.Add(grubbsOutlier3);
        grubbsList.Sort();

        double maxValue = grubbsList.Max();
        double minValue = grubbsList.Min();
    }
    

    Basically, the Grubbs test helps determine which data point should be considered an outlier. I need to find which of these double values is the maximum; that one is considered the outlier.

    Here are my problems. When I find the max value of the list, that value is the calculated Grubbs value, but I need the corresponding thisSample[x].uNDF value so that I can remove that specific number from the list.

    I also need to determine which two of the four samples are the central results.

    Is there a better approach to this? Any help or advice is appreciated.

    Dave

     

    Monday, October 04, 2010 3:35 PM

Answers

  • I need to find which one of these double values is the max and that is to be considered the outlier.  

    Some of my problems.  When i find the max value of the list, this value is the calculated grubbvalue, I need the thisSample[x].uNDF value to be able to remove that specific number from the list. 

    I also need to determine out of the 4 samples, which two are the central results.


    Hi Dave,

    You seem to have already found a way to pick out the max/min value.

    I'm not quite clear on what you mean by "I need the thisSample[x].uNDF value to be able to remove that specific number from the list.", but it seems List<T>.Remove/RemoveAt/RemoveRange can help.
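
    One way to keep the link between a calculated Grubbs value and its original sample is to track the index rather than building a separate list of doubles. The following is only a sketch under assumptions: InvitroDetail is a hypothetical stand-in for invitro.invitroDetail (only the uNDF member comes from the question), GrubbsSketch and IndexOfMostExtreme are made-up names, and it uses the absolute deviation |mean - x| / stDev, which differs from the signed value in the posted code.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for invitro.invitroDetail; only uNDF is taken from the question.
public class InvitroDetail
{
    public double uNDF { get; set; }
}

public static class GrubbsSketch
{
    // Returns the index of the sample whose uNDF lies farthest from the mean,
    // measured in standard deviations: |mean - x| / stDev.
    public static int IndexOfMostExtreme(IList<InvitroDetail> samples, double mean, double stDev)
    {
        if (samples == null || samples.Count == 0)
            throw new ArgumentException("At least one sample is required.", "samples");
        if (stDev <= 0)
            throw new ArgumentException("Standard deviation must be positive.", "stDev");

        int bestIndex = 0;
        double bestScore = double.NegativeInfinity;
        for (int i = 0; i < samples.Count; i++)
        {
            double score = Math.Abs(mean - samples[i].uNDF) / stDev;
            if (score > bestScore)
            {
                bestScore = score;
                bestIndex = i;
            }
        }
        return bestIndex;
    }
}
```

    With the returned index, thisSample[index].uNDF gives the original value and thisSample.RemoveAt(index) removes that entry from the list. This only locates the candidate; deciding whether it really is an outlier still means comparing the statistic against a critical value, which the sketch does not do.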

    As for the central values, you can pick the value(s) at the central index (or indices) of a sorted list.
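
    As a rough illustration of picking the central value(s) from a sorted list, here is a minimal sketch. The class and method names (CentralValuesSketch, CentralValues) are made up for this example; it assumes that "central" means the single middle element for an odd count and the two middle elements for an even count, as with the four samples in the question.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class CentralValuesSketch
{
    // Returns the middle value(s) in ascending order:
    // one value for an odd count, two values for an even count.
    public static List<double> CentralValues(IEnumerable<double> values)
    {
        // Sort a copy so the caller's collection is left untouched.
        List<double> sorted = values.OrderBy(v => v).ToList();
        if (sorted.Count == 0)
            throw new ArgumentException("At least one value is required.", "values");

        int upperMiddle = sorted.Count / 2;
        if (sorted.Count % 2 == 1)
            return new List<double> { sorted[upperMiddle] };

        return new List<double> { sorted[upperMiddle - 1], sorted[upperMiddle] };
    }
}
```

    For the list in the question, the input could be built with thisSample.Select(s => s.uNDF), which needs System.Linq.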

    HTH.

    Thanks. 


    Figo Fei
    • Marked as answer by Figo Fei Tuesday, October 12, 2010 6:40 AM
    Tuesday, October 05, 2010 3:38 AM