 

Stronger correlation formula.

  • Saturday, November 07, 2009 8:54 AM, bardcan
     
    I have two series of numbers. Series A contains either 1s or 0s, depending on whether a patient took a pill or not. Series B contains random numbers. The series B numbers that coincide with the patient taking a pill have an average of 100, whereas those that coincide with NOT taking a pill average 101. There is a HUGE amount of data, so I am trying to find the formula that will show that there is a strong correlation between the two: that if the patient takes the pill, the most likely result is that their B measurement will go up by 1 point. A standard correlation coefficient shows a low correlation, around 0.15. Any help would be greatly appreciated.
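To make the numbers in the question concrete, here is a minimal Python sketch on synthetic data. Everything in it is an assumption for illustration: the sample size, the random seed, and above all the within-group spread of 3.3, which is chosen only so that the correlation lands near the 0.15 magnitude mentioned above (the post does not say how widely B varies). It computes the standard correlation between a 0/1 series and a numeric series (the point-biserial form of Pearson's r) and, alongside it, a Welch t statistic for the difference of the two group means. With the group averages as stated (100 with the pill, 101 without), both come out negative.

```python
import math
import random
import statistics


def split_by_flag(flags, values):
    """Split values into (taken, not_taken) lists according to the 0/1 flags."""
    taken = [v for f, v in zip(flags, values) if f == 1]
    not_taken = [v for f, v in zip(flags, values) if f == 0]
    return taken, not_taken


def point_biserial(flags, values):
    """Pearson correlation between a 0/1 series and a numeric series."""
    taken, not_taken = split_by_flag(flags, values)
    p = len(taken) / len(values)            # share of rows with flag == 1
    overall_sd = statistics.pstdev(values)  # population SD of all values
    diff = statistics.fmean(taken) - statistics.fmean(not_taken)
    return diff / overall_sd * math.sqrt(p * (1 - p))


def welch_t(flags, values):
    """Welch t statistic for the difference between the two group means."""
    taken, not_taken = split_by_flag(flags, values)
    std_err = math.sqrt(
        statistics.variance(taken) / len(taken)
        + statistics.variance(not_taken) / len(not_taken)
    )
    return (statistics.fmean(taken) - statistics.fmean(not_taken)) / std_err


# Synthetic stand-in for the data described in the question.
# ASSUMPTIONS: 100,000 rows, half of them pill-takers on average,
# normally distributed B with a within-group spread of 3.3.
rng = random.Random(0)
n_rows = 100_000
spread = 3.3
flags = [rng.randint(0, 1) for _ in range(n_rows)]
values = [rng.gauss(100.0 if f == 1 else 101.0, spread) for f in flags]

r = point_biserial(flags, values)
t = welch_t(flags, values)
print(f"correlation r = {r:.3f}")
print(f"Welch t       = {t:.1f}")
```

The two numbers answer different questions. The correlation measures how much of the spread in B is accounted for by the pill, which stays small when a 1-point shift sits inside a much larger spread. The t statistic measures how certain the 1-point difference is, and it grows with the square root of the number of rows, so on a very large data set it can be far from zero even when the correlation is low.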

Answers

  • Tuesday, November 10, 2009 5:36 AM, Bikash Dash
    Hi BardCan,
    Yes, you are in the right place.
    You can definitely use data mining for this.

    You can use cluster analysis to predict the value, i.e. "that if the patient takes the pill, the most likely result is that their B measurement will go up by 1 point".

    Take "patient taking a pill or not", along with any other attributes you have, as the input columns.
    The prediction column will be "B".

    Go to Business Intelligence Development Studio and create an Analysis Services data mining project for cluster analysis.


    Please Vote & "Mark As Answer" if this post is helpful to you.

    Cheers
    Bikash Dash
    MCDBA/MCITP
