Stratified random sample & Logistic regression


  • Hi

    When generating a stratified random sample one encounters the following scenario:

                 Apples   Oranges   Apples/Oranges %
    Population       10        20             50.00%
    Sample            3         2            150.00%
    Weight         3.33     10.00
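
    For reference, the Weight row is simply the population count divided by the sample count for each stratum. A minimal sketch of that arithmetic follows; the dictionary names are only illustrative.

```python
# Counts taken from the table above.
population = {"Apples": 10, "Oranges": 20}
sample = {"Apples": 3, "Oranges": 2}

# Weight per stratum = population count / sample count.
weights = {name: population[name] / sample[name] for name in population}

# Each sampled case "stands in" for this many cases in the population,
# so weight * sample count reproduces the population count.
restored = {name: weights[name] * sample[name] for name in sample}

for name in population:
    print(f"{name}: weight = {weights[name]:.2f}, restored count = {restored[name]:.0f}")
```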

    One reason to use a stratified random sample is that the number of Apples in the original population is very small. As a result, one may need to "stack" the sample, that is, oversample Apples relative to Oranges, in order to improve the predictive results.

    When the sample is fed into a logistic regression algorithm, the algorithm has no way of knowing the ratio of Apples to Oranges in the original population. I am aware that some logistic regression algorithms accept a "weighting" that accounts for the original population distribution.
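
    To illustrate what such a "weighting" does, here is a rough sketch in plain Python, not tied to any particular product. It assumes the simplest possible case, an intercept-only logistic model fitted by Newton's method, where each case's contribution to the log-likelihood is multiplied by its weight. The function names are my own.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_intercept(labels, weights, steps=50):
    """Fit an intercept-only logistic model with case weights (Newton's method)."""
    b = 0.0
    for _ in range(steps):
        p = sigmoid(b)
        # Gradient and curvature of the weighted log-likelihood in b.
        grad = sum(w * (y - p) for y, w in zip(labels, weights))
        hess = sum(w * p * (1.0 - p) for w in weights)
        b += grad / hess
    return b

# Sample from the table: 3 Apples (y = 1) and 2 Oranges (y = 0).
labels = [1, 1, 1, 0, 0]
# Weight = population count / sample count for each stratum.
weights = [10 / 3] * 3 + [20 / 2] * 2

# Predicted probability of "Apple" without and with the weights.
p_unweighted = sigmoid(fit_intercept(labels, [1.0] * len(labels)))
p_weighted = sigmoid(fit_intercept(labels, weights))
print(round(p_unweighted, 4), round(p_weighted, 4))
```

    Without weights the fitted probability of an Apple matches the sample share (3 of 5); with weights it matches the population share (10 of 30), which is the effect I am after.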

    I see that an earlier post asked a similar question about using a "weighting" with clustering and neural network/logistic regression algorithms.

    However, is anyone aware of a workaround for the scenario above?


    Sunday, April 25, 2010 2:57 PM