What is training data and testing data in SSAS Data Mining

  • Question

  • I am trying to understand data mining concepts, the algorithms, and the different mining models.

    I was able to create a data mining model successfully, but I do not understand the training data and testing data options when creating
    a mining structure in SSDT for SSAS.

    Thanks in advance.


    Friday, June 2, 2017 3:14 PM


  • Training and testing data are two subsets of the full data set: one is used to build a data mining model and the other to evaluate it.

    The training data is used to "train" the model so that it can make predictions. The testing data is then run through the trained model to check whether its predictions match the known outcomes.

    So, if the model learned from the training data that 1 + 1 = 3, but the testing data shows that 1 + 1 = 2, then the model predicts the wrong result in that scenario.
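
    To make the idea concrete, here is a minimal sketch in plain Python (not SSAS or DMX) of holding out part of the data, training on the rest, and checking predictions against the held-out rows. The 30% test fraction, the seed, the sample rows, and all function names are assumptions chosen for illustration only.

```python
import random

def split_holdout(rows, test_fraction=0.3, seed=42):
    """Shuffle rows reproducibly and hold out a fraction for testing."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]  # (training, testing)

def train_majority_model(training):
    """Toy 'model': for each group, remember the most common outcome."""
    counts = {}
    for group, bought in training:
        counts.setdefault(group, {True: 0, False: 0})[bought] += 1
    return {g: c[True] >= c[False] for g, c in counts.items()}

def accuracy(model, testing, default=False):
    """Fraction of held-out rows whose known outcome matches the prediction."""
    correct = sum(model.get(g, default) == bought for g, bought in testing)
    return correct / len(testing)

# Hypothetical data: (age_group, bought_bike)
rows = ([("young", True)] * 40 + [("young", False)] * 10
        + [("senior", True)] * 10 + [("senior", False)] * 40)

training, testing = split_holdout(rows)
model = train_majority_model(training)
score = accuracy(model, testing)
print(len(training), len(testing), score)
```

    The model never sees the testing rows while it is being built, so the accuracy score is an honest estimate of how it behaves on new data. It will rarely be 100%, because some rows go against the majority pattern the model learned.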

    This is all part of statistics and prediction. There is always some probability that the model predicts a wrong result, although most of the time it predicts correctly. That is the normal degree of variation.

    Think about the 2016 US election. It was predicted that Hillary would win certain states, but the people predicted to vote for her did not turn out in the same numbers as the people voting for Trump.

    It is science, but there are factors that cause invalid predictions.

    Thomas LeBlanc twitter ( @TheSmilingDBA )

    Friday, June 2, 2017 3:26 PM