
How to predict PK attribute?

  • Thursday, July 02, 2009 8:35 AM, Guennadiy Vanine
     
    I am creating a data mining model (using the Decision Trees and Clustering algorithms).

    The Key attribute cannot also be marked as Input or Predictable.

    What if I need to use the key column as an input and also predict it?
    Should I duplicate the column(s)?

    Why can't I predict the PK?


    Guennadi Vanine -- Gennady Vanin -- Геннадий Ванин

Answers

  • Tuesday, July 07, 2009 4:00 AM, Bogdan Crivat (Moderator)
    A PK is typically not interesting for predictions. Primary keys have distinct occurrences in a data set (a unique value for each row), and a single occurrence is not enough to learn patterns about a column. Also, primary key values generally carry no meaning of their own: a primary key may be a unique integer or a GUID.

    However, if your primary key actually holds significant information (numerical values, I would guess?), then you may duplicate it in your mining model. For example, you could create a model which uses the PK column as the key, then right-click the Mining Structure object, select Add Column, and add the PK column once more. The second copy is an ordinary column, so it can be marked as Input or Predictable.
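
    To make the reasoning concrete, here is a minimal sketch in plain Python (not the Analysis Services API). The rows, the column names CustomerId, Age and CustomerIdValue, and both helper functions are hypothetical, chosen only for illustration. It shows that every key value occurs exactly once, so there is nothing to learn from it, and that copying the key into a second column leaves the key itself untouched.

```python
from collections import Counter

# Hypothetical rows; CustomerId plays the role of the primary key.
rows = [
    {"CustomerId": 101, "Age": 34},
    {"CustomerId": 102, "Age": 51},
    {"CustomerId": 103, "Age": 34},
]


def value_counts(rows, column):
    """Count how often each value of `column` occurs."""
    return Counter(row[column] for row in rows)


def duplicate_column(rows, source, copy_name):
    """Return new rows with `source` copied into `copy_name`.

    The original rows are not modified.
    """
    return [{**row, copy_name: row[source]} for row in rows]


# Every key value appears once: no repeated occurrences to learn from.
key_counts = value_counts(rows, "CustomerId")

# An ordinary attribute repeats, so patterns can be found in it.
age_counts = value_counts(rows, "Age")

# The copy is a regular column that could serve as input or prediction
# target, while CustomerId remains the key.
with_copy = duplicate_column(rows, "CustomerId", "CustomerIdValue")
```

    In the mining structure itself the equivalent step is the Add Column action described above; the sketch only mirrors the idea on in-memory data.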


    bogdan crivat [sql server data mining] / http://www.bogdancrivat.net/dm
