Parent Child relationship column hint?

  • Question

  • I've got a dilemma which I hope someone has a solution to. 

    Let's say we're building a data mining model to predict aircraft reliability.  The training table has, among many other columns, a unique aircraft ID, a column for the type (737, 747), and a column for the series (100, 200, 300).  For example, a 737-800 would be "737" and "800".

    There is in essence a parent-child relationship between these two columns.  All 737s should share a common set of reliability factors, and those factors might then be refined by the series number (for instance, the 737 might have very reliable radar except for the 500 series).  The series is analogous to the model year of a car.  What I want to make sure doesn't happen is for the system to correlate a 747-400 and a 737-400 because they share the same series number.  They are totally independent if the type is different.

    My only idea was to merge the columns into a single value such as "737-100".  But then it would seem the model has no way of knowing that a "737-100" and a "737-200" should have a lot more in common than a "737-100" and a "747-100", because the merged values are completely different.

    I was hoping to find some sort of parent-child hint in the column properties but found none.

    What solutions have other people tried?  It sure seems that there should be an elegant solution for something like this, but I'm missing it.

    Geof

    Thursday, February 22, 2007 5:47 AM

Answers

  • You can still use two columns. The first is the type, the second is type+series:

     

    Type    TypeSeries
    737     737-100
    737     737-200
    747     747-100

     

    This solution is basically an extension of your proposed one. Please let me know if this works for you.
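As a rough illustration of the two-column layout, here is a minimal Python sketch. It is not tied to any particular data mining tool; the helper names `add_type_series` and `shared_features` and the sample aircraft IDs are hypothetical, and it assumes the raw series column is left out of the modeled inputs so that only Type and TypeSeries are compared.

```python
def add_type_series(rows):
    """Return copies of the rows with a combined TypeSeries column added."""
    return [
        {**row, "TypeSeries": f"{row['Type']}-{row['Series']}"}
        for row in rows
    ]


def shared_features(a, b, columns=("Type", "TypeSeries")):
    """List the modeled columns on which two rows have the same value."""
    return [c for c in columns if a[c] == b[c]]


# Hypothetical training rows; the AircraftID values are invented for illustration.
rows = add_type_series([
    {"AircraftID": "A1", "Type": "737", "Series": "100"},
    {"AircraftID": "A2", "Type": "737", "Series": "400"},
    {"AircraftID": "A3", "Type": "747", "Series": "400"},
])

# Same type, different series: the rows still share the Type value.
print(shared_features(rows[0], rows[1]))  # ['Type']

# Same series number, different type: nothing in the modeled columns matches.
print(shared_features(rows[1], rows[2]))  # []
```

With this layout, a 737-100 and a 737-400 still have the Type value in common, while a 737-400 and a 747-400 have no modeled value in common.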

     

    Thanks,

     

    Thursday, February 22, 2007 6:28 PM
    Answerer

All replies

  • Of course!  Thanks, I thought I was close.

    It worked just fine.

    Geof

    Saturday, February 24, 2007 5:24 AM