none
Market basket analysis

    Question

  • Hi!

     

    I have a problem. I would like to make Market basket analysis on data.

    Here is the example of data structure:

     

    Customers

    Customer_ID

    Gender

    Age

    City

     

    CustomerProducts

    Customer_ID

    Product_ID

     

    I've created mining association model where Customers is a case table and CustomerProducts is nested table.

    (I guess that association algorithm is the best for this kind of analysis).

     

    My questions are:

    1. How to write DMX query for next question:

       Give me the list of customers that are using Products R1, R2, ... Rn and have potential to buy Product Ri

       (List of customers with probabilities to buy Ri)

      

    2. DMX query for next:

       Same as 1. but customers filtered by some criteria, for example Gender = 'M' and Age between 25 and 36

      

    3. DMX query for next

        For chosen product Ri give me the list of groups of products with probabilities that customers that are using products from that group will buy product Ri.

    (here is the example: for product Ri list will have to look like this:

                  R2, R5, R6            0.85

                  R5, R7                  0.76

                  R1, R4, R5, R8     0.75

                  ...

       )

    I don't know does this report have sense since customers that own products from one group have different other attribute values (Gender: M, F; Age...) but I have request like that...

                 

    Thanks in advance

     

    Tuesday, October 16, 2007 7:47 AM

Answers

  • Hi Cristiano74!

     

    Here is my idea for solution for 1. and 2. question:

     

    SELECT /* FLATTENED */
      t.[Customer_ID],
      t.[Gender],
      t.[Age],
       PredictAssociation([Model].[CustomerProducts], 1, INCLUDE_STATISTICS)
    From
      [Model]
    PREDICTION JOIN
      SHAPE {
      OPENQUERY([DataSource],
        'SELECT
          [Customer_Id],
          [Gender],
          [Age]
          FROM
          [Customers]
        WHERE /* add where clause for filtering just customers that have ordered products R1, ..., Rn */
        AND Gender >= 25 AND Gender <= 36  

        ORDER BY
          [Customer_Id]')}
      APPEND
      ({OPENQUERY([DataSource],
        'SELECT
          [Product_ID],
          [Customer_Id]
        FROM
          [dbo].[CustomerProducts]
        ORDER BY
          [Customer_Id]')}
        RELATE
          [Customer_Id] TO [Customer_Id])
        AS
          [CustomerProducts] AS t
    ON
      [Model].[Gender] = t.[Gender] AND
      [Model].[Age] = t.[Age] AND
      [Model].[CustomerProducts].[Product_ID] = t.[CustomerProducts].[Product_ID]

     

     

     

    Monday, October 29, 2007 9:14 AM

All replies

  • Hi MO5,
     I'm a newbie of DM but if you send me a sample database of your data, I'll try to solve just your question1.

    Cristiano
    Sunday, October 21, 2007 10:14 AM
  • Hi Christiano74!

     

    Sorry, I didn't answer you earlier, but I think I solved my problem (found the solution).

     

    Thanks anyway!

     

    Friday, October 26, 2007 7:52 AM
  • Hi, maybe your solution is interesting for me and the forum.
    Could you post the highline of your solution?

    Thanks
    Friday, October 26, 2007 8:09 AM
  • Hi Cristiano74!

     

    Here is my idea for solution for 1. and 2. question:

     

    SELECT /* FLATTENED */
      t.[Customer_ID],
      t.[Gender],
      t.[Age],
       PredictAssociation([Model].[CustomerProducts], 1, INCLUDE_STATISTICS)
    From
      [Model]
    PREDICTION JOIN
      SHAPE {
      OPENQUERY([DataSource],
        'SELECT
          [Customer_Id],
          [Gender],
          [Age]
          FROM
          [Customers]
        WHERE /* add where clause for filtering just customers that have ordered products R1, ..., Rn */
        AND Gender >= 25 AND Gender <= 36  

        ORDER BY
          [Customer_Id]')}
      APPEND
      ({OPENQUERY([DataSource],
        'SELECT
          [Product_ID],
          [Customer_Id]
        FROM
          [dbo].[CustomerProducts]
        ORDER BY
          [Customer_Id]')}
        RELATE
          [Customer_Id] TO [Customer_Id])
        AS
          [CustomerProducts] AS t
    ON
      [Model].[Gender] = t.[Gender] AND
      [Model].[Age] = t.[Age] AND
      [Model].[CustomerProducts].[Product_ID] = t.[CustomerProducts].[Product_ID]

     

     

     

    Monday, October 29, 2007 9:14 AM
  • Thanks for your post!
    Thursday, November 01, 2007 8:31 PM