locked
Update Query doesn't complete RRS feed

  • Question

  • Any help is appreciated. I have put in comments as to what indexes I have. I am on SQL server 2005.

     

    With

     

    cte as

     

    (select distinct p.PID id ,p.HouseHoldID, Q.[Living Unit ID] , q.[Company Policy Number] pn

     

    from DM_Install_HH_History.dbo.vw_All_HH_Persons p -- has approx. 329 mill

     

    join DM_Install_HH_History.dbo.vw_All_Household_Details h -- has approx. 179 mill

     

    on p.hhdetails_id = h.id -- h.id is PK and h.livingUnitId has index, hhdetails_id is the 1st col in a composite index

     

    join CFO_EXP_Matches Q -- has approx. 13 mill. Unique clustered index on [Company Policy Number], non-unique, non-clustered index on [Living Unit ID]

     

    on Q.[Living Unit ID] = h.livingUnitId and Q.[Last Name Counter] = p.HouseHoldID

     

    where soundex(p.firstname)= soundex(Q.[First Name])

     

    and Q.matchflag is NULL

     

    )

    update

     

    CFO_EXP_Matches

    set

     

    HH_Persons_ID = cte.ID , matchFlag = 'FS'

    from

     

    CFO_EXP_Matches Q

    join

     

    cte on cte.[Living Unit ID] = Q.[Living Unit ID] and Q.[Last Name Counter] = cte.HouseHoldID

    and

     

    q.[Company Policy Number] = cte.pn

    Where

     

    Q.matchflag is NULL

    ;

    Wednesday, November 3, 2010 3:18 PM

Answers

  • Very difficult to give some advice without information about table / view schemas, including indexes, cardinality of the relationships, and also the execution plan.

    Here are a couple of advices.

    > JOIN
    > CFO_EXP_Matches Q
    > ON Q.[Living Unit ID] = H.livingUnitId
    > AND Q.[Last Name Counter] = P.HouseHoldID

    Here you need index on H(livingUnitId), P(HouseHoldID), and Q([Living Unit ID], [Last Name Counter]). See if having a composite index that includes also [Company Policy Number] helps.

    > OUNDEX(P.firstname) = SOUNDEX(Q.[First Name])

    This is definitely a performance killer. You could have a persisted computed column in each table and an index by this column too.

     


    AMB

    Some guidelines for posting questions...

    • Proposed as answer by Naomi N Wednesday, November 3, 2010 4:04 PM
    • Marked as answer by Kalman Toth Tuesday, November 9, 2010 8:59 PM
    Wednesday, November 3, 2010 3:41 PM

All replies

  • Can you do the query without using CTE?
    Wednesday, November 3, 2010 3:31 PM
  • Very difficult to give some advice without information about table / view schemas, including indexes, cardinality of the relationships, and also the execution plan.

    Here are a couple of advices.

    > JOIN
    > CFO_EXP_Matches Q
    > ON Q.[Living Unit ID] = H.livingUnitId
    > AND Q.[Last Name Counter] = P.HouseHoldID

    Here you need index on H(livingUnitId), P(HouseHoldID), and Q([Living Unit ID], [Last Name Counter]). See if having a composite index that includes also [Company Policy Number] helps.

    > OUNDEX(P.firstname) = SOUNDEX(Q.[First Name])

    This is definitely a performance killer. You could have a persisted computed column in each table and an index by this column too.

     


    AMB

    Some guidelines for posting questions...

    • Proposed as answer by Naomi N Wednesday, November 3, 2010 4:04 PM
    • Marked as answer by Kalman Toth Tuesday, November 9, 2010 8:59 PM
    Wednesday, November 3, 2010 3:41 PM
  • Thank you Hunchback.

    I will try the composite index. I already have an index on both the LivingUnitId columns. but not composite to include the other column. How do I attach an execution plan file? Can I generate the plan as a text format, then I can copy and paste it.

    The views are simple straightforward union between 2 tables. The relationship is one to many between vw_All_Household_Details and vw_All_HH_Persons . The CFO_EXP_Matches  is just an input dataset which keeps changing everytime a different dataset needs to be analyzed.

    I don't know what a persisted computed column is. I will read up on it. But will it be worth it if the input dataset keeps changing everytime I run these queries. This update statement is part of a sequence of 9 statements. Each statement uses different where condition where the soundex function is used. In fact I use, soundex and difference functions. Would this persisted computed column  be applicable in both cases? On which side of the equation do I need to create this?

     

    Wednesday, November 3, 2010 4:07 PM
  • Just wanted to post that the datatype in the input dataset wasn't correct. I was joining a varchar to bigint. This wasn't a big help with th ehuge number of rows being processes even with the indexes. once this was corrected, it wroked fine.

    Thanks,

    • Proposed as answer by Naomi N Wednesday, November 3, 2010 4:26 PM
    Wednesday, November 3, 2010 4:23 PM