none
UPDATE table with values from another table

    Question

  • I have this one big table containing 26 mil records that I have to update as follows:

     

    1.       Table 1 (tblA) – 26 mil records

    a.       Column LNKana – stores last names as nvarchar in Japanese Katakana

    b.       Column FNKana – stores first names as nvarchar in Japanese Katakana

    c.       Column FName – needs to be updated with English equivalents of first names in Katakana

    d.       Column LName – needs to be updated with English equivalents of last

    names in Katakana

     

    Two nonclustered indexes: ix_LNKana si ix_FNKana

     

    2.       Table 2 (tblB) – contains 60,000 records

    a.       Column FNKana – stores first names as nvarchar in Jpanaese Katakana (unique values only)

    b.       Column FName – stores the English first name equivalents (non unique)

     

    One index: ix_FNKana – clustered

     

    3.       Table 3 (tblC) – contains 69,000 records

    a.       Column LNKana – stores last names as nvarchar in Jpanaese Katakana (unique values only)

    b.       Column LName – stores the English last name equivalents (non unique)

     

    One index: ix_LNKana – clustered

     

     

    The task here is to update columns FName and LName in Table 1 with FName and LName values from Table 2 and Table 3 respectively.

     

    I wrote these two update statements:

               

                1. UPDATE tblA SET tblA.LName =

                            (SELECT tblC.LName FROM tblC WHERE tblC.LNKana = tblA.LNKana)

     

                WHERE EXISTS

                            (SELECT tblC.LName FROM tblC WHERE tblC.LNKana = tblA.LNKana)

     

    2. UPDATE tblA SET tblA.FName =

                            (SELECT tblB.FName FROM tblB WHERE tblB.FNKana = tblA.FNKana)

     

                WHERE EXISTS

                            (SELECT tblB.FName FROM tblB WHERE tblB.FNKana = tblA.FNKana)

                           

    Statement #1 worked just fine and finished the update in 25 min.  The execution plan showed the usage of indexes in both tables.

     

    However, statement #2 seams to take too long (> 7 hours and still running).  The execution plan shows that indexes are used, but when I run it, it takes for ever.  The activity log shows that the update process is in “sleep” mode.

     

    Is there anything I can do to fix this?  Is there anything wrong with the second update statement?  I don’t have any explanation of why the first update worked just fine and this second one has died.  I tried to stop it and now it shows that a rollback is in progress.

     

    I would really appreciate any suggestion. 

    Sunday, August 06, 2006 8:57 PM

Answers

  • try to use inner join instead

    update    a
        set    LName    = c.LName
    from    tblA a inner join tblC c
    on    a.LNKana    = c.LNKana

    update    a
        set    FName    = b.FName
    from    tblA a inner join tblB b
    on    a.LNKana    = b.LNKana

    Monday, August 07, 2006 4:28 AM
  • My guess is that you are running into hardware locks with the logging of your statement.  ~ how many rows will be eventually affected.  If it is a really large amount, then this could easily be the issue.  This is especially true if you have logs, indexes, tables, etc on a single array.

    Have you run perfmon to see what kind of pressure is being applied to the different subsystems?  It could be that you are beating the heck out of memory, disk, cpu, or something eles.

    Try applying a range to your query in some manner to just update a few rows and keep incrementing until you come to the hardware sweet spot where it has to start paging.  It might be that 100000 rows goes really quickly, or it also might be that I am wrong completely :)

    Wednesday, August 09, 2006 12:57 AM
  • You can try to create a temp table with the exact structure as tblA, with no index or constraint on it. Then insert your resultset into this temp table by using DTS data load task. It may be faster. Finally, you recreate the indexes on this temp table and then use it.

    Sample:

    insert into temp table

    Select A.LNKana, A.FNKana, B.FName, C.LName

    from tblA A with (nolock)

    left outer join tblB B with (nolock)

    on A.FNKana = B.FNKana

    left outer join tblC C with (nolock)

    on A.LNKana = C.LNKana

     

     

    Wednesday, August 09, 2006 2:46 AM

All replies

  • try to use inner join instead

    update    a
        set    LName    = c.LName
    from    tblA a inner join tblC c
    on    a.LNKana    = c.LNKana

    update    a
        set    FName    = b.FName
    from    tblA a inner join tblB b
    on    a.LNKana    = b.LNKana

    Monday, August 07, 2006 4:28 AM
  • What is the Wait Type of your process?

    Is there any other process Blocking it?

    Monday, August 07, 2006 5:37 PM
  • I changed it to the above.  It run into the same problem.  It is waiting in sleep mode.
    Monday, August 07, 2006 10:47 PM
  • it says PAGEIOLATCH_SH and I have a resource code 20:1:2444203
    Monday, August 07, 2006 10:48 PM
  • My guess is that you are running into hardware locks with the logging of your statement.  ~ how many rows will be eventually affected.  If it is a really large amount, then this could easily be the issue.  This is especially true if you have logs, indexes, tables, etc on a single array.

    Have you run perfmon to see what kind of pressure is being applied to the different subsystems?  It could be that you are beating the heck out of memory, disk, cpu, or something eles.

    Try applying a range to your query in some manner to just update a few rows and keep incrementing until you come to the hardware sweet spot where it has to start paging.  It might be that 100000 rows goes really quickly, or it also might be that I am wrong completely :)

    Wednesday, August 09, 2006 12:57 AM
  • Thanks very much for the suggestions.  All counters seams to indicate a problem with the disk system on the server.  Anyways, last night after reading some postings from other forums, I got at the same conclusion as you:  that I should process the table in smaller splits using SET ROWCOUNT and a WHERE ... IS NULL condition, something like that.  I hope it will work.

    Wednesday, August 09, 2006 2:36 AM
  • You can try to create a temp table with the exact structure as tblA, with no index or constraint on it. Then insert your resultset into this temp table by using DTS data load task. It may be faster. Finally, you recreate the indexes on this temp table and then use it.

    Sample:

    insert into temp table

    Select A.LNKana, A.FNKana, B.FName, C.LName

    from tblA A with (nolock)

    left outer join tblB B with (nolock)

    on A.FNKana = B.FNKana

    left outer join tblC C with (nolock)

    on A.LNKana = C.LNKana

     

     

    Wednesday, August 09, 2006 2:46 AM
  • I tested your solution too, and worked well.  However, it wasn’t fast enough.  Based on Louis's suggestions, I build a stored procedure using “SET ROWCOUNT 1000000” and added a “WHERE tblA.FName IS NULL” condition. I then put the EXECUTE in a WHILE loop and finished the task in less than an hour.

    Thanks all for their input.  It is greatly appreciated. 

    Thursday, August 10, 2006 2:12 AM
  • How can I update a table from info from another table?

    What I need to do is take the info from a field in Table A and insert that ifno in a field with the same name in Table B

    My tables are ProductVariant and the field is DisplayOrder

    I Need the DisplayOrder info from ProductVariant to be included in the DisplayOrder fields of the ProductCategory Table

    I have been using SET to update info when the data is is the same table, but cannot figure out how to do it when the data needs to come from a different table.

    Thanks

    Terry

    Monday, November 27, 2006 7:13 PM
  • I had a similar issue and your post helped me out, thanks!
    Thursday, July 28, 2011 4:14 PM