locked
Generating delete and update dynamically RRS feed

  • Question

  • Hi everybody,

    I'm working of translating procedural Visual FoxPro code into a stored procedure. The procedure is used to de-dup Guest data.

    Here is the problem portion of the code I'm struggling to figure out a solution:

    else -- primary key case
    	   
    	   set @sql = @sql + '
    	   IF EXISTS (SELECT 1 FROM dbo.' + quotename(@table_name) + ' WHERE guest_no = @originalGuestNo)
    	   
    	       BEGIN
    		      IF NOT EXISTS (SELECT 1 FROM dbo.' + quotename(@table_name) + ' WHERE guest_no = @newGuestNo) -- simple case, only duplicate guest exists
    			     UPDATE dbo.' + quotename(@table_name) + ' SET guest_no = @newGuestNo 
    				 WHERE guest_no = @originalGuestNo;
                  ELSE -- both guests exist in the table 
    			    IF (SELECT ISNULL(date_time, ''19000101'') FROM dbo.' + quotename(@table_name) + ' WHERE guest_no = @newGuestNo) >=
    				   (SELECT ISNULL(date_time, ''19000101'') FROM dbo.' + quotename(@table_name) + ' WHERE guest_no = @originalGuestNo)
    				      -- New guest is newer - just delete the duplicate
    					  DELETE FROM FROM dbo.' + quotename(@table_name) + ' WHERE guest_no = @originalGuestNo;
                    ELSE -- duplicate is newer
    				     
    		   END'
    	      
    
    	 end

    So, I first obtain a list of tables that contain guest_no or parent_no columns. Then I process these tables in a loop. If guest_no is not a primary key in that table, then the code is simple - just update original guest number with the new guest number.

    If the guest_no is a primary key of the table, then the following logic applies:

    1. We only have new guest number and don't have duplicate (original guest number) - this is the simplest case when we don't need to do anything

    2. We have only original guest no and don't have new guest no - in this case we just need to update

    3. We have both of them and the new guest information is newer (the date_time column is more recent) - just delete the original

    4. This is the case I am struggling with - when the original guest info is newer and has to be moved over to the new guest. In this case we would want to update the original guest_no with the new guest_no and delete the current new guest no. The guest_no is a primary key in the table so I can not do that.

    Since I'm writing the code dynamically I don't see a simple way out of my problem. Do you have ideas?

    Thanks in advance.

     


    For every expert, there is an equal and opposite expert. - Becker's Law


    My blog


    My TechNet articles



    • Edited by Naomi N Monday, January 4, 2016 7:08 PM
    Monday, January 4, 2016 7:02 PM

Answers

  • Hi Naomi N,

    Since there are no foreign key concerns(could disable first), for the last ELSE statement block in your script, it seems that you may delete the new guest_no first, then update the original guest_no to avoid primary key conflict, use OUTPUT Clause if necessary. Meanwhile, @updateCmd could be removed.

    If you have any feedback on our support, you can click here.


    Sam Zha
    TechNet Community Support

    Wednesday, January 6, 2016 5:39 AM

All replies

  • Why can't you update the column, when it's part of the primary key?

    E.g.

    DECLARE @new INT = 20 
    	, @original INT = 10;
    
    DECLARE @table TABLE
    	(
    		ID INT PRIMARY KEY,
    		Payload INT
    	);
    
    INSERT INTO @table
    VALUES	( 10, '10' );
    
    UPDATE	@table
    SET		ID = @new
    WHERE	ID = @original;
    
    SELECT	*
    FROM	@table;


    Monday, January 4, 2016 8:05 PM
  • Because I have both keys in the table - new and original. E.g.

    DECLARE @new INT = 20 
    	, @original INT = 10;
    
    DECLARE @table TABLE
    	(
    		ID INT PRIMARY KEY,
    		Payload INT
    	);
    
    INSERT INTO @table
    VALUES	( 10, 10 ), (20, 15);
    
    UPDATE	@table
    SET		ID = @new
    WHERE	ID = @original;
    
    SELECT	*
    FROM	@table;

    In any case, I figured the solution already. Here is my current procedure although I still need some fixes:

    while @loop <=@tablesCount
    begin
         
      select @table_name = table_name from @TablesToProcess where id = @loop;
      
    
      if upper(@table_name) not IN ('GUESTS','ACCESS','ADDRESS','ADDLINK','DAILY_SALE_HDR','DAILY_TRANSACT')
        begin
    
    	 set @sql = @sql + 'RAISERROR (''Processing table ' + quotename(@table_name) + ' iteration #' + cast(@loop as varchar(10)) + ' out of ' + 
    	 cast(@tablesCount as varchar(10)) + '..''  , 0, 1) with nowait;
    	 '
    
        IF not exists(SELECT 1 FROM INFORMATION_SCHEMA.TABLE_CONSTRAINTS TC
    
    INNER JOIN INFORMATION_SCHEMA.CONSTRAINT_COLUMN_USAGE CU ON TC.TABLE_NAME = CU.TABLE_NAME
    
    AND TC.TABLE_SCHEMA = CU.TABLE_SCHEMA
    
    AND TC.CONSTRAINT_NAME = CU.CONSTRAINT_NAME
    
    WHERE TC.CONSTRAINT_TYPE ='PRIMARY KEY'  AND TC.TABLE_NAME = @table_Name
    AND CU.COLUMN_NAME = 'guest_no') -- exclude cases where guest_no is a PK for the table as this is more complex case
    
         begin
    
    	 if exists (select 1 from INFORMATION_SCHEMA.COLUMNS c WHERE c.Table_Name = @table_name and c.COLUMN_NAME = 'guest_no')	 
    	  SET @sql = @sql + ' UPDATE dbo.' + quotename(@table_name) + ' set guest_no = @newGuestNo ' + 
    	  case upper(@table_name) WHEN 'GST_PASS' then ', mod_sp = @salespoint, mod_op = @operator' 
    	                          WHEN 'GST_ACTV' then ', gfwdstatus = 1'
     	  else '' end + 
    	  
    	  '
    	  where guest_no = @originalGuestNo;
    	  '
    	 if exists (select 1 from INFORMATION_SCHEMA.COLUMNS c WHERE c.Table_Name = @table_name and c.COLUMN_NAME = 'parent_no')	 
    	     SET @sql = @sql + ' UPDATE dbo.' + quotename(@table_name) + ' set parent_no = @newGuestNo where parent_no = @originalGuestNo;
    		  '
    
          if upper(@table_name) IN ('TR_INFO', 'TRS_INFO')
    
          set @sql = @sql + ' UPDATE dbo.' + quotename(@table_name) +  ' set info_num = CAST(@newGuestNo as CHAR(17)) 
    	  where info_num = CAST(@originalGuestNo as CHAR(17)) AND info_type = 10
    	  ;'
    
    
    	 end
    	 else -- primary key, more complex case
    	 begin
    	     select @updateCmd = 'WITH cte AS (select * FROM dbo.' + QUOTENAME(@table_name) + ' WHERE guest_no = @originalGuestNo)
    	   UPDATE t  	   
    	   SET ' + stuff((select ', ' + quotename(c.COLUMN_NAME) + ' = cte.' + quotename(c.COLUMN_NAME)
    	   FROM INFORMATION_SCHEMA.COLUMNS c WHERE c.TABLE_NAME = @table_name 
    	   AND c.COLUMN_NAME NOT IN ('guest_no', 'last_mod') FOR XML PATH('')), 1, 2, '')  + '
    	   FROM dbo.' + QUOTENAME(@table_name) + ' t, cte
    	   WHERE t.guest_no = @newGuestNo;
    	   '
    	   print @updateCmd;
    
    	   set @sql = @sql + '
    	   IF EXISTS (SELECT 1 FROM dbo.' + quotename(@table_name) + ' WHERE guest_no = @originalGuestNo)
    	   
    	       BEGIN
    		      IF NOT EXISTS (SELECT 1 FROM dbo.' + quotename(@table_name) + ' WHERE guest_no = @newGuestNo) -- simple case, only duplicate guest exists
    			     UPDATE dbo.' + quotename(@table_name) + ' SET guest_no = @newGuestNo 
    				 WHERE guest_no = @originalGuestNo;
                  ELSE -- both guests exist in the table 
    			    IF (SELECT ISNULL(date_time, ''19000101'') FROM dbo.' + quotename(@table_name) + ' WHERE guest_no = @newGuestNo) >=
    				   (SELECT ISNULL(date_time, ''19000101'') FROM dbo.' + quotename(@table_name) + ' WHERE guest_no = @originalGuestNo)
    				      -- New guest is newer - just delete the duplicate
    					  DELETE FROM dbo.' + quotename(@table_name) + ' WHERE guest_no = @originalGuestNo;
                    ELSE -- duplicate is newer
    				     ' + @updateCmd + 
    					 '
    					  DELETE FROM dbo.' + quotename(@table_name) + ' WHERE guest_no = @originalGuestNo;
    		   END
    		   '      
    
    
    	 end
    	
    	 
    	 end
    
    	 set @loop = @loop + 1;
    end
    
    set @sql = 'BEGIN TRANSACTION 
    ' + @sql + '
    COMMIT TRANSACTION'
    
    if @debug = 1
       select @SQL as SQL;
    
    --execute sp_executeSQL @SQL, N'@newGuestNo decimal(17,0), @originalGuestNo decimal(17,0), @salespoint CHAR(6), @operator CHAR(6)', 
    -- @newGuestNo = @nNewGuest, @originalGuestNo = @nOriginalGuest, @salespoint = @salespoint, @operator = @operator;
    
    end
    
    go
    
    execute dbo.siriussp_DedupeGuest @nOriginalGuest = 90000001, @nNewGuest = 91000001, @salespoint = 'RENTAL', @operator = 'ADMIN';

    In other words, I just generate the UPDATE command for this case dynamically and then delete the original.


    For every expert, there is an equal and opposite expert. - Becker's Law


    My blog


    My TechNet articles


    • Edited by Naomi N Monday, January 4, 2016 8:51 PM
    Monday, January 4, 2016 8:17 PM
  • When I see this, I can't escape from asking: why are there multiple tables in the first place? In with different PKs?

    You may understand that code today, but will you or someone else understand it three years later? Just think how simpler this code would be if it was a single table!

    Monday, January 4, 2016 11:10 PM
  • There are many tables in the database that have guest_no as a foreign key. If we're de-duping guests (e.g. merging data from one guest to another), we would need to switch all related data to the guest that we're going to keep. So, for many tables where guest_no is the foreign key we just need to run the UPDATE statement.

    However, there are few tables where guest_no is a primary key (just a few besides guests table)  and also there are some tables where guest_no put into another column as a reference. These tables are special cases.

    There is a complex procedural code that does the de-duping (one table at a time - if something goes wrong we may end up with partial data deletions).

    I'm trying to move that logic into a stored procedure. I'm still not done, but I've decided to generate one script for all changes and run it in transaction. The clients should be aware to run this process in off-time. 


    For every expert, there is an equal and opposite expert. - Becker's Law


    My blog


    My TechNet articles

    Monday, January 4, 2016 11:39 PM
  • i dont agree with you 

    because in updating + deleting time have a great prob in dynamically query every time 

    i think you query need with stored proc and without dinamycally 

    and 

    we can't said about your scripts because dont was see your table struture and in finally what you need

    please add hier model your tables or simple fragment and write clearly what you need 


    Tuesday, January 5, 2016 6:50 AM
  • I tried to explain the business need. Basically, I want to de-dupe the guests table. Guests table has a primary key guest_no and this key is used as a foreign key in many tables in the system. Also, the guest_no column may be called parent_no (when it's a parent guest - say, parent with children) and also in a few cases the guest_no column can be called differently and even saved with different type (say, it's decimal(17,0) in the system but in a few tables it is saved as char(17)) and in a few it's saved in XML like structure (in text type).

    So, when we merge two guests into 1, we need to first move all related tables info to the guest we're going to keep - this is in tables where guest_no (or parent_no) is a foreign key. This is relatively simple case as we only need to run update command.

    Also, as I mentioned, in a few tables the guest_no is a primary key (being at the same time a foreign key to guests table). These cases are complicated as we can not just run an update. In these few cases the logic I posted in my very first message of this thread applies.

    In any case, I already coded all that logic and now just need to add some extra complex things such as logging the changes in the guest activity table and similar things.


    For every expert, there is an equal and opposite expert. - Becker's Law


    My blog


    My TechNet articles

    Tuesday, January 5, 2016 3:24 PM
  • The only thing I can see wrong, Naomi, is the hardcoded table list:

    upper(@table_name) not IN ('GUESTS','ACCESS','ADDRESS','ADDLINK','DAILY_SALE_HDR','DAILY_TRANSACT')

    and a severe lack of comments :)

    I'm thinking the table list is probably the tables which have the guest_no as a PK? Would it not be better to derive this from the sys tables instead? I know it's not likely that a new table with guest_no as a PK will appear, but I'd want to be ready for it.


    Don't forget to mark helpful posts, and answers. It helps others to find relevant posts to the same question. Enjoyed my post? Hit the up arrow (left)
    Really enjoyed it? See my profile!
    My Tech Net Articles.

    Tuesday, January 5, 2016 3:31 PM
  • No, these particular tables need some special consideration that's why they are excluded from the normal processing logic.

    Guests table is the one with PK and needs to be processed last. Daily sale and the other one are not important as they are re-created by the reporting, so we don't have to de-dupe those.

    Address tables need complex logic. Actually, in our new system these tables are being phased out and there are couple of different tables in place of them. I need to figure out logic for these new tables as well.


    For every expert, there is an equal and opposite expert. - Becker's Law


    My blog


    My TechNet articles

    Tuesday, January 5, 2016 3:36 PM
  • Is this guest_no number a surrogate or natural key?

    When it's a surrogate key, then I would simply INSERT a new (consolidated) row in each table. Then you never need an UPDATE. Only a DELETE of the old rows.

    When they are natural keys, I would do the same, but DELETE the old rows and UPDATE the new row to the correct number in the end.

    Tuesday, January 5, 2016 3:50 PM
  • How would you insert that consolidated row?

    Do I understand you correctly that you suggest to start from guests, generate a new key, then update all related tables old guest numbers (both of them) with that new number? How would you suggest to deal with a few tables where the guest no is a PK but at the same time an FK to guests table? 

    I think your proposal will increase number of changes that will need to be applied.


    For every expert, there is an equal and opposite expert. - Becker's Law


    My blog


    My TechNet articles

    Tuesday, January 5, 2016 3:59 PM
  • "Do I understand you correctly that you suggest to start from guests, generate a new key, then update all related tables old guest numbers (both of them) with that new number? "

    Yup.

    "How would you suggest to deal with a few tables where the guest no is a PK but at the same time an FK to guests table?" 

    Create a new consolidated row with the new ID. Then updated the depended tables to the new ID. 

    "I think your proposal will increase number of changes that will need to be applied."

    Yes, this is true, but imho the logic of the procedure gets simpler.

    And as Don Knuth said: Make it run, make it right, make it fast. (premature optimization is the root of all evil).

    Tuesday, January 5, 2016 4:49 PM
  • I think the logic is not going to be simpler in this case. In any case, thanks for the suggestion.

    For every expert, there is an equal and opposite expert. - Becker's Law


    My blog


    My TechNet articles

    Tuesday, January 5, 2016 4:52 PM
  • There are many tables in the database that have guest_no as a foreign key. If we're de-duping guests (e.g. merging data from one guest to another), we would need to switch all related data to the guest that we're going to keep. So, for many tables where guest_no is the foreign key we just need to run the UPDATE statement.

    OK, I see. Cascading foreign keys would reduce the coding, but when not all columns do not have the same data type, you need to clean up that first. And I would be suprised if you were able to get away with it all the way, as there are so many limitations when you can use cascading keys.


    Erland Sommarskog, SQL Server MVP, esquel@sommarskog.se

    Tuesday, January 5, 2016 10:45 PM
  • In our system we're not using cascading foreign keys. I think in most cases we're even using NOCHECK for constraints as we're using 0 instead of NULL when we don't want to have foreign key reference. This is just historical issue in our database and yes, it is an invitation for problems. But that's just how it is and not supposed to change any time soon.


    For every expert, there is an equal and opposite expert. - Becker's Law


    My blog


    My TechNet articles

    Tuesday, January 5, 2016 11:23 PM
  • Hi Naomi N,

    Since there are no foreign key concerns(could disable first), for the last ELSE statement block in your script, it seems that you may delete the new guest_no first, then update the original guest_no to avoid primary key conflict, use OUTPUT Clause if necessary. Meanwhile, @updateCmd could be removed.

    If you have any feedback on our support, you can click here.


    Sam Zha
    TechNet Community Support

    Wednesday, January 6, 2016 5:39 AM
  • I think in most cases we're even using NOCHECK for constraints as we're using 0 instead of NULL when we don't want to have foreign key reference.

    And you confess that in public? You are certainly a brave person. :-)

    Wednesday, January 6, 2016 11:26 AM
  • Hi Sam,

    This sounds like a good alternative, I'll adjust the code as you suggest. 


    For every expert, there is an equal and opposite expert. - Becker's Law


    My blog


    My TechNet articles

    Wednesday, January 6, 2016 12:48 PM