locked
Update statement using function RRS feed

  • Question

  • Environment: Win7 and SQL server 2008 R2
    Tools: SQL management tool 2008 R2

    Problem: I have been trying to update id numbers in the staging table. T-SQL statement updates all id number in the staging table, but what if the we have multiple records for the same person who we are assigning the ID to his/her record. It should generate one pn_id per ssn regadless if there are several records for the same individual. for example, my code assigns the following id: 

    Student table (ID, ssn, name, subject, grade, year)
    (288258466,192160792 , Sally Johnson, Math, A, 2014 )
    (176268917, 192160792, Sally Johnson, Spanish, B+, 2014 )
    (900303787, 770616858, Jessica Simpson, Spanish, B, 2014 )
    (704099635, 770616858, Jessica Simpson, IT, A, 2014 )
    (704099640, 444718562, Douglas Mike, IT, A, 2014 )
    (288258450, 31883459, Susan Jesson, IT, A, 2014 )
    (176268960, 260518681, Veronica Floris, IT, A, 2014 )

    The expected results should be as the following in the staging table
    Student table (ID, ssn, name, subject, grade, year)
    (288258466,192160792 , Sally Johnson, Math, A, 2014 )
    (288258466, 192160792, Sally Johnson, Spanish, B+, 2014 )
    (900303787, 770616858, Jessica Simpson, Spanish, B, 2014 )
    (900303787, 770616858, Jessica Simpson, IT, A, 2014 )
    (704099640, 444718562, Douglas Mike, IT, A, 2014 )
    (288258450, 31883459, Susan Jesson, IT, A, 2014 )
    (176268960, 260518681, Veronica Floris, IT, A, 2014 )

    My code:

    UPDATE a
    SET pn_id =  (Select dbo.gen_id())
    from  [dbo].[Staging_Table] a 
    where SSN in (Select SSN from [dbo].[staging_Table]
    group by SSN having count(SSN) > 1)
    GO

    I also tried the following code but no success

    ;with cte
    As
    (
    Select	* , ROW_NUMBER() Over(Partition BY ssn Order by ssn)  As MyCount
    from dbo.Staging_Table
    )
    
    Select * into #a from cte where MyCount=1
    
    
    UPDATE a
    SET pn_id =  (Select dbo.fn_gen_id())
    from  staging_table a 
    Full  join #a b on a.ssn = b.ssn
    
    Drop table #a 
    please help.


    • Edited by Kalman Toth Thursday, March 20, 2014 10:11 PM spelling
    Thursday, March 20, 2014 12:31 AM

Answers

  • if its already generating the id values you just need to do this to make id value same for all records with same ssn

    UPDATE t
    SET ID = MaxID
    FROM (SELECT MAX(ID) OVER (PARTITION BY ssn) AS MaxID,ID
    FROM TableName
    )t
    WHERE MaxID <> ID


    Please Mark This As Answer if it helps to solve the issue Visakh ---------------------------- http://visakhm.blogspot.com/ https://www.facebook.com/VmBlogs

    • Marked as answer by Sandra VO Thursday, March 20, 2014 4:00 PM
    Thursday, March 20, 2014 3:53 PM
  • Try as below:

    ;with cte
    As
    (
    Select	* , ROW_NUMBER() Over(Partition BY ssn Order by pn_id desc)  As MyCount
    from dbo.Staging_Table
    )update A set  A.pn_id = B.pn_id
    From cte A
    Inner join cte B on A.ssn =B.ssn and A.pn_id<>B.pn_id and a.MyCount>1

    • Marked as answer by Sandra VO Thursday, March 20, 2014 3:58 PM
    Thursday, March 20, 2014 12:59 AM

All replies

  • ;With cteUniqueSSN As
    (Select Distinct SSN
    From dbo.staging_Table),
    cteAssignID As
    (Select c.SSN, dbo.gen_id() As NewAssignedID
    From cteUniqueSSN c}
    Update a
    Set pn_id = c.NewAssignedID
    From staging_Table a
    Inner Join cteAssignID c On a.SSN = c.SSN;

    The idea is to call the function only once for each unique SSN, then apply that result to all rows with that SSN.

    Tom

    • Proposed as answer by Naomi N Thursday, March 20, 2014 2:40 AM
    Thursday, March 20, 2014 12:54 AM
  • Try as below:

    ;with cte
    As
    (
    Select	* , ROW_NUMBER() Over(Partition BY ssn Order by pn_id desc)  As MyCount
    from dbo.Staging_Table
    )update A set  A.pn_id = B.pn_id
    From cte A
    Inner join cte B on A.ssn =B.ssn and A.pn_id<>B.pn_id and a.MyCount>1

    • Marked as answer by Sandra VO Thursday, March 20, 2014 3:58 PM
    Thursday, March 20, 2014 12:59 AM
  • update z
    set pn_id = b.NewId
    from [dbo].[Staging_Table] z, 
    (select a.SSN,dbo.gen_id() NewId from
    (select distinct SSN from [dbo].[Staging_Table]) a) b

    where z.SSN = b.SSN

    Thursday, March 20, 2014 1:10 AM
  • Hi Tom: Thank you so much for the help. I tried your code technique and I didn't work :/
    It did generate two unique pn_ids for the same person

    367623613    210875011
    461348220    210875011

    Thursday, March 20, 2014 3:25 PM
  • Hi  Powershell Beginner - Thanks but the code didn't solve the issue
    Thursday, March 20, 2014 3:51 PM
  • if its already generating the id values you just need to do this to make id value same for all records with same ssn

    UPDATE t
    SET ID = MaxID
    FROM (SELECT MAX(ID) OVER (PARTITION BY ssn) AS MaxID,ID
    FROM TableName
    )t
    WHERE MaxID <> ID


    Please Mark This As Answer if it helps to solve the issue Visakh ---------------------------- http://visakhm.blogspot.com/ https://www.facebook.com/VmBlogs

    • Marked as answer by Sandra VO Thursday, March 20, 2014 4:00 PM
    Thursday, March 20, 2014 3:53 PM
  • Change that code from UPDATE into SELECT and verify the output first.

    I don't see the problem in that logic, but it always better to first run select statement instead of the actual update.


    For every expert, there is an equal and opposite expert. - Becker's Law


    My blog


    My TechNet articles

    Thursday, March 20, 2014 3:54 PM
  • Hi Naomi - are you referring tothe following code segment where the update has been changed to Select

    With cteUniqueSSN As
    (Select Distinct SSN
    From dbo.staging_Table),
    cteAssignID As
    (Select c.SSN, dbo.gen_id() As NewAssignedID
    From cteUniqueSSN c}
    
    SELECT a
    Set pn_id = c.NewAssignedID
    From staging_Table a
    Inner Join cteAssignID c On a.SSN = c.SSN;
    Right? I am just little bid confused on how select statement can be replaced update statement. will you be kind to show me


    • Edited by Sandra VO Thursday, March 20, 2014 4:40 PM updated my post
    Thursday, March 20, 2014 4:38 PM
  • Right, exactly:

    With cteUniqueSSN As
    (Select Distinct SSN
    From dbo.staging_Table),
    cteAssignID As
    (Select c.SSN, dbo.gen_id() As NewAssignedID
    From cteUniqueSSN c}
    
    SELECT a.*, c.NewAssignedID
    
    From staging_Table a
    Inner Join cteAssignID c On a.SSN = c.SSN
    ORDER BY a.SSN;


    For every expert, there is an equal and opposite expert. - Becker's Law


    My blog


    My TechNet articles

    Thursday, March 20, 2014 4:59 PM
  • Hi Naomi - I just tested out the code for you and, unfortunately, it did not return the expected results. Let me know if you want to try something out.
    Thursday, March 20, 2014 5:13 PM
  • Hmm, strange - are you saying it returned different new IDs for the same people (with the same SSN)?

    I do not see how this is possible - the code seems very simple and logical to me.


    For every expert, there is an equal and opposite expert. - Becker's Law


    My blog


    My TechNet articles

    Thursday, March 20, 2014 5:15 PM
  • I ran some tests and Sandra is correct, you don't always get the same value for all occurrences of the same SSN.

    After thinking about this, I think I understand why this happens.

    With functions, SQL is allowed to assume that multiple calls to a function in a single command with the same parameter values always return the same result.  That's why functions cannot use side-effecting functions and is not allowed to make changes to any table except table variables declared in the function.

    And SQL is allowed to rearrange the processing of a command in any manner it believes is most efficient as long as it returns the same result.  So even though the query I gave says first get a set of distinct SSN's and then calls the function only once for each SSN, it is valid for SQL to instead call the function once for each row in the original table since this function has no parameters and is therefore should return the same value on each call.  But since the function being called apparently (we've not seen the code of the function) returns different values on different calls, the same SSN could be assigned different values, but SQL would still be working correctly.

    Note that it would also be valid for SQL to notice that this function has no parameters and therefore should return the same value every time it is called within the same command and so just call the function once per execution of the command.  This would mean that every SSN would get the same ID.  As far as I can determine currently SQL does not do that but a different version in the future with a different optimizer might.

    Tom

    Thursday, March 20, 2014 6:47 PM
  • I was thinking that we may need to run the result of cte into a temp table first. This way we will guarantee (?) the desired result.

    Can you try that idea as well?


    For every expert, there is an equal and opposite expert. - Becker's Law


    My blog


    My TechNet articles

    Thursday, March 20, 2014 6:59 PM
  • Hi Naomi- yes, it returned different new IDs for the same people (with the same SSN).

    401329254    209875009
    171807211    209875009

    • Edited by Sandra VO Thursday, March 20, 2014 8:40 PM
    Thursday, March 20, 2014 8:35 PM
  • Check Tom's explanation of this phenomena. I was suspecting something like this, I'm curious if you can run the first two CTEs into temp table and then use that temp table and verify the results.


    For every expert, there is an equal and opposite expert. - Becker's Law


    My blog


    My TechNet articles

    Thursday, March 20, 2014 8:44 PM