locked
update show less number of rows than select RRS feed

  • Question

  • Hello Everyone,

    I have two tables, one is EDW_Master_Eligibility table and other is EDW_Master_Onsite2012  table. I am joining two tables to get MemberID from EDW_Master_Eligibility table. 

    The query was 

      select distinct a.MemberID,s.MemberID 
      FROM [SDC].[EDW_Master_Onsite2012] s
      join [ETL].[EDW_Master_Eligibility] a 
      on s.FirstName=a.First_Name and s.LastName=a.Last_Name and s.BirthDate=a.DOB 
        where   s.ClientID='10030' and a.ClientID='10030' 

    Message: 506 Rows Effected

    but when i am trying to update it ,using update command 

    update s
    set s.MemberID=a.MemberID
    FROM [SDC].[EDW_Master_Onsite2012] s
     join [ETL].[EDW_Master_Eligibility] a 
      on s.FirstName=a.First_Name and s.LastName=a.Last_Name and s.BirthDate=a.DOB 
        where   s.ClientID='10030' and a.ClientID='10030'

    Message: 499 Rows Effected

    I was wondering where was the other 7 rows..  Any idea about the  7 missing rows 

    • Changed type Kalman Toth Friday, November 8, 2013 4:56 PM
    • Changed type Kalman Toth Friday, November 8, 2013 4:57 PM
    Wednesday, November 6, 2013 5:20 PM

Answers

  • Let me make this a little simpler for you.  What are the rows returned by this query:

    select top (10) s.MemberID, count(distinct a.MemberID) 
       FROM [SDC].[EDW_Master_Onsite2012] s
       join [ETL].[EDW_Master_Eligibility] a 
       on s.FirstName=a.First_Name and s.LastName=a.Last_Name and s.BirthDate=a.DOB 
         where   s.ClientID='10030' and a.ClientID='10030' 
     group by s.MemberID
    order by count(distinct a.MemberID) desc

    What are the rows that are returned?

    Russel Loski, MCT, MCSE Data Platform/Business Intelligence. Twitter: @sqlmovers; blog: www.sqlmovers.com

    Wednesday, November 6, 2013 8:31 PM
  • For each of those seven records, there are two possible MemberIDs from the EDW_Master_Eligibility table.  SQL Server chooses one of those randomly.

    The Merge statement on the other hand will not update if there are two candidate rows to update from.


    Russel Loski, MCT, MCSE Data Platform/Business Intelligence. Twitter: @sqlmovers; blog: www.sqlmovers.com

    Wednesday, November 6, 2013 8:38 PM

All replies

  • My guess is that the number is changing because you're using a DISTINCT in your select, but selecting columns from two tables, whereas you're only updating a single table.

    Does the following query result in 499?

      select COUNT(s.MemberID)
      FROM [SDC].[EDW_Master_Onsite2012] s
      join [ETL].[EDW_Master_Eligibility] a 
      on s.FirstName=a.First_Name and s.LastName=a.Last_Name and s.BirthDate=a.DOB 
        where   s.ClientID='10030' and a.ClientID='10030' 



    Zach Stagers
    MCITP: Database Developer 2008
    Remember to Mark as Answer and Vote as Helpful

    Wednesday, November 6, 2013 5:45 PM
  • My guess is that there are multiple records in EDW_Master_Eligibility for a single record in EDW_Master_Onsite2012.  What will happen is that SQL Server will randomly select one record to use in the update.

    Try the following.  I expect that you will get several s.MemberIDs that are associated with different a.MemberIDs.

    select s.MemberID, count(distinct a.MemberID) 
       FROM [SDC].[EDW_Master_Onsite2012] s
       join [ETL].[EDW_Master_Eligibility] a 
       on s.FirstName=a.First_Name and s.LastName=a.Last_Name and s.BirthDate=a.DOB 
         where   s.ClientID='10030' and a.ClientID='10030' 
     group by s.MemberID
    order by count(distinct a.MemberID) desc


    Russel Loski, MCT, MCSE Data Platform/Business Intelligence. Twitter: @sqlmovers; blog: www.sqlmovers.com

    • Proposed as answer by Gert-Jan Strik Wednesday, November 6, 2013 6:24 PM
    Wednesday, November 6, 2013 6:11 PM
  • Hello Zach,

    If i remove the distinct and execute the query the result doesn't changed.

    Message: 506 Rows Effected

    Wednesday, November 6, 2013 6:49 PM
  • What about this merge syntax for your query (SQL Server 2008 or above)

    ;WITH mycte AS
      (SELECT * FROM  [ETL].[EDW_Master_Eligibility]  WHERE ClientID='10030')
      Merge [SDC].[EDW_Master_Onsite2012] AS  s
      Using  mycte AS  a
      ON s.FirstName=a.First_Name AND s.LastName=a.Last_Name AND s.BirthDate=a.DOB AND s.ClientID=a.ClientID
      
      WHEN Matched THEN
      UPDATE SET MemberID=a.MemberID;

    Wednesday, November 6, 2013 7:11 PM
  • Hello Li,

    Thanks for your reply. But the code doesn't works it gives the same result.

    Wednesday, November 6, 2013 8:11 PM
  • And what results did you get with my query?

    Russel Loski, MCT, MCSE Data Platform/Business Intelligence. Twitter: @sqlmovers; blog: www.sqlmovers.com

    Wednesday, November 6, 2013 8:13 PM
  • Hello Russ,

    It was the same result. 

    Message: 506 Rows Effected

    I don't know how to figure it out. 

    Wednesday, November 6, 2013 8:18 PM
  • But what were the first 7 records returned by my query?


    Russel Loski, MCT, MCSE Data Platform/Business Intelligence. Twitter: @sqlmovers; blog: www.sqlmovers.com

    Wednesday, November 6, 2013 8:20 PM
  • Let me make this a little simpler for you.  What are the rows returned by this query:

    select top (10) s.MemberID, count(distinct a.MemberID) 
       FROM [SDC].[EDW_Master_Onsite2012] s
       join [ETL].[EDW_Master_Eligibility] a 
       on s.FirstName=a.First_Name and s.LastName=a.Last_Name and s.BirthDate=a.DOB 
         where   s.ClientID='10030' and a.ClientID='10030' 
     group by s.MemberID
    order by count(distinct a.MemberID) desc

    What are the rows that are returned?

    Russel Loski, MCT, MCSE Data Platform/Business Intelligence. Twitter: @sqlmovers; blog: www.sqlmovers.com

    Wednesday, November 6, 2013 8:31 PM
  • Thanks for your reply. It shows that first 7 records are 2 i.e duplicates.
    Wednesday, November 6, 2013 8:35 PM
  • The problem is one of assumptions.  You assume that a row in MasterOnSite joins to  a (i.e, single) row in MasterEligibility.  This is obviously not the case.  Therefore, when you update one of the tables with values "pulled" from the other table, you are committing 1 logical error and assuming that your update statement will actually update the same number of rows as was selected.  It is obvious that at least one row in MasterOnsite joins to multiple rows in MasterEligibility.  Therefore, the number of rows in MasterOnsite updated will ALWAYS be less than the number of rows selected in your join.  Given this fact, we can now see your logic error.  In a situation I just described, the row that participlates in the 1:M join will only be updated once - but it is impossible to know which joined row is used as the source of the data for the update.  You further complicated things by using distinct in your first select query - which further demonstrates that you need to better understand your data. As a general rule, the use of DISTINCT is often an indication of a faulty schema or a faulty understanding of the schema/data - the latter is more common IME.
    Wednesday, November 6, 2013 8:35 PM
  • For each of those seven records, there are two possible MemberIDs from the EDW_Master_Eligibility table.  SQL Server chooses one of those randomly.

    The Merge statement on the other hand will not update if there are two candidate rows to update from.


    Russel Loski, MCT, MCSE Data Platform/Business Intelligence. Twitter: @sqlmovers; blog: www.sqlmovers.com

    Wednesday, November 6, 2013 8:38 PM
  • And this demonstrates the problem

    set nocount on;
    declare @master table (memberid int, name varchar(20), clientid int); 
    declare @elig table (memberid int, name varchar(20), clientid int); 
    insert @master(memberid, name, clientid) values (1, 'bob', 10030);
    insert @elig (memberid, name, clientid) values (2, 'bob', 10030);
    insert @elig (memberid, name, clientid) values (3, 'bob', 10030);
    insert @elig (memberid, name, clientid) values (4, 'bob', 10030);
    set nocount off;
    select distinct A.memberid, S.memberid
    FROM @master as S 
    inner join @elig as A 
    on S.name = A.name
    where S.clientid = 10030 and A.clientid = 10030;
    update S 
    set memberid = A.memberid
    FROM @master as S 
    inner join @elig as A 
    on S.name = A.name
    where S.clientid = 10030 and A.clientid = 10030;
    select * from @master;

    Wednesday, November 6, 2013 8:41 PM
  • Let's first establish what the correct number of rows is that should be updated: 499 or 506. The following query will tell you:

    SELECT COUNT(*)
    FROM [SDC].[EDW_Master_Onsite2012] AS s
    WHERE EXISTS (
      SELECT *
      FROM [ETL].[EDW_Master_Eligibility] a 
      WHERE a.FirstName= s.First_Name
      AND   a.LastName = s.Last_Name
      AND   a.DOB      = s.BirthDate
      AND   a.ClientID = '10030'
    )
    AND   s.ClientID = '10030'
    

    Also, you are currently using the proprietary UPDATE ... FROM syntax, which is not needed in this case. It would be interesting to see if the standard SQL version of this UPDATE updates the expected number of rows. See below:

    UPDATE SDC.EDW_Master_Onsite2012
    SET MemberID = (
      SELECT a.MemberID
      FROM ETL.EDW_Master_Eligibility a 
      WHERE a.FirstName = EDW_Master_Onsite2012.First_Name
      AND   a.LastName  = EDW_Master_Onsite2012.Last_Name
      AND   a.DOB       = EDW_Master_Onsite2012.BirthDate
      AND   a.ClientID  = '10030'
    )
    WHERE ClientID = '10030'
    AND EXISTS (
      SELECT a.MemberID
      FROM ETL.EDW_Master_Eligibility a 
      WHERE a.FirstName = EDW_Master_Onsite2012.First_Name
      AND   a.LastName  = EDW_Master_Onsite2012.Last_Name
      AND   a.DOB       = EDW_Master_Onsite2012.BirthDate
      AND   a.ClientID  = '10030'
    )
    


    Gert-Jan

    Wednesday, November 6, 2013 8:42 PM
  • Thank you so much Russ. :-)
    Wednesday, November 6, 2013 8:43 PM
  • Please post DDL, so that people do not have to guess what the keys, constraints, Declarative Referential Integrity, data types, etc. in your schema are. Learn how to follow ISO-11179 data element naming conventions AND formatting rules (you failed). Temporal data should use ISO-8601 formatS. Code should be in Standard SQL as much as possible AND not local dialect. 

    This is minimal polite behavior on SQL forumS. What you did post is garbage SQL. 

    >> I have two tables, one is EDW_Master_Eligibility table AND other is EDW_Master_Onsite2012 table. I am joining two tables to get MemberID from EDW_Master_Eligibility table. <<

    The term MASTER is from the old magnetic tape file and network data bases. It has no place RDBMS. We would never see a year in a table name; we used them in mag tape file labels in the 1950'S. 

    Using SELECT DISTINCT in a query is valid but very rare in a normalized schemA. Gee, wish we had DDL. Data element names do not change from table to table. Is it “dob” or “birth_date”?  The ANSI/ISO is “birth_date” by ISO 11179 rules.

    No SQL programmer would use the 1970's proprietary UPDATE..FROM.. syntax. It is absurd AND does not work. It does not tell you when it has more than one match! It uses the last one in PHYSICAL disk order! This has been documented for years! Your data is a total mess now!  

    A competent SQL programmer might have done this, but since you did not post DDL this is a guess. 

    MERGE SDC.EDW_Master_Onsite2012 AS S
    USING ETL.EDW_Master_Eligibility AS A 
    ON S.first_name = A.first_name 
       AND S.last_name = A.last_name 
       AND S.birth_date = A. birth_date
       AND S.client_id = '10030' 
       AND A.client_id = '10030'
    WHEN MATCHED
    THEN UPDATE 
    SET S.member_id = A.member_id; 

    This will find when your screwed up design finds two John Smith with the same birthdates and blow up. From memory, I think it is 100+ per day in the USA. 


    --CELKO-- Books in Celko Series for Morgan-Kaufmann Publishing: Analytics and OLAP in SQL / Data and Databases: Concepts in Practice Data / Measurements and Standards in SQL SQL for Smarties / SQL Programming Style / SQL Puzzles and Answers / Thinking in Sets / Trees and Hierarchies in SQL

    Thursday, November 7, 2013 1:06 AM