locked
Delete row with repeated data RRS feed

  • Question

  • create table T(a int, b int)

    insert into T values (1,2),(3,4),(5,6),(3,4),(5,6)

    delete from T
    from
    (
    select distinct x.a, x.b from
    (select ROW_NUMBER() over (order by t.a) n, * from T)x
    join (select ROW_NUMBER() over (order by t.a) n, * from T)y
    on y.n=x.n+1
    where x.a=y.a and x.b=y.b
    )z
    where T.a=z.a and T.b=z.b

    drop table T



    the request deletes all values (3,4),(5,6)

    and leaves only (1,2)


    Is it possible in one delete - purge data table and leave only one row..
    so the result is

    (1,2),(3,4),(5,6)

    ????
    Friday, September 11, 2009 9:01 AM

Answers

  • DECLARE @T TABLE(A INT, B INT)
    INSERT INTO @T VALUES(1,2)
    INSERT INTO @T VALUES(3,4)
    INSERT INTO @T VALUES(5,6)
    INSERT INTO @T VALUES(3,4)
    INSERT INTO @T VALUES(5,6)
    INSERT INTO @T VALUES(5,6)
    INSERT INTO @T VALUES(5,6)
    
    SELECT * FROM @T
    
    DELETE FROM A
    FROM(SELECT ROW_NUMBER() OVER(PARTITION BY A, B ORDER BY A, B) AS Row FROM @T) A
    WHERE Row > 1
    
    SELECT * FROM @T
    

    The above will remove duplicate rows, based on the row numbers order by clause.

    Dave

    • Proposed as answer by Melissa Suciadi Friday, September 11, 2009 9:38 AM
    • Marked as answer by RobbKirk Friday, September 11, 2009 9:42 AM
    Friday, September 11, 2009 9:27 AM
  • Yes, if you can give row_number for each row then you can delete by using the Use ROW_NUMBER() OVER (PARTITION BY ... ORDER BY ...) to Delete Duplicate Rows in SQL Table

    Here is the sample sql from the article

    WITH DUPLICATES AS 
    (
    SELECT 
      RN = ROW_NUMBER() OVER (PARTITION BY Column1 ORDER BY Id Column2),
      Column1,   Column
    FROM Suppliers
    )
    DELETE FROM DUPLICATES WHERE RN > 1
    

    http://www.kodyaz.com http://www.eralper.com
    • Marked as answer by RobbKirk Friday, September 11, 2009 3:42 PM
    Friday, September 11, 2009 10:29 AM

All replies

  • ehm yes it possible use CTE
    the logic is
    assign rownumber in data if there is 2 duplicate data then the 1st one will has rownumber 1 the 2nd number 2 and ...

    and delete all that rownumber is bigger than 1

    Best Regards,
    Melissa Suciadi


    If you have found this post helpful, please click the 'Vote as Helpful ' link (the green triangle and number on the top-left).

    If this post answers your question, click the 'Mark As Answered ' link below. It helps others who experience the same issue in future to find the solution.

    Friday, September 11, 2009 9:04 AM
  • Hello RobbKirk,

    Please read the following article just right on that topic

    How to delete duplicate records or rows among identical rows in a table where no primary key exists

    There exists examples on showing how duplicate rows can be deleted using a few methods.

    I hope it helps,
    Eralper

    http://www.kodyaz.com
    http://www.eralper.com
    Friday, September 11, 2009 9:05 AM
  • Hello RobbKirk,

    Please read the following article just right on that topic

    How to delete duplicate records or rows among identical rows in a table where no primary key exists

    There exists examples on showing how duplicate rows can be deleted using a few methods.

    I hope it helps,
    Eralper
    http://www.kodyaz.com http://www.eralper.com
    Hi,

    But what if I have variable number of repeated rows?

    In that article they say TOP (N) - 1,2,3...

    What if "leave just one"?

    Friday, September 11, 2009 9:19 AM
  • DECLARE @T TABLE(A INT, B INT)
    INSERT INTO @T VALUES(1,2)
    INSERT INTO @T VALUES(3,4)
    INSERT INTO @T VALUES(5,6)
    INSERT INTO @T VALUES(3,4)
    INSERT INTO @T VALUES(5,6)
    INSERT INTO @T VALUES(5,6)
    INSERT INTO @T VALUES(5,6)
    
    SELECT * FROM @T
    
    DELETE FROM A
    FROM(SELECT ROW_NUMBER() OVER(PARTITION BY A, B ORDER BY A, B) AS Row FROM @T) A
    WHERE Row > 1
    
    SELECT * FROM @T
    

    The above will remove duplicate rows, based on the row numbers order by clause.

    Dave

    • Proposed as answer by Melissa Suciadi Friday, September 11, 2009 9:38 AM
    • Marked as answer by RobbKirk Friday, September 11, 2009 9:42 AM
    Friday, September 11, 2009 9:27 AM
  • Yes, if you can give row_number for each row then you can delete by using the Use ROW_NUMBER() OVER (PARTITION BY ... ORDER BY ...) to Delete Duplicate Rows in SQL Table

    Here is the sample sql from the article

    WITH DUPLICATES AS 
    (
    SELECT 
      RN = ROW_NUMBER() OVER (PARTITION BY Column1 ORDER BY Id Column2),
      Column1,   Column
    FROM Suppliers
    )
    DELETE FROM DUPLICATES WHERE RN > 1
    

    http://www.kodyaz.com http://www.eralper.com
    • Marked as answer by RobbKirk Friday, September 11, 2009 3:42 PM
    Friday, September 11, 2009 10:29 AM
  • DECLARE @T TABLE(A INT, B INT)
    INSERT INTO @T VALUES(1,2)
    INSERT INTO @T VALUES(3,4)
    INSERT INTO @T VALUES(5,6)
    INSERT INTO @T VALUES(3,4)
    INSERT INTO @T VALUES(5,6)
    INSERT INTO @T VALUES(5,6)
    INSERT INTO @T VALUES(5,6)
    
    SELECT * FROM @T
    
    DELETE FROM A
    FROM(SELECT ROW_NUMBER() OVER(PARTITION BY A, B ORDER BY A, B) AS Row FROM @T) A
    WHERE Row > 1
    
    SELECT * FROM @T
    
    

    The above will remove duplicate rows, based on the row numbers order by clause.

    Dave



    Hi Dave and everyone,

    Thanks for your help..

    It is the solution, yes.
    The only question (of course it should be asked): 
    What happens physically when, giving the alias A, we enumerate T-table rows by row_number():-  does the Server Engine keep references to actual table data and row_number creates a temporary constraint, "non-unique temporary data column" to physical data, which exists only during execution? Is the additional "row_number" column bound to actual data, located on disk? How?
    Friday, September 11, 2009 8:23 PM
  • Up++
    Sunday, September 13, 2009 11:03 PM