none
how to delete duplicate in all columns ? RRS feed

  • Question

  • how to delete duplicate in all columns ?

    example:

    column : number      address    before                            number       address     after

                       1            street1                                                1             street1

                       2            street1

                       3            street1

                       4            street1

    help query

    Sunday, October 14, 2012 4:02 PM

Answers

  • You don't need a trigger - the simplest solution is to make a unique index on the Address column. Once you fixed the table to get rid of existing duplicates and added a new unique index, no duplicates will ever enter the table anymore.

    For every expert, there is an equal and opposite expert. - Becker's Law


    My blog


    Sunday, October 14, 2012 6:07 PM
    Moderator
  • Hello Srajmuneer,

      Please try the below code. This will keep the min of row number of duplicate address in the table and delete the rest of records.

    DECLARE @Table TABLE
    (
    Number int
    ,address varchar(50)
    )
    INSERT INTO @table
    SELECT 1,'Address1'
    UNION
    SELECT 2,'Address1'
    UNION
    SELECT 3,'Address1'
    UNION
    SELECT 4,'Address1'
    DELETE T
      FROM @Table T
      LEFT JOIN (SELECT address,MinNumber=MIN(Number) 
              FROM @Table
    		GROUP BY address) X
        ON X.MinNumber = T.Number
     WHERE X.address is null
    SELECT *
      FROM @Table


    Best Regards Sorna

    Sunday, October 14, 2012 5:05 PM
  • Try -

    DECLARE @tab TABLE(Number SMALLINT,Addres VARCHAR(50))
    INSERT INTO @tab
    SELECT 1,'Address1'UNION ALL
    SELECT 2,'Address1'UNION ALL
    SELECT 3,'Address1'UNION ALL
    SELECT 4,'Address1'
    
    SELECT * FROM @tab;
    	
    DELETE FROM @tab
    WHERE Number IN
    (
      SELECT a.Number
      FROM(
      SELECT Number,Addres,ROW_NUMBER() OVER(PARTITION BY  Addres ORDER BY Number) as rn
      FROM @tab
      ) a
      WHERE a.rn <>1
    )
    
    SELECT * FROM @tab


    Narsimha

    Sunday, October 14, 2012 6:07 PM
  • If your table is hostile enough that you have complete row duplicates, a more general way to delete in SQL Server 2005 onwards is to delete using ROW_NUMBER(), similar to what naarasimha showed, but using the row number directly rather than resolving to some column in the data.

    CREATE TABLE tbl(number int,address varchar(50));
    INSERT tbl values
      (1,'Address1'),
      (2,'Address1'),
      (3,'Address1'),
      (4,'Address1'),
      (5,'Address2'),
      (5,'Address2');
    
    ;WITH T as (
      select *, rn=row_number() over (partition by address order by number)
      from tbl)
    delete T
    where rn>1;
    
    select * from tbl;
    In the partition by clause, I have used the single field "address".  If you had 4 columns that together define the "uniqueness of the row to be kept", you would list all 4 columns in there, e.g. partition by col1,col2,col3,col4 order by <something that tie-breaks, or just col1>

    Sunday, October 14, 2012 8:08 PM
  • declare @t table(number int,address varchar(10)); insert into @t select 1,'street1' union all select 2,'street1' union all select 3,'street1' union all select 4,'street1' union all select 1,'street2' union all select 2,'street2' union all select 5,'street3'

    select t.number,t.address from @t t where t.number = (select top 1 number from @t where address = t.address order by number)

    --OR

    select distinct MIN(number) over (partition by address) as number, address

    RESULTS: ----------------------- number address 1 street1 1 street2 5 street3



    ESHANI. Please click "Mark As Answer" if a post solves your problem or "Vote As Helpful" if a post has been useful to you


    • Edited by Eshani Rao Monday, October 15, 2012 3:43 AM
    • Marked as answer by srajmuneer Tuesday, May 23, 2017 6:25 PM
    Monday, October 15, 2012 12:19 AM
  • Hi,

    DECLARE @TMP TABLE (number INT,address VARCHAR(10))
    INSERT INTO @TMP SELECT 1, 'street1'
    INSERT INTO @TMP SELECT 2, 'street1'
    INSERT INTO @TMP SELECT 3, 'street1'
    INSERT INTO @TMP SELECT 4, 'street1'
    ;WITH Delete_Duplicate_Row_cte
         AS (SELECT ROW_NUMBER()OVER(PARTITION BY address ORDER BY address) ROW_NUM,*
             FROM   @TMP ) 
    DELETE FROM Delete_Duplicate_Row_cte WHERE  ROW_NUM > 1
    SELECT * FROM @TMP


    sathyas

    Monday, October 15, 2012 7:30 AM
    Moderator
  • Monday, October 15, 2012 7:41 AM
    Moderator
  • Try this?

    SELECT MAX(NUMBER) AS NUMBER
    ,MAX(ADDRESS) AS ADDRESS
    INTO TABLENAME_TEMP
    FROM TABLENAME
    GROUP BY ADDRESS
    
    DELETE FROM TABLENAME
    WHERE NUMBER NOT IN (SELECT NUMBER FROM TABLENAME_TEMP)

    Voila.


    Monday, October 15, 2012 8:28 AM

All replies

  • Hello Srajmuneer,

      Please try the below code. This will keep the min of row number of duplicate address in the table and delete the rest of records.

    DECLARE @Table TABLE
    (
    Number int
    ,address varchar(50)
    )
    INSERT INTO @table
    SELECT 1,'Address1'
    UNION
    SELECT 2,'Address1'
    UNION
    SELECT 3,'Address1'
    UNION
    SELECT 4,'Address1'
    DELETE T
      FROM @Table T
      LEFT JOIN (SELECT address,MinNumber=MIN(Number) 
              FROM @Table
    		GROUP BY address) X
        ON X.MinNumber = T.Number
     WHERE X.address is null
    SELECT *
      FROM @Table


    Best Regards Sorna

    Sunday, October 14, 2012 5:05 PM
  • create table #t1 (number int, address nvarchar(max))  // make empty temp table for load result without dublicate
    insert into #t1 (number, address) select min(number), address from yourtable group by address   //inserting data
    truncate table yourtable  // cleaning initial table
    insert into yourtable (number, address) select number,address from #t1  // all data from temp table to our table
    go

    then now into your table is what you need, but better to use trigger for don't have this duplicates like this (sintaxis may be with the errors but idea you can see)

    CREATE TRIGGER myTrigger
    ON MyTable
    AFTER INSERT, UPDATE
    AS

    begin

    if (select count(*) from Mytable where address=inserted.address) >0 RAISERROR ('No dublicate!', 16, 10);

    end;

    go


    Sunday, October 14, 2012 5:18 PM
  • Try -

    DECLARE @tab TABLE(Number SMALLINT,Addres VARCHAR(50))
    INSERT INTO @tab
    SELECT 1,'Address1'UNION ALL
    SELECT 2,'Address1'UNION ALL
    SELECT 3,'Address1'UNION ALL
    SELECT 4,'Address1'
    
    SELECT * FROM @tab;
    	
    DELETE FROM @tab
    WHERE Number IN
    (
      SELECT a.Number
      FROM(
      SELECT Number,Addres,ROW_NUMBER() OVER(PARTITION BY  Addres ORDER BY Number) as rn
      FROM @tab
      ) a
      WHERE a.rn <>1
    )
    
    SELECT * FROM @tab


    Narsimha

    Sunday, October 14, 2012 6:07 PM
  • You don't need a trigger - the simplest solution is to make a unique index on the Address column. Once you fixed the table to get rid of existing duplicates and added a new unique index, no duplicates will ever enter the table anymore.

    For every expert, there is an equal and opposite expert. - Becker's Law


    My blog


    Sunday, October 14, 2012 6:07 PM
    Moderator
  • If your table is hostile enough that you have complete row duplicates, a more general way to delete in SQL Server 2005 onwards is to delete using ROW_NUMBER(), similar to what naarasimha showed, but using the row number directly rather than resolving to some column in the data.

    CREATE TABLE tbl(number int,address varchar(50));
    INSERT tbl values
      (1,'Address1'),
      (2,'Address1'),
      (3,'Address1'),
      (4,'Address1'),
      (5,'Address2'),
      (5,'Address2');
    
    ;WITH T as (
      select *, rn=row_number() over (partition by address order by number)
      from tbl)
    delete T
    where rn>1;
    
    select * from tbl;
    In the partition by clause, I have used the single field "address".  If you had 4 columns that together define the "uniqueness of the row to be kept", you would list all 4 columns in there, e.g. partition by col1,col2,col3,col4 order by <something that tie-breaks, or just col1>

    Sunday, October 14, 2012 8:08 PM
  • declare @t table(number int,address varchar(10)); insert into @t select 1,'street1' union all select 2,'street1' union all select 3,'street1' union all select 4,'street1' union all select 1,'street2' union all select 2,'street2' union all select 5,'street3'

    select t.number,t.address from @t t where t.number = (select top 1 number from @t where address = t.address order by number)

    --OR

    select distinct MIN(number) over (partition by address) as number, address

    RESULTS: ----------------------- number address 1 street1 1 street2 5 street3



    ESHANI. Please click "Mark As Answer" if a post solves your problem or "Vote As Helpful" if a post has been useful to you


    • Edited by Eshani Rao Monday, October 15, 2012 3:43 AM
    • Marked as answer by srajmuneer Tuesday, May 23, 2017 6:25 PM
    Monday, October 15, 2012 12:19 AM
  • Hi,

    DECLARE @TMP TABLE (number INT,address VARCHAR(10))
    INSERT INTO @TMP SELECT 1, 'street1'
    INSERT INTO @TMP SELECT 2, 'street1'
    INSERT INTO @TMP SELECT 3, 'street1'
    INSERT INTO @TMP SELECT 4, 'street1'
    ;WITH Delete_Duplicate_Row_cte
         AS (SELECT ROW_NUMBER()OVER(PARTITION BY address ORDER BY address) ROW_NUM,*
             FROM   @TMP ) 
    DELETE FROM Delete_Duplicate_Row_cte WHERE  ROW_NUM > 1
    SELECT * FROM @TMP


    sathyas

    Monday, October 15, 2012 7:30 AM
    Moderator
  • Monday, October 15, 2012 7:41 AM
    Moderator
  • Try this?

    SELECT MAX(NUMBER) AS NUMBER
    ,MAX(ADDRESS) AS ADDRESS
    INTO TABLENAME_TEMP
    FROM TABLENAME
    GROUP BY ADDRESS
    
    DELETE FROM TABLENAME
    WHERE NUMBER NOT IN (SELECT NUMBER FROM TABLENAME_TEMP)

    Voila.


    Monday, October 15, 2012 8:28 AM