none
SQL Finding, compering and deleting some data from table

    Domanda

  • Hello, I'm looking for a solution or an idea of how to delete data from sql table. i've got 5 columns populated with data. I want to search the table, row by row. If any 4 out of 5 values in the row are identical with any other row it should delete one of them. LEt's say one row could be Alex, John, Mark, Paul and Peter. The other one could be Alex, John, Samuel, Paul and Peter. This is 4 same values in 5 columns. It doesn't matter where would 'Samuel' be at. Or any other name instead of Samuel. It could be any place as long as those other four woud match and one row would be deleted. Any ideas? Thanks in advance. JL  

    • Spostato Bob BeaucheminMVP giovedì 2 giugno 2011 17:37 Moved to a more relevent forum (From:.NET Framework inside SQL Server)
    giovedì 2 giugno 2011 16:05

Tutte le risposte

  • Do you want the field names match also or it doesn't matter which field matches to another value from another field?

    Here is what I suggest:

     

    ;with cte as (select *, row_number() over (order by (select 0)) as RecordID from DataTable),
    
    cte1 as (select RecordID, FieldVal from cte UNPIVOT (FieldVal for ColValue in ([Field1],[Field2],[Field3],[Field4],[Field5])) unpvt),
    
    cte2 as (select *, dense_rank() over (partition by FieldVal order by RecordID) as Rank from cte1)
    
    select * from cte2 where [Rank]>=4
    

     

    The above is from the top of my head - may need some tweaking.


    For every expert, there is an equal and opposite expert. - Becker's Law


    My blog
    giovedì 2 giugno 2011 20:54
    Moderatore
  • Hello, thanks for responding. I've tryed this and it is giving me an error. 

    Msg 156, Level 15, State 1, Line 7

    Incorrect syntax near the keyword 'as'.  I'm new in SQL and this seems to be a strong coffee for me so far. Here is what I did..

     

    cte1 as(select ID, FieldVal from cte UNPIVOT(FieldVal for ColValue in ([FirstNumber],[SecondNumber], [ThirdNumber], [FourthNumber],[FifthNumber]))

    cte2 as(select *, DENSE_RANK() over(PARTITION by FieldVal order by ID)as Rank from cte1)

    select * from cte2 where [RANK] >=4                        cte, cte1 and cte2 as well as AS by cte2 and [RANK] are underlined. Looks like I am missing something. :( Help would be so much appreciated...

    ;with cte as(select*,ROW_NUMBER()over(order by(select0))as ID from tblNUMBERS_2),

    venerdì 3 giugno 2011 11:32
  • Are you running the same query provided by Naomi ? I think you are not running the same query....

     

    ;with cte as (select *, row_number() over (order by (select 0)) as RecordID from DataTable),
    cte1 as (select RecordID, FieldVal from cte UNPIVOT (FieldVal for ColValue in ([Field1],[Field2],[Field3],[Field4],[Field5])) unpvt),
    cte2 as (select *, dense_rank() over (partition by FieldVal order by RecordID) as Rank from cte1)
    select * from cte2 where [Rank]>=4
    
    

     

    For learning CTE see link

    http://msdn.microsoft.com/en-us/library/ms175972.aspx


    If this answer is helpful to you .. Please mark as Answer....
    venerdì 3 giugno 2011 11:43
  • Yes, I am not. I mean it mismatched my text in here and for some reason it even didn't take all. :) Thanks for the link. I'm on it. Gotta figyre it out no matter what. :)
    venerdì 3 giugno 2011 12:13
  • I re-reviewed my original code and I don't see errors, but I think you're missing the very first CTE declaration in your code (the one that starts with ;WITH)

    For more information on CTE check my blog post

    CTE and hierarchical queries
    For every expert, there is an equal and opposite expert. - Becker's Law


    My blog
    venerdì 3 giugno 2011 12:25
    Moderatore
  • Thank again you very for responding. I just come back from the hospital and see here your post. Really appreciate this. Even more if you saying it works. I've got find the error in my code. I've been on pain killers for a month now (achilles tendon rupture) and your help is huge. Thanks.

    venerdì 3 giugno 2011 16:48
  •  

    My table:

                   COL1    COL2    COL3    COL4     COL5

    row1      value1  value2  value3  value4  value5                                           

    row2      value2  value3  value4  value5  value6                                            

    row3      value1  value4  value5  value8  value9                                            

    row4      value1  value2  value7  value8  value9                                            

    row5      value3  value4  value5  value6  value7                                            

    row6      value1  value2  value9  value4  value5                                            

    row7      value2  value3  value4  value5  value11                                           

    row8      value4  value5 value3  value9  value1           

    Output:

    6              value1                                                   4

    8              value1                                                   5

    6              value2                                                   4

    7              value2                                                   5

    7              value3                                                   4

    8              value3                                                   5

    5              value4                                                   4

    6              value4                                                   5

    7              value4                                                   6

    8              value4                                                   7

    5              value5                                                   4

    6              value5                                                   5

    7              value5                                                   6

    8              value5                                                   7

    8              value9                                                   4                        

     

    Well, I may not understand this but this is what I'm getting as an output. What I was more looking for was, if you look at the My table again:      

    My table:

                   COL1    COL2    COL3    COL4     COL5

    row1      value1  value2  value3  value4  value5                                           

    row2      value2  value3  value4  value5  value6                                            

    row3      value1  value4  value5  value8  value9                                            

    row4      value1  value2  value7  value8  value9                                            

    row5      value3  value4  value5  value6  value7                                            

    row6      value1  value2  value9  value4  value5                                            

    row7      value2  value3  value4  value5  value11                                           

    row8      value4  value5 value3  value9  value1        

    crossed rows would be deleted.    If there would be row #9 - let's pretend :

    value9  value1  value8  value6  value5  - this one would be deleted as well because we can find value9, value1, value8 and value5 in row #3. Code what u provided worked fine but it wasn't exactly the output I was looking for. Maybe I will need to tweak it a little more  but for today I've had enough. :( But anyway thank you.. 

    JL.  


    WELL, I NEED TO EDIT THIS POST, IT WOULDN'T TAKE THE CROSSED ROWS. :( THE CROSSED ROWS WOULD BE ROW2, ROW6, ROW7, ROW8 - EVENTUALLY DELETED..

    ANOTHER WORDS, THIS WOULD BE OUR OUTPUT:

    row1      value1  value2  value3  value4  value5                                           

    row3      value1  value4  value5  value8  value9                                            

    row4      value1  value2  value7  value8  value9                                            

    row5      value3  value4  value5  value6  value7                

    venerdì 3 giugno 2011 20:02
  • Hello Naomi, you're so sweet. Here is what i've got. I did this "value1 or so" table TEST just for testing this so I wouldn't be messing with my real table. So here it is:

     

    USE TEST1

     

    GO

     

    CREATE TABLE tblTEST (

    ID int unique not null,

    FirstColumn Char(50) not null,

    SecondColumn Char(50) not null,

    ThirdColumn Char(50) not null,

    FourthColumn Char(50) not null,

    FifthColumn Char(50) not null,)

     

    INSERT INTO tblTEST(ID, FirstColumn,SecondColumn,ThirdColumn,FourthColumn, FifthColumn)

    values('1','value1','value2','value3','value4','value5')

    values('2','value2','value3','value4','value5','value6')

    values('3','value1','value4','value5','value8','value9')

    values('4','value1','value2','value7','value8','value9')

    values('5','value3','value4','value5','value6','value7')

    values('6','value1','value2','value9','value4','value5')

    values('7','value2','value3','value4','value5','value11')

    values('8','value4','value5','value3','value9','value1')

     

    And here is what I do in my c# code:

     

    da.InsertCommand = new SqlCommand("INSERT INTO TABLE_NAME VALUES(@FirstCol, @SecondCol, @ThirdCol, @FourthCol, @FifthCol, @Date)", cs);

     

    da.InsertCommand.Parameters.Add("@FirstCol", SqlDbType.Char).Value = txtFirstNum.Text;

    da.InsertCommand.Parameters.Add("@SecondCol", SqlDbType. Char).Value = txtSecondNum.Text;

    da.InsertCommand.Parameters.Add("@ThirdCol", SqlDbType. Char).Value = txtThirdNum.Text;

    da.InsertCommand.Parameters.Add("@FourthCol", SqlDbType. Char).Value = txtFourthNum.Text;

    da.InsertCommand.Parameters.Add("@FifthCol", SqlDbType. Char).Value = txtFifthNum.Text;

     

     

    SqlParameter p1 = da.InsertCommand.Parameters.Add("@Date", SqlDbType.DateTime);

    p1.Value = dateTimePicker1.Value;

     

    cs.Open();

    da.InsertCommand.ExecuteNonQuery();

    cs.Close();

     

    Your ideas are highly appreciated. Thank you.  

    JL
    sabato 4 giugno 2011 09:17
  • Here is the solution:

    CREATE TABLE tblTEST (
    ID int unique not null,
    FirstColumn Char(50) not null,
    SecondColumn Char(50) not null,
    ThirdColumn Char(50) not null,
    FourthColumn Char(50) not null,
    FifthColumn Char(50) not null,)
     
    INSERT INTO tblTEST(ID, FirstColumn,SecondColumn,ThirdColumn,FourthColumn, FifthColumn)
    values('1','value1','value2','value3','value4','value5'),
    ('2','value2','value3','value4','value5','value6'),
    ('3','value1','value4','value5','value8','value9'),
    ('4','value1','value2','value7','value8','value9'),
    ('5','value3','value4','value5','value6','value7'),
    ('6','value1','value2','value9','value4','value5'),
    ('7','value2','value3','value4','value5','value11'),
    ('8','value4','value5','value3','value9','value1')
    
    ;with UnPvt as (select ID, ColValue from tblTEST 
    UNPIVOT (ColValue for ColName IN ([FirstColumn],[SecondColumn],[ThirdColumn],[FourthColumn], [FifthColumn])) unpvt),
    SameVals as (select * from (select *, COUNT(ID) over (partition by ColValue) as cntSame from UnPvt) X where cntSame >=2),
    DupRecs as (select T.*,S.cntDups, S.ID as DupID
     from tblTEST T CROSS APPLY (select S.ID, COUNT(*) as cntDups from SameVals S
    WHERE T.ID < S.ID and S.ColValue IN (T.FirstColumn,T.SecondColumn,T.ThirdColumn,T.FourthColumn, T.FifthColumn) GROUP BY S.ID) S)
    
    select distinct DupID as [RowIDToDelete] from DupRecs where cntDups >=4
     
    

    This works only when all 5 columns are of the same type.


    For every expert, there is an equal and opposite expert. - Becker's Law


    My blog

    domenica 5 giugno 2011 04:47
    Moderatore
  • Hello Naomi, thank you once again for replying. It works great except one little thing. Please bear with me on this one more time.  So here is what I've got. Original table looks just like this:

    ID,  FirstColumn, SecondColumn, ThirdColumn, FourthColumn, FifthColumn

    1, value1,   value2,     value3,    value4,     value5

    2, value2,   value3,     value4,    value5,     value6

    3, value1,   value4,     value5,    value8,     value9

    4, value1,   value2,     value7,    value8,     value9

    5, value3,   value4,     value5,    value6,     value7

    6, value1,   value2,     value9,    value4,     value5

    7, value2,   value3,     value4,    value5,     value11

    8, value4,   value5,     value3,    value9,     value1

    Now, the output we are getting is not bad at all. It is:

              RowIDToDelete

    1        2

    2        5

    3        6

    4        7

    5        8

    So let's take a closer look. Row ID2 should be deleted since there are 4 values identical with row ID1. It is pretty clear (value2, value3, value4 and value5 can be found in both rows). Now let's move to row ID3 and ID4. We can't find four values in ID3 that we would find in row 1 or 2.  Same in ID4, we cannot find any four values from that in rows 1, 2 and 3. Which is good. So far so good.

    If we look at the row ID5 we see it on our list 'RowIDToDelete'. Comparing values of row 5 to previous rows we see that its values are complying with values of row 2 which will be eventually deleted. We would need one of these two to stay. It doesn't matter which one.

     I was thinking maybe when we get our first row to our list, delete it and re-run the application. Since I'm completely newbie in SQL I wasn't able to figure out much. But what I did or what I was trying to do was, I tried to find one row which would be deleted. Let's say first on the list.  

    select MIN(DupID) as [ThisDeleteFirst] from DupRecs where cntDups >= 4

    Here is what was the output:

         ThisDeleteFirst    

    1     2

    Then I was thinking I will delete this row and ru-run it again. If I would delete it manually it'd work. By manually I mean:

    DELETE FROM tblTEST WHERE ID = '2'

    But this should be taken care of not this way. So I tried this:

    DELETE MD

    FROM tblTEST MD

    WHERE EXISTS(SELECT DupID FROM DupRecs WHERE cntDups >= 4

    --WHERE EXISTS(SELECT MIN(DupID) FROM DupRecs WHERE cntDups >= 4

    GROUP BY DupID 

    HAVING MIN(DupID) = MD.ID)

    --HAVING DupID = MD.ID)

    And this is what I'm getting:  (5 row(s) affected)

    output 'SELECT * FROM tblTEST':

    ID,  FirstColumn, SecondColumn, ThirdColumn, FourthColumn, FifthColumn

    1, value1,   value2,     value3,    value4,     value5

    3, value1,   value4,     value5,    value8,     value9

    4, value1,   value2,     value7,    value8,     value9

     

    So, :( , what do you think the best aproach would be to have that missing row ID 5 up on that list? Any ideas? Like I said, this is all pretty much new for me but I am already looking forward to this. Digging deeper into those pivoting, unpivoting and so on stuff. Thanks god for your ideas.

    JL

    domenica 5 giugno 2011 17:00
  • How about:

     

    ;with UnPvt as (select ID, ColValue from tblTEST 
    UNPIVOT (ColValue 
    for ColName IN ([FirstColumn],[SecondColumn],[ThirdColumn],[FourthColumn], [FifthColumn])) unpvt),
    SameVals as (select * from (select *, COUNT(ID) over (partition by ColValue) as cntSame from UnPvt) X
     where cntSame >=2),
    DupRecs as (select T.*,S.cntDups, S.ID as DupID
     from tblTEST T CROSS APPLY (select S.ID, COUNT(*) as cntDups from SameVals S
    WHERE T.ID < S.ID and S.ColValue IN 
    (T.FirstColumn,T.SecondColumn,T.ThirdColumn,T.FourthColumn, T.FifthColumn) GROUP BY S.ID) S),
    
    Candidates as (select distinct ID,DupID from DupRecs where cntDups >=4) 
    
    select Distinct DupID from Candidates where DupID not IN (select ID from Candidates)
    

     


    For every expert, there is an equal and opposite expert. - Becker's Law


    My blog

    domenica 5 giugno 2011 20:14
    Moderatore
  • I found this problem quite interesting, so I blogged about it

    Finding Duplicates - Interesting twist


    For every expert, there is an equal and opposite expert. - Becker's Law


    My blog
    lunedì 6 giugno 2011 02:04
    Moderatore
  • Do you want the field names match also or it doesn't matter which field matches to another value from another field?

    Here is what I suggest:

     

    ;with cte as (select *, row_number() over (order by (select 0)) as RecordID from DataTable),
    
    cte1 as (select RecordID, FieldVal from cte UNPIVOT (FieldVal for ColValue in ([Field1],[Field2],[Field3],[Field4],[Field5])) unpvt),
    
    cte2 as (select *, dense_rank() over (partition by FieldVal order by RecordID) as Rank from cte1)
    
    select * from cte2 where [Rank]>=4
    

     

    The above is from the top of my head - may need some tweaking.


    For every expert, there is an equal and opposite expert. - Becker's Law


    My blog
    Was sort of curious and wanted to know why you use ";" at the beginning? Is this a common practice?

    Pérez
    lunedì 6 giugno 2011 04:59
  • Hi Naomi, I am really happy that you found this interesting. And proud as well. So, I've tried this and it's not quite  it yet. I must have confused you with what I wrote last time that we should delete any one of rows 2 or 5 since they're matching.  

    Well, here is what I'm getting as an output now: DupID = 5, 7, 8 and this is what I am supposed to be getting: DupID = 2, 6, 7, 8 or DupID = 1, 5, 6, 7, 8 if I'm not mistaken.  If we would only delete rows 5, 7 and 8 we would still have rows 1 and 2 matching. What about looping, is this something we could consider to be doing?


    lunedì 6 giugno 2011 10:33
  • Hi Perez, thanks for replying. Appreciate it. But this is not even close.. :(


    lunedì 6 giugno 2011 10:39
  • Hi Perez,

    This is a requirement for using CTE - the last statement before CTE must end with the semicolon. That's why I always add ; right before the statement.


    For every expert, there is an equal and opposite expert. - Becker's Law


    My blog
    lunedì 6 giugno 2011 13:25
    Moderatore
  • Yeap, it really is. I appreciate your time and ideas.  If Peter could help it would be great. I really am thankful.

    George.  

    lunedì 6 giugno 2011 14:04
  • Hi. Just got home since today is our national holiday, the "4th of July" for Sweden so to speak.
    I will give it a run.
    lunedì 6 giugno 2011 14:46
  • Does this count?


    ;WITH cteSource(OriginalID, DuplicateID)
    AS (
    	SELECT		t.ID AS OriginalID,
    			a.ID AS DuplicateID
    	FROM		dbo.tblTEST AS t
    	LEFT JOIN	dbo.tblTEST AS a ON a.ID > t.ID
    				AND a.FirstColumn IN (t.FirstColumn, t.SecondColumn, t.ThirdColumn, t.FourthColumn, t.FifthColumn)
    	LEFT JOIN	dbo.tblTEST AS b ON b.ID = a.ID
    				AND b.SecondColumn IN (t.FirstColumn, t.SecondColumn, t.ThirdColumn, t.FourthColumn, t.FifthColumn)
    	LEFT JOIN	dbo.tblTEST AS c ON c.ID = a.ID
    				AND c.ThirdColumn IN (t.FirstColumn, t.SecondColumn, t.ThirdColumn, t.FourthColumn, t.FifthColumn)
    	LEFT JOIN	dbo.tblTEST AS d ON d.ID = a.ID
    				AND d.FourthColumn IN (t.FirstColumn, t.SecondColumn, t.ThirdColumn, t.FourthColumn, t.FifthColumn)
    	LEFT JOIN	dbo.tblTEST AS e ON e.ID = a.ID
    				AND e.FifthColumn IN (t.FirstColumn, t.SecondColumn, t.ThirdColumn, t.FourthColumn, t.FifthColumn)
    	WHERE		CASE WHEN a.ID IS NULL THEN 0 ELSE 1 END
    			+ CASE WHEN b.ID IS NULL THEN 0 ELSE 1 END
    			+ CASE WHEN c.ID IS NULL THEN 0 ELSE 1 END
    			+ CASE WHEN d.ID IS NULL THEN 0 ELSE 1 END
    			+ CASE WHEN e.ID IS NULL THEN 0 ELSE 1 END >= 4
    )
    SELECT		u.theID
    FROM		cteSource AS s
    UNPIVOT		(
    			theID
    			FOR theCol IN (s.OriginalID, s.DuplicateID)
    		) AS u
    GROUP BY	u.theID
    HAVING		MIN(u.theCol) = 'DuplicateID'
    

    • Modificato SwePesoMVP lunedì 6 giugno 2011 15:33 Note to self, test solutions after writing them
    lunedì 6 giugno 2011 15:20
  • Peter,

    The result of this query is the same as mine - and it's not what we want.


    For every expert, there is an equal and opposite expert. - Becker's Law


    My blog
    lunedì 6 giugno 2011 15:24
    Moderatore
  • SELECT DISTINCT	a.ID AS DuplicateID
    FROM		dbo.tblTEST AS t
    LEFT JOIN	dbo.tblTEST AS a ON a.ID > t.ID
    			AND a.FirstColumn IN (t.FirstColumn, t.SecondColumn, t.ThirdColumn, t.FourthColumn, t.FifthColumn)
    LEFT JOIN	dbo.tblTEST AS b ON b.ID = a.ID
    			AND b.SecondColumn IN (t.FirstColumn, t.SecondColumn, t.ThirdColumn, t.FourthColumn, t.FifthColumn)
    LEFT JOIN	dbo.tblTEST AS c ON c.ID = a.ID
    			AND c.ThirdColumn IN (t.FirstColumn, t.SecondColumn, t.ThirdColumn, t.FourthColumn, t.FifthColumn)
    LEFT JOIN	dbo.tblTEST AS d ON d.ID = a.ID
    			AND d.FourthColumn IN (t.FirstColumn, t.SecondColumn, t.ThirdColumn, t.FourthColumn, t.FifthColumn)
    LEFT JOIN	dbo.tblTEST AS e ON e.ID = a.ID
    			AND e.FifthColumn IN (t.FirstColumn, t.SecondColumn, t.ThirdColumn, t.FourthColumn, t.FifthColumn)
    WHERE		CASE WHEN a.ID IS NULL THEN 0 ELSE 1 END
    		+ CASE WHEN b.ID IS NULL THEN 0 ELSE 1 END
    		+ CASE WHEN c.ID IS NULL THEN 0 ELSE 1 END
    		+ CASE WHEN d.ID IS NULL THEN 0 ELSE 1 END
    		+ CASE WHEN e.ID IS NULL THEN 0 ELSE 1 END >= 4
    
    
    lunedì 6 giugno 2011 15:50
  • Hello Peter, almost there, the output is 2, 5, 6, 7 and 8. It should be 2, 6, 7 and 8.  Four values from row 5 are equal to four values from row 2. Since row 2 gets eliminated we are not really comparing additional rows to row 2. If i'm not mistaken.
    lunedì 6 giugno 2011 16:08
  • The only way to fix this is to have a sequence to determine "checking order", since rows in a table has no physical order, only logical order.

    What do you have to determine order of checks?

    lunedì 6 giugno 2011 16:40
  • Right. Well, this query is supposed to find four guys in two rows and if it finds that it should delete one row. What if we just find one situation like that and ignore the rest. I mean literally, what would be the best approach just to find the very first situation like that, delete one row and that's it. Mission accomplished,  stop the query. In our situation it would delete row 2 and then stopped. We would need to re-run the query to find another guys like that that would occur in two rows. In our specific situation, if it would come to row 5 it would simply leave that row intact since there's no row 2 anymore. You know what I mean? So what do you think, can we do this? If this is the possible way what do you think would be the best possibly fast way to achieve that?   

    lunedì 6 giugno 2011 18:29
  • This gave me 2, 6, 7 and 8.

    ;WITH cteSource(DuplicateID)
    AS (
    	SELECT TOP(1)	a.ID AS DuplicateID
    	FROM		dbo.tblTEST AS t
    	LEFT JOIN	dbo.tblTEST AS a ON a.ID > t.ID
    				AND a.FirstColumn IN (t.FirstColumn, t.SecondColumn, t.ThirdColumn, t.FourthColumn, t.FifthColumn)
    	LEFT JOIN	dbo.tblTEST AS b ON b.ID = a.ID
    				AND b.SecondColumn IN (t.FirstColumn, t.SecondColumn, t.ThirdColumn, t.FourthColumn, t.FifthColumn)
    	LEFT JOIN	dbo.tblTEST AS c ON c.ID = a.ID
    				AND c.ThirdColumn IN (t.FirstColumn, t.SecondColumn, t.ThirdColumn, t.FourthColumn, t.FifthColumn)
    	LEFT JOIN	dbo.tblTEST AS d ON d.ID = a.ID
    				AND d.FourthColumn IN (t.FirstColumn, t.SecondColumn, t.ThirdColumn, t.FourthColumn, t.FifthColumn)
    	LEFT JOIN	dbo.tblTEST AS e ON e.ID = a.ID
    				AND e.FifthColumn IN (t.FirstColumn, t.SecondColumn, t.ThirdColumn, t.FourthColumn, t.FifthColumn)
    	WHERE		CASE WHEN a.ID IS NULL THEN 0 ELSE 1 END
    			+ CASE WHEN b.ID IS NULL THEN 0 ELSE 1 END
    			+ CASE WHEN c.ID IS NULL THEN 0 ELSE 1 END
    			+ CASE WHEN d.ID IS NULL THEN 0 ELSE 1 END
    			+ CASE WHEN e.ID IS NULL THEN 0 ELSE 1 END >= 4
    	ORDER BY	a.ID
    )
    DELETE		t
    OUTPUT		deleted.ID
    FROM		dbo.tblTEST AS t
    INNER JOIN	cteSource AS s ON s.DuplicateID = t.ID
    GO 1000
    
    lunedì 6 giugno 2011 18:47
  • Here is a way to turn this into a loop, but I don't think it's very efficient - perhaps a single loop in original table will be better:

    declare @DupLoop int
    
    set @DupLoop = 1
    while @DupLoop = 1
    begin
    ;with Dups as (SELECT DISTINCT	a.ID AS DuplicateID
    FROM		dbo.tblTEST AS t
    LEFT JOIN	dbo.tblTEST AS a ON a.ID > t.ID
    			AND a.FirstColumn IN (t.FirstColumn, t.SecondColumn, t.ThirdColumn, t.FourthColumn, t.FifthColumn)
    LEFT JOIN	dbo.tblTEST AS b ON b.ID = a.ID
    			AND b.SecondColumn IN (t.FirstColumn, t.SecondColumn, t.ThirdColumn, t.FourthColumn, t.FifthColumn)
    LEFT JOIN	dbo.tblTEST AS c ON c.ID = a.ID
    			AND c.ThirdColumn IN (t.FirstColumn, t.SecondColumn, t.ThirdColumn, t.FourthColumn, t.FifthColumn)
    LEFT JOIN	dbo.tblTEST AS d ON d.ID = a.ID
    			AND d.FourthColumn IN (t.FirstColumn, t.SecondColumn, t.ThirdColumn, t.FourthColumn, t.FifthColumn)
    LEFT JOIN	dbo.tblTEST AS e ON e.ID = a.ID
    			AND e.FifthColumn IN (t.FirstColumn, t.SecondColumn, t.ThirdColumn, t.FourthColumn, t.FifthColumn)
    WHERE		CASE WHEN a.ID IS NULL THEN 0 ELSE 1 END
    		+ CASE WHEN b.ID IS NULL THEN 0 ELSE 1 END
    		+ CASE WHEN c.ID IS NULL THEN 0 ELSE 1 END
    		+ CASE WHEN d.ID IS NULL THEN 0 ELSE 1 END
    		+ CASE WHEN e.ID IS NULL THEN 0 ELSE 1 END >= 4)
    
    
    delete from tblTEST where ID in (select top (1) DuplicateID from Dups ORDER BY DuplicateID) 		
    select @DupLoop = @@ROWCOUNT
    end
    
    select * from tblTest
    


    For every expert, there is an equal and opposite expert. - Becker's Law


    My blog
    lunedì 6 giugno 2011 18:50
    Moderatore
  • Thank you. I'm impressed. It seems to be working. I will test it tomorrow. You just made me so happy, you can't even imagine....

    George.

    lunedì 6 giugno 2011 19:11
  • After testing it looks like we still have error in the result. This is what I did:
    INSERT INTO tblTEST(ID, FirstColumn,SecondColumn,ThirdColumn,FourthColumn, FifthColumn)
    values
    ('1','Alex','George','Gerard','Michael','Paul'),
    ('2','Alex','George','Gerard','Michael','Peter'),
    ('3','Alex','George','Gerard','Michael','Thomas'),
    ('4','Alex','George','Gerard','Paul','Peter'),
    ('5','Alex','George','Gerard','Paul','Thomas'),
    ('6','Alex','George','Gerard','Peter','Thomas'),
    ('7','Alex','George','Michael','Paul','Peter'),
    ('8','Alex','George','Michael','Paul','Thomas'),
    ('9','Alex','George','Michael','Peter','Thomas'),
    ('10','Alex','George','Paul','Peter','Thomas'),
    ('11','Alex','Gerard','Michael','Paul','Peter'),
    ('12','Alex','Gerard','Michael','Paul','Thomas'),
    ('13','Alex','Gerard','Michael','Peter','Thomas'),
    ('14','Alex','Gerard','Paul','Peter','Thomas'),
    ('15','Alex','Michael','Paul','Peter','Thomas'),
    ('16','George','Gerard','Michael','Paul','Peter'),
    ('17','George','Gerard','Michael','Paul','Thomas'),
    ('18','George','Gerard','Michael','Peter','Thomas'),
    ('19','George','Gerard','Paul','Peter','Thomas'),
    ('20','George','Michael','Paul','Peter','Thomas'),
    ('21','Gerard','Michael','Paul','Peter','Thomas')

    And the result is rows:1, 6, 15, 20 and 21.

    Rows with ID 20, 21 should not be in it. I've tried few things already and it didn't help. I'm going to keep tying.
    George
    giovedì 9 giugno 2011 15:35