none
MS Access - Deleting duplicated rows RRS feed

  • Question

  • Ciao Ragazzi,

    So far coudn't find a way to solve this problem on deleting duplicated rows on an MS Access 2010 table that I have.

    I've a table where I store infection events from workstation based on AV information, the table has information about the workstation, path, date/month/time malware and some other stuff.

    Some times don't ask me why, those events are duplicated. In essence, all the columns match except by the unique ID (Primary Key) which is provided by access when they are loaded to the table.

    I have a Duplicated query that shows me all the duplicated rows with their information and their unique ID (The row ID which is the Primary Key is not duplicated and not used on the query, just shown).

    So let's say that rows with the ID's 1234, 4321, 987 and 789 are duplicated and shown in the query. I want to remove all duplicated rows except one so it will no longer be a duplicated event.

    In the end of each month I generate reports based on the number of events for that month and it impacts directly on those reports.

    Could you guys help me this issue?

    That's what I want to do:

    So let's say that rows with the ID's 1234, 4321, 987 and 789 are duplicated and shown in the query. I want to remove all duplicated rows except one so it will no longer be a duplicated event.

    If I'm not clear enough and you did not understand something, reply to me, I will gladly put it in another way for your precious help.

    Thanks and sorry in advance for any mistake and for your help.


    • Edited by H Souza Thursday, December 17, 2015 3:32 AM
    • Moved by George123345 Friday, December 18, 2015 6:42 AM
    Thursday, December 17, 2015 3:31 AM

Answers

  • The following is a simple example of a 'delete' query which will duplicate all rows bar one of those rows with duplicate LastName values:

    DELETE *
    FROM Contacts AS C1
    WHERE ContactID <>
         (SELECT MAX(ContactID)
          FROM Contacts AS C2
          WHERE C2.LastName = C1.LastName);

    It works by virtue of the subquery restricting the outer query to those rows per LastName value apart from those with the highest (MX) value of the primary key ContactID.  You could easily correlate the sbquery on multiple columns by means of AND opeartions in the WHERE clause, e.g if the duplication is on first AND last names:

    DELETE *
    FROM Contacts AS C1
    WHERE ContactID <>
         (SELECT MAX(ContactID)
          FROM Contacts AS C2
          WHERE C2.LastName = C1.LastName
          AND C2.FirstName = C1.FirstName);

    Ken Sheridan, Stafford, England

    Saturday, December 19, 2015 12:55 PM

All replies

  • Hi,

    May I know how did you confirm that ID's 1234, 4321, 987 and 789 are duplicated? Check it manually or use macro?

    This is the forum to discuss questions and feedback for Microsoft Office, I'll move your question to the MSDN forum for Access

    http://social.msdn.microsoft.com/Forums/en-US/home?forum=accessdev&filter=alltypes&sort=lastpostdesc

    The reason why we recommend posting appropriately is you will get the most qualified pool of respondents, and other partners who read the forums regularly can either share their knowledge or learn from your interaction with us. Thank you for your understanding.

    George Zhao
    TechNet Community Support


    Please mark the reply as an answer if you find it is helpful.

    If you have feedback for TechNet Support, contact tnmff@microsoft.com.
    Friday, December 18, 2015 6:42 AM
  • Thanks for the reply and the tips on the forum George.

    I use a "Find Duplicates query" in MS Access for the table that I want, the query compare the columns below in each row:

    Workstation | Detected Object | File | Detection Time | Result

    Ignore the DATE/MONTH column please.

    Saturday, December 19, 2015 2:42 AM
  • The following is a simple example of a 'delete' query which will duplicate all rows bar one of those rows with duplicate LastName values:

    DELETE *
    FROM Contacts AS C1
    WHERE ContactID <>
         (SELECT MAX(ContactID)
          FROM Contacts AS C2
          WHERE C2.LastName = C1.LastName);

    It works by virtue of the subquery restricting the outer query to those rows per LastName value apart from those with the highest (MX) value of the primary key ContactID.  You could easily correlate the sbquery on multiple columns by means of AND opeartions in the WHERE clause, e.g if the duplication is on first AND last names:

    DELETE *
    FROM Contacts AS C1
    WHERE ContactID <>
         (SELECT MAX(ContactID)
          FROM Contacts AS C2
          WHERE C2.LastName = C1.LastName
          AND C2.FirstName = C1.FirstName);

    Ken Sheridan, Stafford, England

    Saturday, December 19, 2015 12:55 PM
  • You might consider changing the primary key on your table to prevent future duplicates. Presumably some combination of columns should be unique- that's how you are identifying duplicates. So make that your PK and eliminate the unique id number. Maybe (InfectionDateTime, Workstation, File, DetectedObject)? Or if other relationships in your database use the unique id, you could add a unique index on the appropriate columns. Access would still prevent any duplicate values in that index, even if it's not the PK.

    Paul

    Saturday, December 19, 2015 2:05 PM