none
Duplicate Records - Table Alias' and Joins

    Question

  • Hi All...  Please consider the following tables...

    Messages
    MessageId  MsgSequenceNumber  MessageContentId
    58192         239657903                         1
    58193         239657903                         2
    58194         239657903                         3

    MessageContents
    MessageContentId      MessageContent
    1                               xyzzy  
    2                               ------ 
    3                               xyzzy

    Think of this table having more records with different MsgSequenceNumbers.  This field is used to indicate that these records have something in common.  When a query results in one or more records just based on some search criteria, it's to also include those records that have something in common - same MsgSequenceNumber.  Here's the query I'm working with...

    SELECT        M2.MessageId, M2.MsgSequenceNumber, M2.MessageContentId
    FROM            dbo.Messages AS M1 INNER JOIN
                             dbo.Messages AS M2 ON M2.MsgSequenceNumber = M1.MsgSequenceNumber INNER JOIN
                             dbo.MessageContents AS MC1 ON M1.MessageContentId = MC1.MessageContentId INNER JOIN
                             dbo.MessageContents AS MC2 ON M2.MessageContentId = MC2.MessageContentId
    WHERE        (CONVERT(varchar, GETDATE(), 101) = CONVERT(varchar, M1.LogDateTime, 101)) AND (MC1.MessageContent LIKE '%xyzzy%')

    I learned the use of the M1 and M2 from this forum - thanks.  It's that piece that includes those records with the same MsgSequenceNumber.  And it works great when the WHERE involves columns from the Messages table - that is WHERE M1.Something = Something...  But when I started to include the MessageContents table via INNER JOIN I found I was getting the wrong Content - got the same Content for all records who should have had different Content...

    I found if I applied that "trick" to the MessageContents table also, it worked.  But now I'm getting duplicate records.  That query results with the following...

    58192 239657903 58236
    58192 239657903 58236
    58193 239657903 58237
    58193 239657903 58237
    58194 239657903 58238
    58194 239657903 58238

    Any thoughts on how to get rid of the duplicates?  (I hope this isnt too confusing...)

    -- Curt



    Thursday, August 27, 2009 3:27 PM

All replies

  • Did you try "SELECT DISTINCT ..."?


    AMB
    Thursday, August 27, 2009 3:39 PM
  • This is a little difficult to understand...why are you joining messages and contents twice?
    Thursday, August 27, 2009 3:42 PM
  • Jeez... XYZZY... I haven't seen that word in almost 30 years:  http://en.wikipedia.org/wiki/Colossal_Cave_Adventure

    I'm also a little fuzzy on what you're looking for.  Can you post more sample data and re-post your query and sample output?  The query you posted has M2.MessageContentID as the 3rd column, but your sample output seems to show another MessageID value rather than a MessageContentID value, but I may just not know your data that well.

    --Brad (My Blog)
    Thursday, August 27, 2009 4:05 PM
  • Thanks, Brad...   XYZZY - Yeah, well...  I guess I AM dating myself a little.  Thanks for reminding me where I got that from.  Least I didn't say Hello World.

    Sorry about the confusion - I guess I did doctor it a little before posting.  Two tables of interest are called Messages and MessageContents.  Messages contains pretty much properties of a message plus an "Id" of a record in MessageContents.

    There are groups of messages that are somewhat related.  Messages that are related to each other will have the same value in the Messages.MsgSequenceNumber column.

    We have some C# that generates SQL queries based on some "search" dialog - the user picks and chooses different properties, etc. using dropdowns, checkboxes, radio buttons, etc.  We're to return not only records that match the user's search criteria, but also any other records (messages) that are related - records that share the same value in Messages.MsgSequenceNumber.

    The queries below employ a techique I learned here on this forum to include those "related" records.  This involves the use of aliasing and the tables "M1" and "M2" below in the queries.

    Here's a query showing the basic data with which we'll be working - note it includes data from Messages and MessageContents.

    SELECT        TOP (100) PERCENT dbo.Messages.LogDateTime, dbo.Messages.MessageId, dbo.MessageContents.MessageContent, 
                             dbo.Messages.MsgSequenceNumber, dbo.Messages.MessageKey
    FROM            dbo.Messages INNER JOIN
                             dbo.MessageContents ON dbo.Messages.MessageContentId = dbo.MessageContents.MessageContentId
    WHERE        (dbo.Messages.MessageId = 59408) OR
                             (dbo.Messages.MessageId = 59409) OR
                             (dbo.Messages.MessageId = 59410) OR
                             (dbo.Messages.MessageId = 59487) OR
                             (dbo.Messages.MessageId = 59488) OR
                             (dbo.Messages.MessageId = 59489) OR
                             (dbo.Messages.MessageId = 59532) OR
                             (dbo.Messages.MessageId = 59533) OR
                             (dbo.Messages.MessageId = 59534)
    ORDER BY dbo.Messages.MessageId
    8/28/2009 8:57:23 AM	59408	DQ.DQ						249300617		DQ
    8/28/2009 8:57:24 AM	59409	***** 08/28/2009 08:53:34 Message Accepted		249300617		NULL
    8/28/2009 8:57:25 AM	59410	MRI-7619399NCIC 5261 08/28/2009 08:53:341		249300617		NULL
    8/28/2009 9:41:58 AM	59487	DQ.						249300785		DQ
    8/28/2009 9:41:59 AM	59488	***** 08/28/2009 09:38:09 Message Accepted		249300785		NULL
    8/28/2009 9:41:59 AM	59489	MRI-7619399NCIC 5261 08/28/2009 09:38:091234		249300785		NULL
    8/28/2009 10:24:13 AM	59532	DQ.PA1234567.TXTOLN/XYZZY				249300916		DQ
    8/28/2009 10:24:13 AM	59533	***** 08/28/2009 10:20:25 Message Accepted		249300916		NULL
    8/28/2009 10:24:14 AM	59534	MRI-7619399NCIC 5261 XYZZY 08/28/2009 10:20:251234	249300916		NULL
    When I use queries that use columns from the M1 table, everything seems to work ok.  I get the records that meet criteria and their "related" records.  For instance when I query for records where MessageKey is "DQ"

    SELECT        CONVERT(varchar, M2.LogDateTime, 120) AS LogDateTime, M2.MessageId, MC1.MessageContent, M2.MsgSequenceNumber, M2.MessageKey
    FROM            dbo.Messages AS M1 INNER JOIN
                             dbo.Messages AS M2 ON M2.MsgSequenceNumber = M1.MsgSequenceNumber INNER JOIN
                             dbo.MessageContents AS MC1 ON MC1.MessageContentId = M2.MessageContentId
    WHERE        (CONVERT(varchar, GETDATE(), 101) = CONVERT(varchar, M1.LogDateTime, 101)) AND (M1.MessageKey = 'DQ') AND (M1.UserId = 88) AND (M1.StationId = 70)
    2009-08-28 08:57:23	59408	DQ.DQ						249300617		DQ
    2009-08-28 08:57:24	59409	***** 08/28/2009 08:53:34Message Accepted		249300617		NULL
    2009-08-28 08:57:25	59410	MRI-7619399NCIC 5261 08/28/2009 08:53:341		249300617		NULL
    2009-08-28 09:41:58	59487	DQ.						249300785		DQ
    2009-08-28 09:41:59	59488	***** 08/28/2009 09:38:09Message Accepted		249300785		NULL
    2009-08-28 09:41:59	59489	MRI-7619399NCIC 5261 08/28/2009 09:38:091234		249300785		NULL
    2009-08-28 10:24:13	59532	DQ.PA1234567.TXTOLN/XYZZY				249300916		DQ
    2009-08-28 10:24:13	59533	***** 08/28/2009 10:20:25Message Accepted		249300916		NULL
    2009-08-28 10:24:14	59534	MRI-7619399NCIC 5261 XYZZY 08/28/2009 10:20:251234	249300916		NULL
    Note that the table MessageContents (MC1) is joined to the Messages table.  When I take that same query and instead of searching for a column from the Messages (M1) table I search for somethign in the MessageContents (MC1) table, it doesnt work as desired.  Insteat of getting messages "related" to the found records, but it's duplicating them.   Here's we're searching for "XYZZY" to appear in the MessageContents table.

    SELECT        CONVERT(varchar, M2.LogDateTime, 120) AS LogDateTime, M2.MessageId, MC1.MessageContent, M2.MsgSequenceNumber, M1.MessageKey
    FROM            dbo.Messages AS M1 INNER JOIN
                             dbo.Messages AS M2 ON M2.MsgSequenceNumber = M1.MsgSequenceNumber INNER JOIN
                             dbo.MessageContents AS MC1 ON MC1.MessageContentId = M2.MessageContentId
    WHERE        (CONVERT(varchar, GETDATE(), 101) = CONVERT(varchar, M1.LogDateTime, 101)) AND (MC1.MessageContent LIKE '%xyzzy%') AND (M1.UserId = 88) AND 
                             (M1.StationId = 70)

    2009-08-28 10:24:13	59532	DQ.PA1234567.TXTOLN/XYZZY				249300916		DQ
    2009-08-28 10:24:13	59532	DQ.PA1234567.TXTOLN/XYZZY				249300916		NULL
    2009-08-28 10:24:13	59532	DQ.PA1234567.TXTOLN/XYZZY				249300916		NULL
    2009-08-28 10:24:14	59534	MRI-7619399NCIC 5261 XYZZY 08/28/2009 10:20:251234	249300916		DQ
    2009-08-28 10:24:14	59534	MRI-7619399NCIC 5261 XYZZY 08/28/2009 10:20:251234	249300916		NULL
    2009-08-28 10:24:14	59534	MRI-7619399NCIC 5261 XYZZY 08/28/2009 10:20:251234	249300916		NULL
    I hope this is a little more clear.  The data do match the queries!  Not sure if you're figured out yet, but I'm old C/C++/C# mechanic.  I know enough SQL to get myself into trouble.  BTW, this forum has been a huge help with that - it's the best forum under MSDN's umbrella in my opinion.

    Curt

    Friday, August 28, 2009 3:39 PM