Duplicate Records - Table Alias' and Joins
-
Thursday, August 27, 2009 3:27 PM
Hi All... Please consider the following tables...
Messages
MessageId MsgSequenceNumber MessageContentId
58192 239657903 1
58193 239657903 2
58194 239657903 3
MessageContents
MessageContentId MessageContent
1 xyzzy
2 ------
3 xyzzy
Think of this table having more records with different MsgSequenceNumbers. This field is used to indicate that these records have something in common. When a query results in one or more records just based on some search criteria, it's to also include those records that have something in common - same MsgSequenceNumber. Here's the query I'm working with...SELECT M2.MessageId, M2.MsgSequenceNumber, M2.MessageContentId FROM dbo.Messages AS M1 INNER JOIN dbo.Messages AS M2 ON M2.MsgSequenceNumber = M1.MsgSequenceNumber INNER JOIN dbo.MessageContents AS MC1 ON M1.MessageContentId = MC1.MessageContentId INNER JOIN dbo.MessageContents AS MC2 ON M2.MessageContentId = MC2.MessageContentId WHERE (CONVERT(varchar, GETDATE(), 101) = CONVERT(varchar, M1.LogDateTime, 101)) AND (MC1.MessageContent LIKE '%xyzzy%')I learned the use of the M1 and M2 from this forum - thanks. It's that piece that includes those records with the same MsgSequenceNumber. And it works great when the WHERE involves columns from the Messages table - that is WHERE M1.Something = Something... But when I started to include the MessageContents table via INNER JOIN I found I was getting the wrong Content - got the same Content for all records who should have had different Content...
I found if I applied that "trick" to the MessageContents table also, it worked. But now I'm getting duplicate records. That query results with the following...
58192 239657903 58236
58192 239657903 58236
58193 239657903 58237
58193 239657903 58237
58194 239657903 58238
58194 239657903 58238
Any thoughts on how to get rid of the duplicates? (I hope this isnt too confusing...)
-- Curt
All Replies
-
Thursday, August 27, 2009 3:39 PMModeratorDid you try "SELECT DISTINCT ..."?
AMB -
Thursday, August 27, 2009 3:42 PMThis is a little difficult to understand...why are you joining messages and contents twice?
-
Thursday, August 27, 2009 4:05 PMModeratorJeez... XYZZY... I haven't seen that word in almost 30 years: http://en.wikipedia.org/wiki/Colossal_Cave_Adventure
I'm also a little fuzzy on what you're looking for. Can you post more sample data and re-post your query and sample output? The query you posted has M2.MessageContentID as the 3rd column, but your sample output seems to show another MessageID value rather than a MessageContentID value, but I may just not know your data that well.
--Brad (My Blog) -
Friday, August 28, 2009 3:39 PM
Thanks, Brad... XYZZY - Yeah, well... I guess I AM dating myself a little. Thanks for reminding me where I got that from. Least I didn't say Hello World.
Sorry about the confusion - I guess I did doctor it a little before posting. Two tables of interest are called Messages and MessageContents. Messages contains pretty much properties of a message plus an "Id" of a record in MessageContents.
There are groups of messages that are somewhat related. Messages that are related to each other will have the same value in the Messages.MsgSequenceNumber column.
We have some C# that generates SQL queries based on some "search" dialog - the user picks and chooses different properties, etc. using dropdowns, checkboxes, radio buttons, etc. We're to return not only records that match the user's search criteria, but also any other records (messages) that are related - records that share the same value in Messages.MsgSequenceNumber.
The queries below employ a techique I learned here on this forum to include those "related" records. This involves the use of aliasing and the tables "M1" and "M2" below in the queries.
Here's a query showing the basic data with which we'll be working - note it includes data from Messages and MessageContents.
SELECT TOP (100) PERCENT dbo.Messages.LogDateTime, dbo.Messages.MessageId, dbo.MessageContents.MessageContent, dbo.Messages.MsgSequenceNumber, dbo.Messages.MessageKey FROM dbo.Messages INNER JOIN dbo.MessageContents ON dbo.Messages.MessageContentId = dbo.MessageContents.MessageContentId WHERE (dbo.Messages.MessageId = 59408) OR (dbo.Messages.MessageId = 59409) OR (dbo.Messages.MessageId = 59410) OR (dbo.Messages.MessageId = 59487) OR (dbo.Messages.MessageId = 59488) OR (dbo.Messages.MessageId = 59489) OR (dbo.Messages.MessageId = 59532) OR (dbo.Messages.MessageId = 59533) OR (dbo.Messages.MessageId = 59534) ORDER BY dbo.Messages.MessageId8/28/2009 8:57:23 AM 59408 DQ.DQ 249300617 DQ 8/28/2009 8:57:24 AM 59409 ***** 08/28/2009 08:53:34 Message Accepted 249300617 NULL 8/28/2009 8:57:25 AM 59410 MRI-7619399NCIC 5261 08/28/2009 08:53:341 249300617 NULL 8/28/2009 9:41:58 AM 59487 DQ. 249300785 DQ 8/28/2009 9:41:59 AM 59488 ***** 08/28/2009 09:38:09 Message Accepted 249300785 NULL 8/28/2009 9:41:59 AM 59489 MRI-7619399NCIC 5261 08/28/2009 09:38:091234 249300785 NULL 8/28/2009 10:24:13 AM 59532 DQ.PA1234567.TXTOLN/XYZZY 249300916 DQ 8/28/2009 10:24:13 AM 59533 ***** 08/28/2009 10:20:25 Message Accepted 249300916 NULL 8/28/2009 10:24:14 AM 59534 MRI-7619399NCIC 5261 XYZZY 08/28/2009 10:20:251234 249300916 NULL
When I use queries that use columns from the M1 table, everything seems to work ok. I get the records that meet criteria and their "related" records. For instance when I query for records where MessageKey is "DQ"
SELECT CONVERT(varchar, M2.LogDateTime, 120) AS LogDateTime, M2.MessageId, MC1.MessageContent, M2.MsgSequenceNumber, M2.MessageKey FROM dbo.Messages AS M1 INNER JOIN dbo.Messages AS M2 ON M2.MsgSequenceNumber = M1.MsgSequenceNumber INNER JOIN dbo.MessageContents AS MC1 ON MC1.MessageContentId = M2.MessageContentId WHERE (CONVERT(varchar, GETDATE(), 101) = CONVERT(varchar, M1.LogDateTime, 101)) AND (M1.MessageKey = 'DQ') AND (M1.UserId = 88) AND (M1.StationId = 70)2009-08-28 08:57:23 59408 DQ.DQ 249300617 DQ 2009-08-28 08:57:24 59409 ***** 08/28/2009 08:53:34Message Accepted 249300617 NULL 2009-08-28 08:57:25 59410 MRI-7619399NCIC 5261 08/28/2009 08:53:341 249300617 NULL 2009-08-28 09:41:58 59487 DQ. 249300785 DQ 2009-08-28 09:41:59 59488 ***** 08/28/2009 09:38:09Message Accepted 249300785 NULL 2009-08-28 09:41:59 59489 MRI-7619399NCIC 5261 08/28/2009 09:38:091234 249300785 NULL 2009-08-28 10:24:13 59532 DQ.PA1234567.TXTOLN/XYZZY 249300916 DQ 2009-08-28 10:24:13 59533 ***** 08/28/2009 10:20:25Message Accepted 249300916 NULL 2009-08-28 10:24:14 59534 MRI-7619399NCIC 5261 XYZZY 08/28/2009 10:20:251234 249300916 NULL
Note that the table MessageContents (MC1) is joined to the Messages table. When I take that same query and instead of searching for a column from the Messages (M1) table I search for somethign in the MessageContents (MC1) table, it doesnt work as desired. Insteat of getting messages "related" to the found records, but it's duplicating them. Here's we're searching for "XYZZY" to appear in the MessageContents table.
SELECT CONVERT(varchar, M2.LogDateTime, 120) AS LogDateTime, M2.MessageId, MC1.MessageContent, M2.MsgSequenceNumber, M1.MessageKey FROM dbo.Messages AS M1 INNER JOIN dbo.Messages AS M2 ON M2.MsgSequenceNumber = M1.MsgSequenceNumber INNER JOIN dbo.MessageContents AS MC1 ON MC1.MessageContentId = M2.MessageContentId WHERE (CONVERT(varchar, GETDATE(), 101) = CONVERT(varchar, M1.LogDateTime, 101)) AND (MC1.MessageContent LIKE '%xyzzy%') AND (M1.UserId = 88) AND (M1.StationId = 70)
2009-08-28 10:24:13 59532 DQ.PA1234567.TXTOLN/XYZZY 249300916 DQ 2009-08-28 10:24:13 59532 DQ.PA1234567.TXTOLN/XYZZY 249300916 NULL 2009-08-28 10:24:13 59532 DQ.PA1234567.TXTOLN/XYZZY 249300916 NULL 2009-08-28 10:24:14 59534 MRI-7619399NCIC 5261 XYZZY 08/28/2009 10:20:251234 249300916 DQ 2009-08-28 10:24:14 59534 MRI-7619399NCIC 5261 XYZZY 08/28/2009 10:20:251234 249300916 NULL 2009-08-28 10:24:14 59534 MRI-7619399NCIC 5261 XYZZY 08/28/2009 10:20:251234 249300916 NULL
I hope this is a little more clear. The data do match the queries! Not sure if you're figured out yet, but I'm old C/C++/C# mechanic. I know enough SQL to get myself into trouble. BTW, this forum has been a huge help with that - it's the best forum under MSDN's umbrella in my opinion.
Curt

