Interpretation of JOIN is not clear
-
Thursday, July 12, 2012 3:42 PM
I have a query that returns different results in StreamInsight than it does when run in LINQPad (in the default StreamInsight context).
I want to get a better understanding of what the expected results should be.
This is my query when run in LINQPad:
string SourceFile = @"C:\TextSource1.txt";
List<string> Lines = new List<string>();
char delimiter = ',';

void Main()
{
    using (StreamReader sr = new StreamReader(this.SourceFile))
    {
        while (!sr.EndOfStream)
        {
            Lines.Add(sr.ReadLine());
        }
    }

    var TextFileSource1 = from e in Lines
                          select new
                          {
                              Datetime = e.Split(delimiter)[0],
                              FQN = e.Split(delimiter)[1],
                              Value = e.Split(delimiter)[2],
                              Quality = e.Split(delimiter)[3],
                              Workcell = e.Split(delimiter)[4]
                          };

    // Query Template
    var q = from e in TextFileSource1
            where e.FQN.ToString().Trim() == "EventTag2"
            from e2 in TextFileSource1
            where e2.Value.ToString().Trim() == "2.0"
            select new
            // { FQN=e.FQN, Value=e.Value, Datetime = e.Datetime };
            { FQN=e.FQN, Value=e.Value, Datetime = e.Datetime, CountByFQN=e2.FQN };

    Console.WriteLine(q);
}

This is the input data:
Datetime, FQN, Value, Quality, Workcell
2010-11-01 01:10:00, EventTag2, 1.0, 192, Cell1
2010-11-01 02:10:00, EventTag2, 2.0, 192, Cell1
2010-11-01 02:30:00, EventTag1, 0, 192, Cell1

I get this result from LINQPad:
FQN        Value  Datetime        CountByFQN
EventTag2  1.0    11/1/2010 1:10  EventTag2
EventTag2  2.0    11/1/2010 2:10  EventTag2

When I run it in my StreamInsight app I get one row (the second one above).
So now I don't really know which is correct. I'm guessing the StreamInsight result is correct, but I would like to know why the LINQPad one is different.
All Replies
-
Saturday, July 14, 2012 6:57 PM
Usually it has to do with the timing of the CTIs and how they are related to each other. In particular, because reference streams usually start up a couple of milliseconds after the data streams, especially in a StreamInsight application, the initial reference events won't get enqueued until slightly after the initial data events. This is especially true if you are using DateTimeOffset.Now for the start time of the event.
I would start by recording the queries in both LinqPad and the StreamInsight app. Since the events that are of the most interest are at startup (and you're in LinqPad), you'll need to use trace.cmd to start and stop the traces.
DevBiker (aka J Sawyer)
Microsoft MVP - Sql Server (StreamInsight)
If I answered your question, please mark as answer.
If my post was helpful, please mark as helpful.
-
Monday, July 16, 2012 10:16 AM
Which of the two results would you say is correct, according to how we believe the join should be working?
The one result (with just 1 row) seems to be an AND operator joining the two predicates whereas the LINQ expression seems to be an OR operator.
-
Tuesday, July 17, 2012 2:00 PM
Both. ;-)
I say that because they are likely both correct in the temporal context of the query.
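To make the "temporal context" concrete, here is a minimal sketch in plain LINQ to Objects. It does not use the StreamInsight API at all. It assumes each input row becomes a point event stamped with its Datetime column, and it approximates a temporal join as an ordinary cross join plus a requirement that the two timestamps coincide. The `Row` type and the `CrossJoin` and `TemporalJoin` methods are hypothetical names introduced only for this illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class JoinSketch
{
    // Hypothetical stand-in for one parsed line of the input file.
    public sealed class Row
    {
        public DateTime Time { get; }
        public string FQN { get; }
        public string Value { get; }
        public Row(DateTime time, string fqn, string value)
        { Time = time; FQN = fqn; Value = value; }
    }

    // The three data rows from the question (header line omitted).
    public static readonly List<Row> Rows = new List<Row>
    {
        new Row(new DateTime(2010, 11, 1, 1, 10, 0), "EventTag2", "1.0"),
        new Row(new DateTime(2010, 11, 1, 2, 10, 0), "EventTag2", "2.0"),
        new Row(new DateTime(2010, 11, 1, 2, 30, 0), "EventTag1", "0"),
    };

    // Plain LINQ: two from clauses form a cross join of the two filtered sets.
    public static List<Row> CrossJoin() =>
        (from e in Rows
         where e.FQN == "EventTag2"
         from e2 in Rows
         where e2.Value == "2.0"
         select e).ToList();

    // Assumed model of a temporal join over point events:
    // a pair is produced only when both events share the same instant.
    public static List<Row> TemporalJoin() =>
        (from e in Rows
         where e.FQN == "EventTag2"
         from e2 in Rows
         where e2.Value == "2.0" && e.Time == e2.Time
         select e).ToList();

    public static void Main()
    {
        Console.WriteLine("Cross join rows:    " + CrossJoin().Count);
        Console.WriteLine("Temporal join rows: " + TemporalJoin().Count);
    }
}
```

Under these assumptions the cross join pairs both EventTag2 rows with the single row whose Value is 2.0, while the timestamp requirement keeps only the 02:10 row, which is paired with itself. That would account for two rows in one environment and one row in the other without either being an AND or an OR of the two predicates.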
Try creating the reference streams as intervals, rather than points. When enqueuing as intervals, set your start date to something absurdly far in the past ... I'll use 1/1/1900 as a start ... and an end that's absurdly far in the future ... again, I'll use 1/1/2150. Or even DateTimeOffset.MaxValue. DO NOT use DateTimeOffset.MinValue ... sometimes that causes issues if you get an underflow.
From there, when you import the CTIs from the data stream, set the AdvanceTimePolicy to Adjust. This will clip the intervals to the first CTI in the data stream. One more thing that's important to remember in this scenario ... you'll want to have your first CTI enqueued before your first event or the initial events won't overlap with the reference events.
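The effect of enqueuing the reference side as a long interval can be sketched without StreamInsight. The model below is an assumption made for illustration: every event is valid over a half-open interval [Start, End), a point event is an interval one tick long, and a join produces a pair only when the two intervals overlap. `Evt`, `Point`, `Overlaps` and `CountJoined` are hypothetical names, and the sketch does not model CTIs or the clipping done by AdvanceTimePolicy.Adjust.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class IntervalSketch
{
    // Hypothetical event model: a payload valid over [Start, End).
    public sealed class Evt
    {
        public DateTime Start { get; }
        public DateTime End { get; }
        public string Payload { get; }
        public Evt(DateTime start, DateTime end, string payload)
        { Start = start; End = end; Payload = payload; }
    }

    // A point event is modeled as an interval exactly one tick long.
    public static Evt Point(DateTime t, string payload) =>
        new Evt(t, t.AddTicks(1), payload);

    // Two half-open intervals overlap when each starts before the other ends.
    public static bool Overlaps(Evt a, Evt b) =>
        a.Start < b.End && b.Start < a.End;

    // Counts data events that overlap at least one pairing with a reference event.
    public static int CountJoined(IEnumerable<Evt> data, IEnumerable<Evt> reference) =>
        (from d in data
         from r in reference
         where Overlaps(d, r)
         select d).Count();

    public static void Main()
    {
        var data = new List<Evt>
        {
            Point(new DateTime(2010, 11, 1, 1, 10, 0), "EventTag2 1.0"),
            Point(new DateTime(2010, 11, 1, 2, 10, 0), "EventTag2 2.0"),
        };

        // Reference side as a point event at 02:10.
        var asPoint = new List<Evt> { Point(new DateTime(2010, 11, 1, 2, 10, 0), "ref") };

        // Reference side as an interval from 1/1/1900 to 1/1/2150.
        var asInterval = new List<Evt>
        {
            new Evt(new DateTime(1900, 1, 1), new DateTime(2150, 1, 1), "ref"),
        };

        Console.WriteLine("Joined with point reference:    " + CountJoined(data, asPoint));
        Console.WriteLine("Joined with interval reference: " + CountJoined(data, asInterval));
    }
}
```

With a point reference only the data event at the same instant joins; with the wide interval both data events join, which is the behavior the interval suggestion is aiming for.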
DevBiker (aka J Sawyer)
Microsoft MVP - Sql Server (StreamInsight)
If I answered your question, please mark as answer.
If my post was helpful, please mark as helpful.
- Marked As Answer by Iric Wen (Moderator), Monday, July 23, 2012 5:47 AM
-
Tuesday, July 24, 2012 1:51 PM
I appreciate that, when I look at the situation, I can interpret both results as possibly correct, but I would expect the LINQ engine to interpret the particular syntax I am using one way (AND operator) or the other (OR operator). As such, I would expect only one of the results to be correct from the language perspective.
-
Tuesday, July 24, 2012 2:43 PM
Again, it's not just the LINQ semantics that you have to consider but the time dimension.
To get a clearer idea of this, use trace.cmd before you run your LinqPad query. That'll give you a trace of your entire LinqPad session.
DevBiker (aka J Sawyer)
Microsoft MVP - Sql Server (StreamInsight)
If I answered your question, please mark as answer.
If my post was helpful, please mark as helpful.


