none
How does U-SQL do the query similar to T-SQL "WHERE MyField in SELECT * ..."

    Question

  • @source:

    |CallID|Caller|Called|
    |1     |aa    |bb    |
    |1     |      |234   |
    |1     |567   |888   |
    |2     |cc    |ee    |
    |2     |      |dd    |
    |2     |555   |888   |
    |3     |99    |bb    |
    |3     |      |234   |
    |3     |666   |888   |
    |4     |aa    |bb    |
    |4     |      |234   |
    |4     |567   |888   |
    |5     |33    |bb    |
    |5     |      |234   |
    |5     |567   |888   |

    It needs to get all the records (based on the CallID field) who has the Caller field value equal to "aa".

    Output:

    |CallID|Caller|Called|
    |1     |aa    |bb    |
    |1     |      |234   |
    |1     |567   |888   |
    |4     |aa    |bb    |
    |4     |      |234   |
    |4     |567   |888   |


    I am thinking about U-SQL could be something like:

    @callids= SELECT CallID FROM @source
    WHERE CallerID=="aa";


    @result= SELECT * FROM @source
    WHERE CallID in @callIds;

    In the above statement, the "WHERE CallID in @callIds" does not looks like a valid U-SQL. So how to implement this query correctly?

    Thanks.

    Roger

    Wednesday, January 11, 2017 1:35 AM

Answers

  • That's basically a self-join:

    @input = SELECT * FROM (VALUES
     (1     ,"aa","bb"),
     (1     ,(string)null,"234"),
     (1     ,"567","888"),
     (2     ,"cc","ee"),
     (2     ,(string)null,"dd"),
     (2     ,"555","888"),
     (3     ,"99","bb"),
     (3     ,(string)null,"234"),
     (3     ,"666","888"),
     (4     ,"aa","bb"),
     (4     ,(string)null,"234"),
     (4     ,"567","888"),
     (5     ,"33","bb"),
     (5     ,(string)null,"234"),
     (5     ,"567","888")) AS T(CallID,Caller,Called);
    
    @aa_callids =
    SELECT DISTINCT CallID
    FROM @input
    WHERE Caller == "aa";
    
    @result =
    SELECT @input.*
    FROM @input
         JOIN
             @aa_callids
         ON @input.CallID == @aa_callids.CallID;
    
    OUTPUT @result
    TO "/output/msdn.csv"
    USING Outputters.Csv();
    


    Michael Rys

    • Marked as answer by Roger Rong NZ Wednesday, January 11, 2017 2:29 AM
    Wednesday, January 11, 2017 2:08 AM
    Moderator

All replies

  • That's basically a self-join:

    @input = SELECT * FROM (VALUES
     (1     ,"aa","bb"),
     (1     ,(string)null,"234"),
     (1     ,"567","888"),
     (2     ,"cc","ee"),
     (2     ,(string)null,"dd"),
     (2     ,"555","888"),
     (3     ,"99","bb"),
     (3     ,(string)null,"234"),
     (3     ,"666","888"),
     (4     ,"aa","bb"),
     (4     ,(string)null,"234"),
     (4     ,"567","888"),
     (5     ,"33","bb"),
     (5     ,(string)null,"234"),
     (5     ,"567","888")) AS T(CallID,Caller,Called);
    
    @aa_callids =
    SELECT DISTINCT CallID
    FROM @input
    WHERE Caller == "aa";
    
    @result =
    SELECT @input.*
    FROM @input
         JOIN
             @aa_callids
         ON @input.CallID == @aa_callids.CallID;
    
    OUTPUT @result
    TO "/output/msdn.csv"
    USING Outputters.Csv();
    


    Michael Rys

    • Marked as answer by Roger Rong NZ Wednesday, January 11, 2017 2:29 AM
    Wednesday, January 11, 2017 2:08 AM
    Moderator
  • Thanks Michael. That is exactly I am looking after.

    Roger

    Wednesday, January 11, 2017 2:35 AM