none
Joining two t-sql queries into one result set

    Question

  • Hi all

    I have two queries here

    SELECT

    [PROCESS-DT] , SUM ( TOTAL ) AS TotalInboundTransactionsPerDay FROM dbo . Vendor_production_monitoring

    GROUP

    BY [PROCESS-DT]

    ORDER

    BY RIGHT( [PROCESS-DT] , 5 ) DESC
     
     ---------------------------------------------------------------------------
     
    SELECT
    RECEIVE , COUNT ( RECEIVE ) AS  Suspendedrecords FROM dbo . tbl_Daily_Suspense

    GROUP

    BY RECEIVE
    ORDER
    BY RIGHT( RECEIVE , 5 ) DESC.


    -------------------------------------------------------------------------------------------

    The out put for these two queries are as follows

    PROCESS-DT  InboundRecords
    2009-12-29      723054
    2009-12-28      638061
    2009-12-20      287479
    2009-12-19      493505
    2009-12-18      750736
    2009-12-17      822299
    2009-12-16      630947
    2009-12-15      662530
     
     
    RECEIVE          Suspensed Records
    20091228               3288
    20091227               2503
    20091226               1191
    20091225               2275
    20091224               2018
    20091223               2394
    20091222               2321
    20091221               1655
    20091220               184
    20091219               1815
     
     
    Above two are the outputs from two queries. I need another column which is Transaction Yeild %age. that is
     
    (Inbound records - Suspended Records) / Inbound records for last 30 days.

    And i dont have any common column in two tables to join.

    l
    So now i need a query which displays all the four columns in one result set and a 5th column calculating the Transaction Yeild %age


    Thanks
    Sai
    Thursday, December 31, 2009 5:12 AM

Answers

  • <<So now i need a query which displays all the four columns in one result set and a 5th column calculating the Transaction Yeild %age

    How can you calculate yield on unrelated data?  The answer is you cant and should not.  The result of the yield calculation will be completely bogus and wrong.  As for the row_number() recommendations, I highly recommend that you do not use row_number for this.  Row_Number is going to be a hack to solving this problem because you have not properly identified the business rules.  Row_Number with an order by of select 1 can never gaurantee the order of the data because a constant value is used, thus you always risk incorrect joins, which mean incorrect calculations.  You need to have a column to correlate the data; otherwise, you cannot calculate yield.

    With all of that said, it seems to me the common factor here is Date.  You have a process and a recieve date. You need to join the two tables on the date column and then the yield makes some sense.


    DECLARE @t1 TABLE(
    PROCESS_DT datetime,
    InboundRecords int
    );
    
    
    INSERT INTO @t1 VALUES ('2009-12-29',723054);
    INSERT INTO @t1 VALUES ('2009-12-28',638061);
    INSERT INTO @t1 VALUES ('2009-12-20',287479);
    INSERT INTO @t1 VALUES ('2009-12-19',493505);
    INSERT INTO @t1 VALUES ('2009-12-18',750736);
    INSERT INTO @t1 VALUES ('2009-12-17',822299);
    INSERT INTO @t1 VALUES ('2009-12-16',630947);
    INSERT INTO @t1 VALUES ('2009-12-15',662530);
     
    DECLARE @t2 TABLE(
    RECEIVE_dt DATETIME,
    SuspendedRecords INT
    );
    
    INSERT INTO @t2 VALUES ('20091228',3288);
    INSERT INTO @t2 VALUES ('20091227',2503);
    INSERT INTO @t2 VALUES ('20091226',1191);
    INSERT INTO @t2 VALUES ('20091225',2275);
    INSERT INTO @t2 VALUES ('20091224',2018);
    INSERT INTO @t2 VALUES ('20091223',2394);
    INSERT INTO @t2 VALUES ('20091222',2321);
    INSERT INTO @t2 VALUES ('20091221',1655);
    INSERT INTO @t2 VALUES ('20091220',184);
    INSERT INTO @t2 VALUES ('20091219',1815);
    
    SELECT *,(1.0*(Inboundrecords - SuspendedRecords)) / Inboundrecords 
    FROM @t1 t1
    INNER JOIN @t2 t2
    ON t1.[PROCESS_DT] = t2.receive_dt
    --WHERE some date filters

    http://jahaines.blogspot.com/
    • Marked as answer by Zongqing Li Wednesday, January 06, 2010 8:29 AM
    Thursday, December 31, 2009 4:28 PM
    Moderator

All replies

  • If there is no relationship between these two tables, then how do you expect the result look like? Can you provide an expected result with the sample you posted ?
       Or the Process Date= Receive Date?
    • Edited by sanoj av Thursday, December 31, 2009 5:24 AM
    Thursday, December 31, 2009 5:22 AM
  • The tables don't have to be related. You can do the following. Let me know if I'm missing anything.

    SELECT [PROCESS-DT]
    		,SUM(TOTAL) AS TotalInboundTransactionsPerDay
    		,RECEIVE 
    		,COUNT(RECEIVE) AS  Suspendedrecords 
    		,SUM(TOTAL) - COUNT(RECEIVE) / NULLIF(SUM(TOTAL),0) AS [ransaction Yeild]
    FROM dbo.Vendor_production_monitoring,dbo.tbl_Daily_Suspense
    GROUP BY [PROCESS-DT],RECEIVE
    ORDER BY RIGHT([PROCESS-DT],5) DESC


    Abdallah El-Chal, PMP, ITIL, MCTS
    Thursday, December 31, 2009 5:23 AM
  • Are Process-DT and Receive related somehow? They both look like dates, may be we need to join based on these two fields? If not, in SQL Server 2005 and up you may want to introduce a new field RowNum on the fly, e.g.
    ;with cte1 as (SELECT
    
    [PROCESS-DT] , SUM ( TOTAL ) AS TotalInboundTransactionsPerDay 
    FROM dbo . Vendor_production_monitoring GROUP
    
    BY [PROCESS-DT] , 
     
    cte2 as (
    SELECT
    RECEIVE , COUNT ( RECEIVE ) AS  SuspendedRecords 
    FROM dbo . tbl_Daily_Suspense GROUP
    
    BY RECEIVE ,
    
    cte3 as (select *, row_number() over (order by [Process-DT] DESC) as RowNum from cte1),
    cte4 as (select *, row_number() over(order by [Receive] DESC) as RowNum from cte2)
    
    select c1.*,
    (TotalInboundTransactionsPerDay- SuspendedRecords) 
    / NULLIF(TotalInboundTransactionsPerDay,0) 
    from cte3 c1 inner join cte4 c2 on c1.RowNum = c2.RowNum


    Premature optimization is the root of all evil in programming. (c) by Donald Knuth

    Naomi Nosonovsky, Sr. Programmer-Analyst

    My blog
    Thursday, December 31, 2009 5:27 AM
    Moderator
  • hi Abdshall

    thanks for ur response.
    But running this query will give me the cartesian product.?
    Thursday, December 31, 2009 5:52 AM
  • Thanks for ur response.

    I want to use order by in my query but cte will not allow order by clause.

    so how can i do it?
    Thursday, December 31, 2009 5:54 AM
  • CTE will let you use ORDER BY if you include TOP (100) with it.
    Abdallah El-Chal, PMP, ITIL, MCTS
    Thursday, December 31, 2009 5:55 AM
  • By using row_number() technique you're ordering the way you want. You can also order by your final select (using CTEs)

    See also the minor fixes I applied to my original query.


    Premature optimization is the root of all evil in programming. (c) by Donald Knuth

    Naomi Nosonovsky, Sr. Programmer-Analyst

    My blog
    Thursday, December 31, 2009 5:55 AM
    Moderator
  • If you don't want the cartesion product, you have to limit the rows being returned somehow so they match. One way to do that, as Naomi suggested, is by using a ranking function. Check out this example.

    DECLARE @T TABLE (ID INT,V VARCHAR(20))
    INSERT INTO @T 
    SELECT 1,'TEST1' UNION ALL SELECT 3,'TEST3' UNION ALL SELECT 5,'TEST5'
    DECLARE @U TABLE (I INT,M VARCHAR(20))
    INSERT INTO @U 
    SELECT 2,'TEST2' UNION ALL SELECT 4,'TEST4' UNION ALL SELECT 6,'TEST6'
    
    ;WITH CTE AS
    (
    SELECT TOP(100) SUM(ID) AS S
    		,V,ROW_NUMBER() OVER(ORDER BY (SELECT 1)) AS R1
    FROM @T
    GROUP BY V
    ORDER BY S
    ),CTE2 AS
    (
    SELECT TOP(100) COUNT(I) AS C
    		,M,ROW_NUMBER() OVER(ORDER BY (SELECT 1)) AS R2
    FROM @U
    GROUP BY M
    ORDER BY C
    )
    SELECT S,V,C,M
    FROM CTE JOIN 
    CTE2 ON CTE.R1 = CTE2.R2


    Abdallah El-Chal, PMP, ITIL, MCTS
    Thursday, December 31, 2009 6:08 AM
  • Thanks abdallah.
    Can u tell me how this works with my code.
    can u script it for me?
    Thursday, December 31, 2009 4:07 PM
  • Try the following. Please test it as I don't have any data to test on.

    ;WITH CTE As
    (
    SELECT TOP(100) SUM(TOTAL) AS TotalInboundTransactionsPerDay
    			,[PROCESS-DT],ROW_NUMBER() OVER(ORDER BY (SELECT 1)) AS R1
    FROM dbo . Vendor_production_monitoring
    GROUP BY [PROCESS-DT]
    ORDER BY RIGHT( [PROCESS-DT] , 5 ) DESC
    ),CTE2 AS
    ( 
    SELECT TOP(100) COUNT(RECEIVE) AS  Suspendedrecords
    		,RECEIVE,ROW_NUMBER() OVER(ORDER BY (SELECT 1)) AS R2
    FROM dbo.tbl_Daily_Suspense
    GROUP BY RECEIVE
    ORDER BY RIGHT( RECEIVE , 5 ) DESC
    )
    SELECT TotalInboundTransactionsPerDay
    		,[PROCESS-DT]
    		,Suspendedrecords
    		,RECEIVE
    		,TotalInboundTransactionsPerDay
    				-
    		Suspendedrecords
    				/
    		NULLIF(TotalInboundTransactionsPerDay,0) AS [Transaction Yeild]
    FROM CTE
    	JOIN CTE2 
    		ON CTE.R1 = CTE2.R2


    Abdallah El-Chal, PMP, ITIL, MCTS
    Thursday, December 31, 2009 4:17 PM
  • <<So now i need a query which displays all the four columns in one result set and a 5th column calculating the Transaction Yeild %age

    How can you calculate yield on unrelated data?  The answer is you cant and should not.  The result of the yield calculation will be completely bogus and wrong.  As for the row_number() recommendations, I highly recommend that you do not use row_number for this.  Row_Number is going to be a hack to solving this problem because you have not properly identified the business rules.  Row_Number with an order by of select 1 can never gaurantee the order of the data because a constant value is used, thus you always risk incorrect joins, which mean incorrect calculations.  You need to have a column to correlate the data; otherwise, you cannot calculate yield.

    With all of that said, it seems to me the common factor here is Date.  You have a process and a recieve date. You need to join the two tables on the date column and then the yield makes some sense.


    DECLARE @t1 TABLE(
    PROCESS_DT datetime,
    InboundRecords int
    );
    
    
    INSERT INTO @t1 VALUES ('2009-12-29',723054);
    INSERT INTO @t1 VALUES ('2009-12-28',638061);
    INSERT INTO @t1 VALUES ('2009-12-20',287479);
    INSERT INTO @t1 VALUES ('2009-12-19',493505);
    INSERT INTO @t1 VALUES ('2009-12-18',750736);
    INSERT INTO @t1 VALUES ('2009-12-17',822299);
    INSERT INTO @t1 VALUES ('2009-12-16',630947);
    INSERT INTO @t1 VALUES ('2009-12-15',662530);
     
    DECLARE @t2 TABLE(
    RECEIVE_dt DATETIME,
    SuspendedRecords INT
    );
    
    INSERT INTO @t2 VALUES ('20091228',3288);
    INSERT INTO @t2 VALUES ('20091227',2503);
    INSERT INTO @t2 VALUES ('20091226',1191);
    INSERT INTO @t2 VALUES ('20091225',2275);
    INSERT INTO @t2 VALUES ('20091224',2018);
    INSERT INTO @t2 VALUES ('20091223',2394);
    INSERT INTO @t2 VALUES ('20091222',2321);
    INSERT INTO @t2 VALUES ('20091221',1655);
    INSERT INTO @t2 VALUES ('20091220',184);
    INSERT INTO @t2 VALUES ('20091219',1815);
    
    SELECT *,(1.0*(Inboundrecords - SuspendedRecords)) / Inboundrecords 
    FROM @t1 t1
    INNER JOIN @t2 t2
    ON t1.[PROCESS_DT] = t2.receive_dt
    --WHERE some date filters

    http://jahaines.blogspot.com/
    • Marked as answer by Zongqing Li Wednesday, January 06, 2010 8:29 AM
    Thursday, December 31, 2009 4:28 PM
    Moderator
  • Mohanv,

    Adam is correct. I just added the calculation in the SELECT outside the CTE's but I didn't really think about the join. The ROW_NUMBER() will not guarantee that both result sets will be in the same order. You can use the ROW_NUMBER() if you just want to list the data from both result set, but don't use it to calculate the yield.



    Abdallah El-Chal, PMP, ITIL, MCTS
    Thursday, December 31, 2009 4:33 PM
  • I agree that the join based on date fields makes much more sense. I also suggested it in my original message.
    Premature optimization is the root of all evil in programming. (c) by Donald Knuth

    Naomi Nosonovsky, Sr. Programmer-Analyst

    My blog
    Thursday, December 31, 2009 5:57 PM
    Moderator