none
matching up values to data in ranges

    Question

  • Hi,

    I am working on a problem that involves data from auction sales of live cattle.  The cattle market reports I'm working with give ranges of weights for a given sale date but these ranges are not always the same for every sale date (see sample data in code below).  The goal is to match up cattle of standard weights (100 lb cattle, 200 lb cattle, ...500 lb cattle, ...1000 lb cattle) with the weight range for any given sale date.  If there were no cattle sold on a particular day for any of the standard weights then those standard weights just don't show up.  Only standard weights that fall within a weight range are to be returned in the results.  I have been able to get this to work using a cross join but in reality the @SaleWeightRanges table will hold a lot of data and it would be best to do this as efficiently as possible.  Does anyone have any suggestions about how to improve the efficiency?

    declare @StandardWeights table (Weight int);
     
    insert into @StandardWeights (Weight) values 
    (100), (200), (300), (400), (500), (600), (700), (800), (900), (1000);
    
    declare @SaleWeightRanges table (SaleDate date, LowWeight int, HighWeight int); 
    insert into @SaleWeightRanges (SaleDate, LowWeight, HighWeight) values
    ('5/25/2012', 225, 325), ('5/25/2012', 350, 450), ('5/25/2012', 475, 575), ('5/25/2012', 625, 700),
    ('6/8/2012', 200, 325), ('6/8/2012', 350, 475), ('6/8/2012', 525, 650), ('6/8/2012', 675, 775);
    
    --select * from @StandardWeights; 
    --select * from @SaleWeightRanges;
    
    select *--, case when Weight between LowWeight and HighWeight then 'x' end 
    from @SaleWeightRanges 
    cross join @StandardWeights 
    where Weight between LowWeight and HighWeight
    order by SaleDate, LowWeight;

    Wednesday, June 13, 2012 12:08 AM

Answers

  • Thank-you Naomi!  Your solution worked with a minor change - I added R.SaleDate to the PARTITION BY (see below) and that yielded the expected 10 row result set as shown in the print screen from my 6/20/12 1:32 AM post.

    DECLARE @StandardWeights TABLE (Weight INT);
    
    INSERT INTO @StandardWeights (Weight)
    VALUES (100)
    	,(200)
    	,(300)
    	,(400)
    	,(500)
    	,(600)
    	,(700)
    	,(800);
    
    DECLARE @SaleWeightRanges TABLE (
    	SaleDate DATE
    	,LowWeight INT
    	,HighWeight INT
    	);
    
    INSERT INTO @SaleWeightRanges (
    	SaleDate
    	,LowWeight
    	,HighWeight
    	)
    VALUES (
    	'5/25/2012'
    	,225
    	,325
    	)
    	,(
    	'5/25/2012'
    	,350
    	,450
    	)
    	,(
    	'5/25/2012'
    	,475
    	,575
    	)
    	,(
    	'5/25/2012'
    	,625
    	,700
    	)
    	,(
    	'5/25/2012'
    	,700
    	,800
    	)
    	,--added 1 range with a LowWeight value that's the same as a HighWeight value for another range
    	(
    	'6/8/2012'
    	,200
    	,325
    	)
    	,(
    	'6/8/2012'
    	,350
    	,475
    	)
    	,(
    	'6/8/2012'
    	,525
    	,650
    	)
    	,(
    	'6/8/2012'
    	,675
    	,775
    	);
    
    --select * from @StandardWeights; 
    --select * from @SaleWeightRanges;
    
    WITH cte
    AS (
    	SELECT *
    		,--, case when Weight between LowWeight and HighWeight then 'x' end 
    		ROW_NUMBER() OVER (
    			PARTITION BY R.SaleDate, W.Weight ORDER BY R.LowWeight    --added R.SaleDate to PARTITION BY
    				,R.HighWeight
    			) AS Rn
    	FROM @SaleWeightRanges R
    	INNER JOIN @StandardWeights W ON W.Weight BETWEEN R.LowWeight
    			AND R.HighWeight
    	)
    SELECT *
    FROM cte
    WHERE Rn = 1
    ORDER BY SaleDate
    	,LowWeight;
    GO  


    • Marked as answer by Knot Wednesday, June 20, 2012 11:12 PM
    • Edited by Knot Thursday, June 21, 2012 2:30 AM
    Wednesday, June 20, 2012 11:12 PM

All replies

  • Hello Knot,

    If I got you right, then this should work: Changing the cross to a left join and bringing the condition into the join

    DEClare @StandardWeights table (Weight int);
     
    insert into @StandardWeights (Weight) values 
    (100), (200), (300), (400), (500), (600), (700), (800), (900), (1000);
    
    declare @SaleWeightRanges table (SaleDate date, LowWeight int, HighWeight int); 
    insert into @SaleWeightRanges (SaleDate, LowWeight, HighWeight) values
    ('5/25/2012', 225, 325), ('5/25/2012', 350, 450), ('5/25/2012', 475, 575), ('5/25/2012', 625, 700),
    ('6/8/2012', 200, 325), ('6/8/2012', 350, 475), ('6/8/2012', 525, 650), ('6/8/2012', 675, 775);
    
    --select * from @StandardWeights; 
    --select * from @SaleWeightRanges;
    
    select *--, case when Weight between LowWeight and HighWeight then 'x' end 
    from @StandardWeights  
         LEFT JOIN @SaleWeightRanges
             ON Weight between LowWeight and HighWeight
    order by SaleDate, LowWeight;


    Olaf Helper
    * cogito ergo sum * errare humanum est * quote erat demonstrandum *
    Wenn ich denke, ist das ein Fehler und das beweise ich täglich
    Blog Xing

    Wednesday, June 13, 2012 3:13 AM
  • Yes Olaf, that works!  I just made one small change to it and that was to add a WHERE statement as follows: WHERE LowWeight is not null.  That prevented rows for Weight that had no corresponding LowWeight+HighWeight matches from appearing.

    A couple of follow up questions with respect to the LEFT JOIN approach:

    1. How much more efficient is using this approach compared to the CROSS JOIN method?

    2. I didn't know it was possible to use anything other than an equals sign in the ON clause of a join.  Are there any other things that can be used in ON besides the equals sign and BETWEEN?  If so, what are they?

    Wednesday, June 13, 2012 11:47 PM
  • Simply use an INNER JOIN instead of CROSS JOIN or LEFT JOIN. Yes, the normal JOINs can be used with the range.

    For every expert, there is an equal and opposite expert. - Becker's Law


    My blog

    • Proposed as answer by Hasham NiazEditor Thursday, June 14, 2012 6:03 AM
    • Marked as answer by Knot Friday, June 15, 2012 12:45 AM
    • Unmarked as answer by Knot Friday, June 15, 2012 2:31 PM
    Thursday, June 14, 2012 3:41 AM
    Moderator
  • The between method works great when the sale weight ranges don't overlap but causes (legitimate) duplication when there is a value that's a HighWeight in one range and a LowWeight in the next range.  How can this code be modified so that a Weight only falls into the lowest sale weight range?  See sample code below:

    declare @StandardWeights table (Weight int);
     
    insert into @StandardWeights (Weight) values 
    (100), (200), (300), (400), (500), (600), (700), (800);
    declare @SaleWeightRanges table (SaleDate date, LowWeight int, HighWeight int); 
    insert into @SaleWeightRanges (SaleDate, LowWeight, HighWeight) values
    ('5/25/2012', 225, 325), ('5/25/2012', 350, 450), ('5/25/2012', 475, 575), ('5/25/2012', 625, 700), ('5/25/2012', 700, 800),  --added 1 range with a LowWeight value that's the same as a HighWeight value for another range
    ('6/8/2012', 200, 325), ('6/8/2012', 350, 475), ('6/8/2012', 525, 650), ('6/8/2012', 675, 775);
    --select * from @StandardWeights; 
    --select * from @SaleWeightRanges;
    select *--, case when Weight between LowWeight and HighWeight then 'x' end 
    from @SaleWeightRanges 
    inner join @StandardWeights 
    on Weight between LowWeight and HighWeight    --option A
    --on Weight >= LowWeight and Weight < HighWeight  --option B
    order by SaleDate, LowWeight;
    /*
    problems:
    option A: Weight = 700 is duplicated because it falls into two different sale weight ranges, want it to fall into lowest range only
    option B: Weight = 800 row doesn't show up
    */

    Here are the results:

    Row 5 needs to be removed in order to get the correct results.



    • Edited by Knot Wednesday, June 20, 2012 1:36 AM
    Friday, June 15, 2012 2:34 PM
  • Try using CROSS APPLY approach, e.g.

    SELECT W.*
    	,R.* --, case when Weight between LowWeight and HighWeight then 'x' end 
    FROM @StandardWeights W
    CROSS APPLY (
    	SELECT TOP (1) *
    	FROM @SaleWeightRanges R
    	WHERE W.Weight BETWEEN R.LowWeight
    			AND R.HighWeight --option A
    	ORDER BY R.LowWeight
    	) R
    --on Weight >= LowWeight and Weight < HighWeight  --option B
    ORDER BY R.SaleDate
    	,R.LowWeight;


    For every expert, there is an equal and opposite expert. - Becker's Law


    My blog

    Sunday, June 17, 2012 3:26 AM
    Moderator
  • Hi Naomi,

    This is the result I get using the cross apply approach above:

    ...it peeled out 3 rows that should still show up in the final result.  Here is the desired result (obtained by hard coding out the extra row just to get a screenshot):

    There are no SaleDate+Weight combinations that're repeated in the desired result. I am stuck on this one.




    • Edited by Knot Wednesday, June 20, 2012 1:38 AM
    Wednesday, June 20, 2012 1:32 AM
  • Try

    DECLARE @StandardWeights TABLE (Weight INT);
    
    INSERT INTO @StandardWeights (Weight)
    VALUES (100)
    	,(200)
    	,(300)
    	,(400)
    	,(500)
    	,(600)
    	,(700)
    	,(800);
    
    DECLARE @SaleWeightRanges TABLE (
    	SaleDate DATE
    	,LowWeight INT
    	,HighWeight INT
    	);
    
    INSERT INTO @SaleWeightRanges (
    	SaleDate
    	,LowWeight
    	,HighWeight
    	)
    VALUES (
    	'5/25/2012'
    	,225
    	,325
    	)
    	,(
    	'5/25/2012'
    	,350
    	,450
    	)
    	,(
    	'5/25/2012'
    	,475
    	,575
    	)
    	,(
    	'5/25/2012'
    	,625
    	,700
    	)
    	,(
    	'5/25/2012'
    	,700
    	,800
    	)
    	,--added 1 range with a LowWeight value that's the same as a HighWeight value for another range
    	(
    	'6/8/2012'
    	,200
    	,325
    	)
    	,(
    	'6/8/2012'
    	,350
    	,475
    	)
    	,(
    	'6/8/2012'
    	,525
    	,650
    	)
    	,(
    	'6/8/2012'
    	,675
    	,775
    	);
    
    --select * from @StandardWeights; 
    --select * from @SaleWeightRanges;
    WITH cte
    AS (
    	SELECT *
    		,--, case when Weight between LowWeight and HighWeight then 'x' end 
    		ROW_NUMBER() OVER (
    			PARTITION BY W.Weight ORDER BY R.LowWeight
    				,R.HighWeight
    			) AS Rn
    	FROM @SaleWeightRanges R
    	INNER JOIN @StandardWeights W ON W.Weight BETWEEN R.LowWeight
    			AND R.HighWeight
    	)
    SELECT *
    FROM cte
    WHERE Rn = 1
    ORDER BY SaleDate
    	,LowWeight;
    


    For every expert, there is an equal and opposite expert. - Becker's Law


    My blog

    Wednesday, June 20, 2012 3:14 AM
    Moderator
  • Thank-you Naomi!  Your solution worked with a minor change - I added R.SaleDate to the PARTITION BY (see below) and that yielded the expected 10 row result set as shown in the print screen from my 6/20/12 1:32 AM post.

    DECLARE @StandardWeights TABLE (Weight INT);
    
    INSERT INTO @StandardWeights (Weight)
    VALUES (100)
    	,(200)
    	,(300)
    	,(400)
    	,(500)
    	,(600)
    	,(700)
    	,(800);
    
    DECLARE @SaleWeightRanges TABLE (
    	SaleDate DATE
    	,LowWeight INT
    	,HighWeight INT
    	);
    
    INSERT INTO @SaleWeightRanges (
    	SaleDate
    	,LowWeight
    	,HighWeight
    	)
    VALUES (
    	'5/25/2012'
    	,225
    	,325
    	)
    	,(
    	'5/25/2012'
    	,350
    	,450
    	)
    	,(
    	'5/25/2012'
    	,475
    	,575
    	)
    	,(
    	'5/25/2012'
    	,625
    	,700
    	)
    	,(
    	'5/25/2012'
    	,700
    	,800
    	)
    	,--added 1 range with a LowWeight value that's the same as a HighWeight value for another range
    	(
    	'6/8/2012'
    	,200
    	,325
    	)
    	,(
    	'6/8/2012'
    	,350
    	,475
    	)
    	,(
    	'6/8/2012'
    	,525
    	,650
    	)
    	,(
    	'6/8/2012'
    	,675
    	,775
    	);
    
    --select * from @StandardWeights; 
    --select * from @SaleWeightRanges;
    
    WITH cte
    AS (
    	SELECT *
    		,--, case when Weight between LowWeight and HighWeight then 'x' end 
    		ROW_NUMBER() OVER (
    			PARTITION BY R.SaleDate, W.Weight ORDER BY R.LowWeight    --added R.SaleDate to PARTITION BY
    				,R.HighWeight
    			) AS Rn
    	FROM @SaleWeightRanges R
    	INNER JOIN @StandardWeights W ON W.Weight BETWEEN R.LowWeight
    			AND R.HighWeight
    	)
    SELECT *
    FROM cte
    WHERE Rn = 1
    ORDER BY SaleDate
    	,LowWeight;
    GO  


    • Marked as answer by Knot Wednesday, June 20, 2012 11:12 PM
    • Edited by Knot Thursday, June 21, 2012 2:30 AM
    Wednesday, June 20, 2012 11:12 PM
  • Is Naomi's reply also an answer?

    Thanks!


    Ed Price (a.k.a User Ed), SQL Server Experience Program Manager (Blog, Twitter, Wiki)

    Thursday, June 21, 2012 1:01 AM
    Owner
  • Almost.  It did not yield the expected result set which is why I posted the final solution and marked Naomi's answer as Helpful.  I hope that is not considered bad forum ettiquette.  I have always assumed that marking the post that yields the exact result should be marked as the answer so that future readers can quickly identify what worked for the problem. 

    Without Naomi's extremely valuable post, this would still be broken.  If the system would allow a user to mark the Helpful button more than once per post, I would certainly mark that one many times!


    • Edited by Knot Thursday, June 21, 2012 2:33 AM
    Thursday, June 21, 2012 2:24 AM