none
Recursive Update - Row by Row RRS feed

  • Question

  • Good evening [ed. This is all fake data, but have fun],

    I have a "Fact" table in a star schema with all possible combinations of Lottery wins. We choose 7 numbers from 1 to 49 for each draw, any day, any second of the day, and assign it to a spot with a Gold Mining Metric.

    There are 85,900,543 rows. I am experimenting with Analysis Services. My cube is great, but it is aggregating because the data  generation is creating 60 to 80 duplicate dimension keys per second per metric per day.

    Yes, I have duplicate dimension values, and the Cube aggregates the Lottery Numbers from the Granularity of the source data up to the granularity of the duplicates. I use a Hub and Spoke model (Star Schema). The Hub is the fact table, named Hub.Permutations. Each Row has distinct Lottery Numbers. The lowest pick is 1, 2, 3, 4, 5, 6, 7. The highest pick is 43, 44, 45, 46, 47, 48, 49. The Permute column is an Identity. The bottom 3 columns are Foreign Surrogate Key columns (FSK = PSK). Each Primary Surrogate Key column in the dimensions points to many FSK columns in Hub.Permutations. It is a classic Star Schema.

    Here is the Hub (Fact Table with 85,900,543 Rows, one row for each Lottery Pick):

    Next, I have a TimeClock Dimension (with "seconds" granularity and 86,400 seconds) ...


    Then, I have the Spoke.Metrics Dimension (319 rows, 1 per Metric).

    Next, I have a Calendar Dimension with 10 years at daily grain.

    The idea is that each number could be drawn any second of any day, and it has an assigned metric.

    Here is the SQL. I am trying to update the table based on some logic, but the ROW_NUMBER Function is "sticky".  The testing is showing that only the first Row Number value (1) affects the partition. Each partition gets incremented by 1 on the Time_FSK column, but I want each row to respect each ROW_NUMBER value, rather than just the first one.

    WITH FixTime AS (
    SELECT
    [Permute],
    [Calendar_FSK], 
    [Metric_FSK],
    [Time_FSK],
    'Row Number' = ROW_NUMBER() OVER(PARTITION BY [Calendar_FSK],[Metric_FSK], [Time_FSK] ORDER BY [Calendar_FSK])
    FROM [Hub].[Permutations]
    WHERE
    [Time_FSK] BETWEEN 5010 AND 5200
    AND [Calendar_FSK] BETWEEN -2147483647 AND -2147483602

    UPDATE [Hub].[Permutations] 
    SET Time_FSK = P.Time_FSK + F.[Row Number]
    FROM [Hub].[Permutations] AS P 
    JOIN
    FixTime AS F
    ON 
    P.[Calendar_FSK] = F.[Calendar_FSK] 
    AND 
    P.[Metric_FSK] = F.[Metric_FSK]
    AND
    P.[Time_FSK] = F.[Time_FSK]
    WHERE
    P.[Time_FSK] BETWEEN 5010 AND 5200
    AND P.[Calendar_FSK] BETWEEN -2147483647 AND -2147483602

    Here is the data. I want each time to be updated by each Row Number.


    Permute      Calendar_FSK   Metric_FSK Time_FSK Row Number
    -2090619066 -2147483647 10 5012 1
    -2109224228 -2147483647 10 5012 2
    -2106690991 -2147483647 10 5012 3
    -2110890423 -2147483647 10 5012 4
    -2068975223 -2147483647 10 5012 5
    -2112917492 -2147483647 10 5012 6
    -2110324616 -2147483647 10 5012 7
    -2092706095 -2147483647 10 5012 8
    -2087369840 -2147483647 10 5012 9
    -2122372847 -2147483647 10 5012 10
    -2069207744 -2147483647 10 5012 11
    -2081277919 -2147483647 10 5012 12
    -2076468099 -2147483647 10 5012 13
    -2110348468 -2147483647 10 5012 14
    -2070314042 -2147483647 10 5012 15
    -2127986483 -2147483647 10 5012 16
    -2132392713 -2147483647 10 5012 17
    -2091337229 -2147483647 10 5012 18
    -2141251443 -2147483647 10 5012 19
    -2101432165 -2147483647 10 5012 20
    -2112749084 -2147483647 10 5012 21
    -2088757620 -2147483647 10 5012 22
    -2139358184 -2147483647 10 5012 23
    -2118370295 -2147483647 10 5012 24
    -2143730532 -2147483647 10 5012 25
    -2104680913 -2147483647 10 5012 26
    -2089132330 -2147483647 10 5012 27
    -2082777692 -2147483647 10 5012 28
    -2139256053 -2147483647 10 5012 29
    -2117584730 -2147483647 10 5012 30
    -2119958905 -2147483647 10 5012 31
    -2134400071 -2147483647 10 5012 32
    -2107368220 -2147483647 10 5012 33
    -2120400787 -2147483647 10 5012 34
    -2115274576 -2147483647 10 5012 35
    -2064250120 -2147483647 10 5012 36
    -2141838668 -2147483647 10 5012 37
    -2095578889 -2147483647 10 5012 38
    -2105025889 -2147483647 10 5012 39
    -2080119332 -2147483647 10 5012 40
    -2118165718 -2147483647 10 5012 41
    -2131808198 -2147483647 10 5012 42
    -2140006191 -2147483647 10 5012 43
    -2091863071 -2147483647 10 5012 44
    -2094947858 -2147483647 10 5012 45
    -2080571596 -2147483647 10 5012 46
    -2078309387 -2147483647 10 5012 47
    -2136725133 -2147483647 10 5012 48
    -2102437789 -2147483647 10 5012 49
    -2101888096 -2147483647 10 5012 50
    -2077482061 -2147483647 10 5012 51
    -2112027211 -2147483647 10 5012 52
    -2106718270 -2147483647 11 5013 1
    -2079344346 -2147483647 11 5013 2
    -2071243675 -2147483647 11 5013 3
    -2145066527 -2147483647 11 5013 4
    -2130974043 -2147483647 11 5013 5
    -2100557615 -2147483647 11 5013 6
    -2105931802 -2147483647 11 5013 7
    -2128016641 -2147483647 11 5013 8
    -2140552570 -2147483647 11 5013 9
    -2145432615 -2147483647 11 5013 10
    -2117546460 -2147483647 11 5013 11
    -2140662668 -2147483647 11 5013 12
    -2125325867 -2147483647 11 5013 13
    -2065937021 -2147483647 11 5013 14


    • Edited by CubeSparkle Thursday, November 30, 2017 3:19 AM
    • Moved by ArthurZ Thursday, November 30, 2017 3:52 AM Not SSIS related
    Thursday, November 30, 2017 3:19 AM

All replies

  • Here is the SQL. I am trying to update the table based on some logic, but the ROW_NUMBER Function is "sticky".  The testing is showing that only the first Row Number value (1) affects the partition. Each partition gets incremented by 1 on the Time_FSK column, but I want each row to respect each ROW_NUMBER value, rather than just the first one.

    Hi CubeSparkle,

    Per your detailed description, I guess you possibly want this.

     drop table if exists [Permutations]
    
     create table [Permutations]
     (
      Permute int,
      Calendar_FSK int,
      Metric_FSK int,
      Time_FSK int
     )
    GO
    
     insert into [Permutations] values
     (-2090619066,-2147483647,10,5012),     
     (-2109224228,-2147483647,10,5012),    
     (-2106690991,-2147483647,10,5012),
     (-2110890423,-2147483647,10,5012),      
     (-2068975223,-2147483647,10,5012),     
     (-2112917492,-2147483647,10,5012),  
     (-2110324616,-2147483647,10,5012),
     (-2106718270,-2147483647,11,5013), 
     (-2079344346,-2147483647,11,5013), 
     (-2071243675,-2147483647,11,5013), 
     (-2145066527,-2147483647,11,5013), 
     (-2130974043,-2147483647,11,5013),
     (-2100557615,-2147483647,11,5013),
     (-2105931802,-2147483647,11,5013)
    
     ;WITH FixTime AS (
     SELECT
    [Permute],
    [Calendar_FSK], 
    [Metric_FSK],
    [Time_FSK],
    'Row Number' = ROW_NUMBER() OVER(PARTITION BY [Calendar_FSK],[Metric_FSK], [Time_FSK] ORDER BY [Calendar_FSK])
    from [Permutations]
    --FROM [Hub].[Permutations]
    --WHERE[Time_FSK] BETWEEN 5010 AND 5200
    --AND [Calendar_FSK] BETWEEN -2147483647 AND -2147483602
     ) 
    --UPDATE [Hub].[Permutations] 
    UPDATE FixTime 
    SET Time_FSK = Time_FSK + [Row Number]
    --FROM [Permutations] AS P 
    --JOIN FixTime AS F ON P.[Calendar_FSK] = F.[Calendar_FSK] 
    --                 AND P.[Metric_FSK] = F.[Metric_FSK]
    --                 AND P.[Time_FSK] = F.[Time_FSK]
    --WHERE P.[Time_FSK] BETWEEN 5010 AND 5200
    -- AND P.[Calendar_FSK] BETWEEN -2147483647 AND -2147483602
    
    
    SELECT *  FROM Permutations

    Best Regards,

    Will


    MSDN Community Support
    Please remember to click "Mark as Answer" the responses that resolved your issue, and to click "Unmark as Answer" if not. This can be beneficial to other community members reading this thread. If you have any compliments or complaints to MSDN Support, feel free to contact MSDNFSF@microsoft.com.

    Thursday, November 30, 2017 6:34 AM
    Moderator