none
SSAS Large Data Sets

    Question

  • I am trying to build a cube with Large DS in the fact table.  The Sales table for 24 month has 200 Million Records.

    Here was my approach -

    I built a cube successfully with a query in the dataset considing of only the last 10 days and 1 partition.  The DS is an Oracle source and in the where clause, I am using "Where InvoiceDate <= sysdate-10"  (If anyone has a better way, please let me know - I have tried, "12-Jun-01", "2012-06-01" and these have not been succesful)

    So, after the cube build was successful, I went back and built a 24 month history by removing the condition in the where clause of the ds query.  Then I went to create partitions of my fact table (approx 8 of them) using queries of the fact table - then in the query syntax of the partition construct I would plave in the where clause, "Where InvoiceDate between (sysdate-730) and (sysdate-611)"  - I repeated this process many times to fill out approx quarterly partitions.

    When I go to process the dimension again,  and it Fails. 

    Errors and Warnings from Response
     Internal error: The operation terminated unsuccessfully.
     File system error: A FileStore error from WriteFile occurred. Physical file: \\?\R:\DAT1\OLAP\713_Test.0.db\P SALE ITEM.16.dim\30.SLS ID.asstore. Logical file: . .
     Errors in the OLAP storage engine: An error occurred while the 'ID' attribute of the 'SALEITEM' dimension from the '713_Test' database was being processed.
     Errors in the OLAP storage engine: The process operation ended because the number of errors encountered during processing reached the defined limit of allowable errors for the operation.
     Server: The operation has been cancelled.


    KDW


    • Edited by siera_gld Saturday, July 21, 2012 6:17 PM
    Saturday, July 21, 2012 3:28 PM

Answers

  • Since you mention the .asstore file, it may be the 4GB string store limit. How many distinct values are there in P_SALE_ITEM.SLS_ID?

    http://social.msdn.microsoft.com/forums/en-US/sqlanalysisservices/thread/973f4d41-b9d9-4c2b-96d2-876ec1679f97/

    If in fact it's the 4GB string store limit, do you have the option of upgrading to AS2012? They have removed the 4GB string store limit in that version:
    http://technet.microsoft.com/en-us/library/gg471589(SQL.110).aspx

    If you do not have the option of upgrading, do users actually need to view the value of SLS_ID, or is it just an internal surrogate key used to relate your fact table to your dimensions? If users don't need to view it, then do the following:

    1. add a new named calculation to your P_SALE_ITEM table in the DSV. Name the named calculation EmptyString, and use the expression '' (two single quotes). That will create a new column on your DSV table returning the same empty string value for every row.
    2. On the properties for the SLS_ID attribute, mark that attribute as AttributeHierarchyVisible=false
    3. On the properties for the SLS_ID attribute, make sure the KeyColumns is propertly set to SLS_ID. Change the NameColumn to use the EmptyString column.

    The above workaround works since the key of an attribute will retail the integer datatype of the column, but the NameColumn of an attribute is always stored as a string. In this case, I suspect that in converting the high cardinality integer to a string, you're overflowing the 4GB string store limit.


    http://artisconsulting.com/Blogs/GregGalloway

    Monday, July 23, 2012 6:23 PM
    Moderator

All replies

  • Your SSAS database has somehow become corrupt - you haven't done anything wrong, this is the result of a bug in SSAS. Try deleting the database in SQL Server Management Studio then redeploying and reprocessing; you should also make sure you have installed the most recent service pack for whatever version of SSAS you have installed to reduce the chances of it happening again.

    HTH,

    Chris


    Blog: http://cwebbbi.wordpress.com/ Consultancy: http://www.crossjoin.co.uk/

    Monday, July 23, 2012 8:01 AM
    Moderator
  • I've re-deployed cubes 2-3 times after deleting


    KDW

    Monday, July 23, 2012 3:16 PM
  • Since you mention the .asstore file, it may be the 4GB string store limit. How many distinct values are there in P_SALE_ITEM.SLS_ID?

    http://social.msdn.microsoft.com/forums/en-US/sqlanalysisservices/thread/973f4d41-b9d9-4c2b-96d2-876ec1679f97/

    If in fact it's the 4GB string store limit, do you have the option of upgrading to AS2012? They have removed the 4GB string store limit in that version:
    http://technet.microsoft.com/en-us/library/gg471589(SQL.110).aspx

    If you do not have the option of upgrading, do users actually need to view the value of SLS_ID, or is it just an internal surrogate key used to relate your fact table to your dimensions? If users don't need to view it, then do the following:

    1. add a new named calculation to your P_SALE_ITEM table in the DSV. Name the named calculation EmptyString, and use the expression '' (two single quotes). That will create a new column on your DSV table returning the same empty string value for every row.
    2. On the properties for the SLS_ID attribute, mark that attribute as AttributeHierarchyVisible=false
    3. On the properties for the SLS_ID attribute, make sure the KeyColumns is propertly set to SLS_ID. Change the NameColumn to use the EmptyString column.

    The above workaround works since the key of an attribute will retail the integer datatype of the column, but the NameColumn of an attribute is always stored as a string. In this case, I suspect that in converting the high cardinality integer to a string, you're overflowing the 4GB string store limit.


    http://artisconsulting.com/Blogs/GregGalloway

    Monday, July 23, 2012 6:23 PM
    Moderator
  • Thank you very much -

    The SLS ID is a compound PK and in thinking about it - you may be correct because the Date is part of the pk which may be a large string by the time you get done with 200 mil records.

    I'll try and let you know.


    KDW

    Tuesday, July 24, 2012 4:00 PM
  • Still Erred -

    Wonderful message below -

    Errors and Warnings from Response
     Internal error: The operation terminated unsuccessfully.
     Server: The operation has been cancelled.
     Errors in the OLAP storage engine: An error occurred while the 'SLS ID' attribute of the 'P SALE ITEM' dimension from the 'freprs' database was being processed.
     Errors in the OLAP storage engine: The process operation ended because the number of errors encountered during processing reached the defined limit of allowable errors for the operation.

    I guess I will modify the build to log the errors and just to continue building then read the log file


    KDW

    Wednesday, July 25, 2012 12:38 PM
  • Hmm. If you just do a ProcessFull on that one dimension, does it succeed? Without seeing the full processing log, I can't be sure. But I think there's at least a chance that a different dimension failed and that failure on another thread causes the thread processing the P SALE ITEM dimension to be cancelled. Are there any other warnings or errors in the processing log?

    The other question I would ask is why you are combining date and SLS_ID. Are you putting date (month, year) and sls_id attributes in one dimension? How many SLS_ID values are there and what attributes do you need in that dimension which aren't directly related to that ID?


    http://artisconsulting.com/Blogs/GregGalloway

    Wednesday, July 25, 2012 2:31 PM
    Moderator
  • Hi Furmangg

    I tried to implement the "EmptyString" solution. The good part is cube processed successfully but when I try to browse the cube I get below error.

    ERROR: Class not registered (Exception from HRESULT: 0x80040154 (REGDB_E_CLASSNOTREG))

    Could you please let me know your thoughts on this.

    Thanks..............

    Saturday, July 28, 2012 2:00 AM
  • Not completely in line with the current topic direction but you can use BETWEEN TO_DATE("2012/01/01", "yyyy/MM/dd") AND TO_DATE("2012/03/31", "yyyy/MM/dd") statement to specify a quarterly partition. Much better than the SYSDATE approach as you'll have better control over the exact date ranges the partitions cover and the SLICE property probably should be used to specify which regions a partition addresses.

    What's your dimensional structure look like? Sorry for interrupting.

    Martin

     

    http://martinmason.wordpress.com

    Saturday, July 28, 2012 12:31 PM
  • Great news that your dimension successfully processes now. Glad the workaround worked out.

    I think that cube browser error you're receiving is a problem with the OWC install. This thread suggests trying a repair on OWC. Regardless, I don't think it's related to your dimension issues. (If you want to verify that, deploy another cube like Adventure Works and make sure you get the same error in the cube browser... or try the cube browser on another computer.)
    http://social.msdn.microsoft.com/forums/en-US/sqlanalysisservices/thread/07a10fae-3133-4081-b747-338cd767a557/


    http://artisconsulting.com/Blogs/GregGalloway

    Saturday, July 28, 2012 5:31 PM
    Moderator
  • Hi Furmangg

    Thanks for posting that link. I have it repaired a now I see the same values for different months and years whereas previously without the EmptyString fix it used to show appropriate numbers for different months. Please suggest.

    Thanks............

    Monday, July 30, 2012 3:07 PM
  • Glad you OWC is fixed.

    I'm not sure what the issue is now. You're saying that the same measure value is repeating for every month? If you look at the Dimension Usage tab, is that Date dimension properly connected to your measure group?

    I assume that you only changed the NameColumn to use the EmptyString column, and that you left the KeyColumns pointing at your SLS_ID?


    http://artisconsulting.com/Blogs/GregGalloway

    Monday, July 30, 2012 4:12 PM
    Moderator
  • no - not resolved - and the answer should NEVER BE - just upgrade to 2012....

    this is a large company - we just got to 2008 - 2012 will not happen for 5 years...


    KDW

    Thursday, August 9, 2012 9:51 PM
  • To go back to what Greg asked a while back, I think the real question is why you need to build this dimension. Even if you didn't get any errors while processing, you might well find that processing was slow. What do you need the dimension for? There may be a way of getting the same functionality in a different way.

    Chris


    Check out my MS BI blog I also do SSAS, PowerPivot, MDX and DAX consultancy and run public SQL Server and BI training courses in the UK

    Friday, August 10, 2012 10:40 AM
    Moderator