none
Tutorial to load a fact table.

    Question

  • Hello

    This is my first business intelligence project, could some one help me and show me how to load a fact table. (If there is a tutorial i will be grateful)

    I already loaded all my dimension tables.
    Sunday, July 19, 2009 7:27 PM

Answers

  • Hi,
    I dont know how you are doing everything but it it goes,
    I want to tell you that when you do Load a fact and dimension, First you use
    SSIS package to load your  all the dimensions required for that Facttable, because this will generate the PK for each dimension table and which will be refered into your fact table as a surrogate key.
    which will be combination of all these refered dimensions keys(PK -FK relationship between fact and dimension)
    Now once your dimensions are done (fully loaded) now you wannna go for the Fact atble load.

    There are few things i would like to ask and answer as well, how you want to deal with your fact table of new and upadted rows(are you going to do truncate and treload to your fact table or incremental data which are new only those,)
    This logic is important when you design your sdimension also(Remember Slowly changing dimensions type i,ii iii)

    Okay now come to fact load :(tWO WAYS TO LOAD THAT USING sql join and using look up since look up is very fast and optimized thats why i am explaining that method to load your fact table here )
    1. YOU should have a source of your fact table choose oledb source and specify the query instead of all columns 
    you should choose which are required for your load.
    2.Now once you have your source all the columns and all keys are avilable in the source OLEDB,
       You can rename the output columns (which are common in the dimension to avaoid the ambiguity)
    3.Now all your dimension should be looked up using the look up transformation each look for eaCH dimension key
     lets say you have dimensionkey in your dimension table and which is also in fact table
     now using that you have to look up all your there are three tab in the look up
    REFERENCE TAB, where you have to sepecify the query of your dimension table  which you are mtching on and getting corrosping Dimension key
    COLUMNS here you have to match the column based onw hich you want that Dimension key and select the key from the dimension (which you will do for each
                        dimension table and retrieve the value of each key and This is the method you will do for all the diimension
    But here is the key when you look up the table you have to IGNORE  instead of the error (configure error output)
    At the last  before loading all the fact you  will have all those columns generated from the look up and the source Fact table
    But you will have to select all the measures from the source but all the keys from the dimension look up 
    nOW BEFORE GOING TO LOAD THERE ,iT IS A GOOD PRACTICE TO CHECK that your data should have any null value(otherwise when you will process the cube you will get the most obvious erro attribute key could not find)
    to check that you can use a derive colyumn tarns to check taht like
    lets say you have dimension_key1 key columns are coming from dimension which doesnt have record corrospdimg in the Fact atble source then
    use EXPRESSION for derive columns before loading to fact
    isnull(dimension_key1) ? -1 :dimension_key1 and then load all the dimensions, This you can do to all dimension keys  and now direct the outputs to the fact destination.
    Hope this will help you

    dsu
    • Marked as answer by Tony Tang_YJ Friday, July 31, 2009 9:38 AM
    Tuesday, July 21, 2009 5:34 PM
  • Although I have not looked at the tutorial in question, you would generally use lookups to identify the surrogate key for the record in the dimension which ties to your fact table record.  You would do this for all relevant dimensions related to the fact table.

    If you are not familiar with the data warehouse dimensional modeling theories, you might want to take a look at "The Data Warehouse Toolkit (second edition)" by Ralph Kimball, or, "The Microsoft Data Warehouse Toolkit" by Kimball, Mundy and Thornethwaite.  As I mentioned above, the FAQ thread contains links to quite a bit of good information - such as the link to the webcast for populating a kimbal method data warehouse.

    Please mark answered posts. Thanks for your time.
    • Marked as answer by Tony Tang_YJ Friday, July 31, 2009 9:38 AM
    Tuesday, July 21, 2009 1:57 PM
    Moderator

All replies

  • hi,
    is there any specific requirement??

    please check the following link which may help you.
    http://social.msdn.microsoft.com/Forums/en-US/sqlintegrationservices/thread/cf536fd2-3656-47ae-aee4-c905b6b81efa

    do let me know your observation.
    thanks-


    Let us TRY this | Don’t forget to mark the post(s) that answered your question
    Sunday, July 19, 2009 7:50 PM
  • the only requirement is that data are stored in a single excel file,

    That means i have different dimension tables (each one contains some columns), and finally i'm going to load the fact table from the same excel file to put it into a table in my Datawarehouse
    Sunday, July 19, 2009 8:03 PM
  • the only requirement is that data are stored in a single excel file,

    That means i have different dimension tables (each one contains some columns), and finally i'm going to load the fact table from the same excel file to put it into a table in my Datawarehouse
    well,
    do you have any problems with excel as datasource or ?
    Monday, July 20, 2009 6:56 AM
  • Please go through this link
    http://blogs.techrepublic.com.com/datacenter/?p=205

    let us know if you face any problem wile using the excel source and conenction manager
    Nitesh Rai- Disappointment is directly proportional to expectations
    Monday, July 20, 2009 8:15 AM
  • You might want to peruse the FAQ thread at the top of the page.  It lists many references to useful information, such as the link to the webcast for populating a Kimball Method Data Warehouse.

    Please mark answered posts. Thanks for your time.
    Monday, July 20, 2009 1:02 PM
    Moderator
  • I tried to apply instructions described here but i failed, (This method worked for me when i tried to load dimension tables)

    http://blogs.techrepublic.com.com/datacenter/?p=205

    As i read, the fact table must contain SurogateKey of each dimension?? is that right? so how an i use this tutorial to load the fact table..

    thanks to simplify it for me as much as you can, because my head is going to explode because of this.
    Monday, July 20, 2009 2:46 PM
  • Although I have not looked at the tutorial in question, you would generally use lookups to identify the surrogate key for the record in the dimension which ties to your fact table record.  You would do this for all relevant dimensions related to the fact table.

    If you are not familiar with the data warehouse dimensional modeling theories, you might want to take a look at "The Data Warehouse Toolkit (second edition)" by Ralph Kimball, or, "The Microsoft Data Warehouse Toolkit" by Kimball, Mundy and Thornethwaite.  As I mentioned above, the FAQ thread contains links to quite a bit of good information - such as the link to the webcast for populating a kimbal method data warehouse.

    Please mark answered posts. Thanks for your time.
    • Marked as answer by Tony Tang_YJ Friday, July 31, 2009 9:38 AM
    Tuesday, July 21, 2009 1:57 PM
    Moderator
  • Hi,
    I dont know how you are doing everything but it it goes,
    I want to tell you that when you do Load a fact and dimension, First you use
    SSIS package to load your  all the dimensions required for that Facttable, because this will generate the PK for each dimension table and which will be refered into your fact table as a surrogate key.
    which will be combination of all these refered dimensions keys(PK -FK relationship between fact and dimension)
    Now once your dimensions are done (fully loaded) now you wannna go for the Fact atble load.

    There are few things i would like to ask and answer as well, how you want to deal with your fact table of new and upadted rows(are you going to do truncate and treload to your fact table or incremental data which are new only those,)
    This logic is important when you design your sdimension also(Remember Slowly changing dimensions type i,ii iii)

    Okay now come to fact load :(tWO WAYS TO LOAD THAT USING sql join and using look up since look up is very fast and optimized thats why i am explaining that method to load your fact table here )
    1. YOU should have a source of your fact table choose oledb source and specify the query instead of all columns 
    you should choose which are required for your load.
    2.Now once you have your source all the columns and all keys are avilable in the source OLEDB,
       You can rename the output columns (which are common in the dimension to avaoid the ambiguity)
    3.Now all your dimension should be looked up using the look up transformation each look for eaCH dimension key
     lets say you have dimensionkey in your dimension table and which is also in fact table
     now using that you have to look up all your there are three tab in the look up
    REFERENCE TAB, where you have to sepecify the query of your dimension table  which you are mtching on and getting corrosping Dimension key
    COLUMNS here you have to match the column based onw hich you want that Dimension key and select the key from the dimension (which you will do for each
                        dimension table and retrieve the value of each key and This is the method you will do for all the diimension
    But here is the key when you look up the table you have to IGNORE  instead of the error (configure error output)
    At the last  before loading all the fact you  will have all those columns generated from the look up and the source Fact table
    But you will have to select all the measures from the source but all the keys from the dimension look up 
    nOW BEFORE GOING TO LOAD THERE ,iT IS A GOOD PRACTICE TO CHECK that your data should have any null value(otherwise when you will process the cube you will get the most obvious erro attribute key could not find)
    to check that you can use a derive colyumn tarns to check taht like
    lets say you have dimension_key1 key columns are coming from dimension which doesnt have record corrospdimg in the Fact atble source then
    use EXPRESSION for derive columns before loading to fact
    isnull(dimension_key1) ? -1 :dimension_key1 and then load all the dimensions, This you can do to all dimension keys  and now direct the outputs to the fact destination.
    Hope this will help you

    dsu
    • Marked as answer by Tony Tang_YJ Friday, July 31, 2009 9:38 AM
    Tuesday, July 21, 2009 5:34 PM