Sunday, September 16, 2012 12:57 PMhow is the spider schema useful??? we use another table as intermediate table and this includes space more than star schema then how is that useful?????????? And i'm also in need of applications of spider schema??
- Edited by sindu26 Sunday, September 16, 2012 12:58 PM
Sunday, September 16, 2012 8:50 PMModerator
why you don't use Star Schema? Start schema is a reliable data warehouse schema working for many years and you can get very good performance from this schema. even if you want to do few snow flaking it will be still good when you combine it with SSAS cubes.
if this doesn't help you tell us what are your dimensions, and facts and their relationship and I will try to help you in the design
Monday, September 24, 2012 2:23 AM
Actually I didn't see any advantage of this spider schema. Because you will get the same problem with the intermediate table, like schema modification overhead when add in new dimensions, fact data reload etc. And the spider schema will involve in a new problem which is join performance overhead. You need one more time join with this schema. For fact tables with large volume of records, this will cause considerable performance issue.
I believe for data mart, star schema will be the best. But for data warehouse, maybe you can investigate snowflake schema.
Tuesday, September 25, 2012 8:11 PM
Removing keys from the Fact Table and moving those to another table (Intermediate Dimension) does not create more space. If you read the information at the site: Spider Schema you would see that the Intermediate Dimension contains a Distinct Set of Foreign Keys. Since Dimensional Foreign Keys are repeated over and over again in the Fact Table, you end up using Less Space.
Since the Spider Schema works for any BI Tool like Microsoft SQL Server Analysis Services, and any reporting tool including all of Microsoft's Suite of reporting tools, how does that create a "need of applications of spider schema"?
Tuesday, September 25, 2012 8:19 PMChen Liang ~ Is there a schema that exists where when you add in a new dimension or fact table or any table for that matter that it does not create a "Schema Modification"? You comment that: "the Spider Schema will involve a new problem: Join Performance Overhead". However when a fact table has millions/billions of records, and each join to it by a dimension creates a full table scan on the Fact Table is faster than first Joining in the Dimensions (using filters) on the Intermediate Dimension (which has fewer records than the fact table and is all Keys) first to reduce the number of records that is joined in on the Fact Table create overhead? I have done testing and there is no performance hit on most queries fetching records from a large Fact Table, if anything its faster using the Intermediate Dimension.