I received a huge database with one table containing 450 million row. The database is SQL Server 2005 and we are implementing Table partition. We received the data from our customer as a STAR schema. After studying the requirement of the my customer, I found they only want a record(s) from the table and input these records into a econometric model.
When I spoke to the so called DBA, he said the data retrieval will be a big problem and he said we should go for relational data schema (!).
When I researched on the web, I got the following points
Fact tables in star schema are mostly in third normal form (3NF), but dimensional tables are in de-normalized second normal form (2NF).
As you know, the primary schema for a data warehouse is either a star schema or a "normalized" schema. The latter is a term so loosely defined that it's hard to describe, but a normalized schema typically resembles a third (or higher) normal form (3NF) schema that's not dimensional. These 3NF designs don't support query and analysis. Their sole purpose is to act as a staging area, an upstream data repository for a series of star schemas, online analytic processing (OLAP) cubes, and other structures that are directly queried by analysts. The only routine exception to this is Teradata implementations: Because of the unique characteristics of the massively parallel architecture and database optimizer, Teradata can process analytical SQL against a 3NF schema with acceptable performance.
Because the 3NF approach is only a first step, the argument that schemas of this kind are perfect for data warehousing is misleading. The real-time landscape of an RTDW includes the whole process, or at least the process up to and including the physical structures that people query. Because the 3NF-centric approach just adds overhead, it is not, at this time, a good candidate for RTDW.