Data Warehousing #4: Star Schemas

Data Warehousing #4: Star Schemas

This is the 4th in series of posts on datawarehousing.  To see the entire post list, click here.

Star Schemas

A star schema is the simplest type of data mart in dimensional modeling.  A star schema is one or more fact tables foreign key’d to any number of dimensions, and, when viewed through a visual schema planning tool, looks like a star:

—————————————————————————-
|_______dim_user ________dim_date______________|
|_____________\ __________ / ___________________|
|____dim_product —  fac_purchase — dim_order______|
|_____________/ ___________\___________________|
|_____dim_channel________dim_marketing_campaign_|
—————————————————————————-

Note that each dimension foreign key’d in a star schema should represent a conformed dimension, which has the same meaning to every fact with which it relates.  This allows facts and measures to be rolled up, selected, and described in the same way across data marts.

In a star schema, tables are denormalized to be only one foreign key away from the facts.

Star Schemas vs. Snowflake Schemas

In a star schema, each dimension is only one foreign key away from the fact because the tables are denormalized.   Snowflake schemas are similar to star schemas, but for the normalizing the dimensional tables.   This results in a schema that looks much more like a snowflake:

————————————————————–
|____dim_address__________dim_month____|
|_______\________________/_____________|
|_____dim_user________ dim_date_________|
|_______\_____________/_______________|
|_dim_product —  fac_purchase — dim_order_|
|____________________\________________|
|______________dim_marketing_campaign_|
|______________________\______________|
|__________________dim_channel_________|
————————————————————–

Design considerations

Many believe that the speed of data retrieval is the most important design consideration, and thus prefer star schemas.  Others still feel that normalization allows easier discovery of data.

Many tools that work natively with dimensional models will natively support either/or type of schema.  Snowflake schemas are better with more sophisticated query tools that create a layer of abstraction between the tables themselves and end -users.

You can follow along with the data warehouse post series here.

 

Comments are closed.