Snowflake Schema

A snowflake schema is a type of data warehouse design that is common in BI implementations. You may hear this mentioned quite a bit by the database side team, as well as the concept of star schemas.

These schemes can be created based on normalization forms. Normalization refers to the level of redundancy in the data stored in the database at the lookup levels. A design with no repeated data is called normalized. This would mean that there is a separate lookup table for every attribute in a hierarchy. A data structure with total redundancy would be called denormalized. A special case of this is called the star schema. The name comes from the visual look that happens when lookup levels are added on top of a fact table. Here’s an example of a completely normalized database structure:

The downside of normalized data is that it has the potential to create multiple joins, especially when attempting to report on data at the top level attributes. These additional joins can affect the SQL run time when generating the reports, which can increase performance. However, the table sizes are smaller, and only one set of data needs to be updated, if a high level attribute changes, for example.

Using a completely denormalized approach can reduce the number of joins and thus speed up the SQL performance, but you may quickly increase table sizes, especially if there are a lot of attribute elements. In addition, multiple tables may need to be updated in the event of an attribute change.

In practice, a mixed level of normalization may need to be used. This can offer a best of both worlds strategy to keep performance and maintenance in line. Another option is a special case of the denormalized view. This design only has one lookup table for all the attribute levels, and is called a star schema.

Only a thorough review of the data can see what the design requirements would be. From a business intelligence standpoint, most toolsets can handle various levels of snowflake schema design. Some vendors’ tools, such as Microstrategy, tend to perform better over a denormalized environment.

Go from Snowflake Schema to home

Contact Us * About Us

Copyright 2011-2017