Finding the Grain in Your Database

This page is in support of my course “Star Schema Foundations” on Pluralsight. For more information you can visit Pluralsight.com for a free trial or my author page to see all of my courses.

Module Notes

Finding the grain is key in early going of your Star Schema development. I recommend a three step approach to finding the grain:

Understand your business process
Define the grain at the atomic level
Analyze your data sources for a reality check

Understanding your Business Process

In this step you'll want to work closely with your internal customers to truly understand their need. Documentation is good if you build it together. The days of sending a giant requirements document to someone are over, or should be. Rather, I recommend that you sit down w/ your customer and talk through their process. If possible, actually go sit with the people executing this process and take notes on points that your solution could help them improve.

Define the Grain at the Atomic Level

The atomic level is the lowest level possible. This is ideal because from there we can derive any aggregation. This is also why transaction fact tables are the most expressive types of Star Schemas (ref).

An example would be the invoice line item, not the invoice header. Capture the data at the actual grain at which it occurs. You scan each boarding pass for the airplane, not the entire flight at once. Later you may want to make higher level aggregations for easier analysis which is great, just don't lose the detail.

Analyze your Data Sources for a Reality Check

How a business person describes their process and how the data is stored are generally different. Sometimes you'll find many people use local Excel spreadsheets for where their systems don't support their data needs. It's critical that you understand all the sources for your Star Schema so you can accurately represent their process in the database.

Star Schema Foundations Downloads

References

Kimball Group Defining the Grain (in a Star Schema)