What is a Fact Table?

A fact table is a table that joins dimension tables with measures.  For example, Lets say you wanted to know the time worked by employees, by location, by project and by task.  If you had a dimension for employees, location, project and task you would create a composite primary key using these foreign keys and add an additional column for the time worked measure. (more on measures in a little bit)

Keep in mind that fact tables are huge unlike dimension tables.  Fact tables are usually built to contain a hundred thousand records on the low side up to billions of records on the high side.  Therefore, these tables must be normalized to be efficient.

A fact table is normalized when it consists of the following two things:

  1. A composite primary key using the foreign keys to the dimension tables.
  2. Measure(s) used for numerical analysis.

 

Fact tables contain the data corresponding to a particular business process. Each row represents a single event associated with that process and contains the measurement data associated with that event. For example, a retail organization might have fact tables related to customer purchases, customer service telephone calls and product returns. The customer purchases table would likely contain information about the amount of the purchase, any discounts applied and the sales tax paid.

The information contained within a fact table is typically numeric data and it is often data that can be easily manipulated, particularly by summing together many thousands of rows. For example, the retailer described above may wish to pull a profit report for a particular store, product line or customer segment. The retailer can do this by retrieving information from the fact table that relates to those transactions meeting the specific criteria and then adding those rows together.

Fact tables contain keys to dimension tables as well as measurable facts that data analysts would want to examine. For example, a store selling automotive parts might have a fact table recording a sale of each item. The fact table of an educational entity could track credit hours awarded to students. A bakery could have a fact table that records manufacturing of various baked goods.

Fact tables can grow very large, with millions or even billions of rows. It is important to identify the lowest level of facts that makes sense to analyze for your business this is often referred to as fact table “grain”. For instance, for a healthcare billing company it might be sufficient to track revenues by month; daily and hourly data might not exist or might not be relevant. On the other hand, the assembly line warehouse analysts might be very concerned in number of defective goods that were manufactured each hour. Similarly a marketing data warehouse might be concerned by the activity of a consumer group with a specific income-level rather than purchases made by each individual.

Fact Table Grain

When designing a fact table, developers must pay careful attention to the grain of the table — the level of detail contained within the table. The developer designing the purchase fact table described above would need to decide, for example, whether the grain of the table is a customer transaction or an individual item purchase. In the case of an individual item purchase grain, each customer transaction would generate multiple fact table entries, corresponding to each item purchased. The choice of grain is a fundamental decision made during the design process that can have significant impact on the business intelligence effort down the road.

Leave a Reply

Your email address will not be published. Required fields are marked *