Slowly Changing Dimensions (SCD) are the dimensions that change slowly over time, rather than changing at a particular interval of the time. In Data Warehouse, there can be the need for keeping track of such changes as historical data. Thus implementing one of the Slowly Changing Dimension will help to enable its customers in assigning the proper dimension attribute for given date. Examples of such Dimensions can be address, employer, salary, etc.
SCDs are considered and implemented as the most important ETL tasks in tracking the history of the slowly changing records/dimensions.
There are three different types of slowly changing dimensions (SCD). To describe those three types let’s take an example.
Robert is the customer of the Verizon Mobiles. He first lived in Fairfax, Virginia. Thus, his record in the customer table is recorded as follows
Due to some reason Robert moved to Baltimore, Maryland. Now the Verizon Mobiles got a Slowly Changing Dimension scenario. Now they have three choices to record this change.
Type 1: They can replace that particular record of Robert with the new updated record (delete and replace)
Type 2: They can add one more row in the customer table adding the updated data
Type 3: They can modify the same record to reflect the changes for Robert
Let’s discuss on these three scenarios as the three types of SCDs:
- Type 1 Slowly Changing Dimension
In this type of the SCD, only the present data will be maintained/stored in the database. That means no history has been maintained.
In the above example of Robert, the record will be stored as:
Advantage of the SCD Type 1
- Since the history of the data is not maintained, they are easy to handle and are not much complex.
Disadvantage of the SCD Type 1
- As the history of the record is not maintained, the historical data is lost. It will not be possible to retain that historical data in the future
When to Use SCD Type 1
- They should be used when there is no necessity of recording the historical changes in the Data Warehouse.
- Type 2 Slowly Changing Dimension
In this type of the SCD, the history of the dimension will be maintained along with current data. The new record will get its primary key.
Let’s come back to the example of Robert again:
The original record was recorded as:
Since he moved to Maryland, there is change in the name of the State, thus, this change will be recorded as:
Advantage of the SCD Type 2
- This type of SCD allows us to keep all the historical records as well as the present records
Disadvantage of the SCD Type 2
- The ETL process might become Complicated with this SCD type
- Since there will be extra records, it will consume more space in the database, and this might create storage and performance related issues.
When to Use SCD Type 2
- They should be used when there is the necessity of recording the historical changes in the Data Warehouse to track the changes.
- Type 3 Slowly Changing Dimension
In this type of the SCD, the original record will be updated so as to reflect the current changes in the particular record. This type of SCD stores the current updated record as well as partial history. There will also be a column indicating when the present value will become effective.
In our example of Robert, original record was recorded as:
As per the Type 3 Slowly Changing Dimension module, we will now have the following columns:
- Customer ID
- Original State
- Current State
- Effective Date (Date when Robert Moved to Maryland)
Thus, Robert’s record will be stored as:
|Customer ID||Name||Original State||Current State||Effective Date|
Advantage of the SCD Type 3
- This type of SCD allows us to keep some partial history of the record
- Size of the table in the database is not increased since the same record has been updated
Disadvantage of the SCD Type 3
- When the change happens more than one times then, SCD type 3 doesn’t store all the change. For example, on 02/01/2016 if Robert moved to Washington DC from Maryland, then the table will store the change related to Washington DC and the history related to Maryland will be lost.
When to Use SCD Type 3
- This type of SCD can be used when we are sure that the dimension will change for the fixed number of times.