Using this approach we are maintaining the history of the changing attributes. We are maintaining the history of the records by different ways.
Ø
Using start date and end date
Ø
Using IsActive flag (True & False)
Ø
Using Yes or No values
Let’s see
the example
In this demo we will see SCD type using
Start date and End date
Maintaining the history using start date
and end date
In the
customer dimension below are the records
ID |
CUS_ID |
Name |
Year |
Address |
Start_Date |
End_date |
1 |
1 |
Bagesh Kumar Singh |
2018 |
Pune |
4/20/2018 |
Null |
2 |
2 |
Mahesh |
2018 |
Chennai |
4/20/2018 |
Null |
Now are getting below records in the sating table and we are
updating the customer dimension
CUS_ID |
Name |
Year |
Address |
1 |
Bagesh Kumar Singh |
2019 |
Greater Noida |
3 |
Suresh |
2020 |
Mumbai |
Now our Customer dimension will be
ID |
CUS_ID |
Name |
Year |
Address |
Start_date |
End_date |
1 |
1 |
Bagesh Kumar Singh |
2018 |
Pune |
4/20/2018 |
06/27/2019 |
2 |
2 |
Mahesh |
2018 |
Chennai |
4/20/2018 |
Null |
3 |
3 |
Suresh |
2020 |
Mumbai |
06/27/2019 |
Null |
4 |
1 |
Bagesh Kumar Singh |
2019 |
Greater Noida |
06/27/2019 |
Null |
When a new record is created in the dimension table in that
case we are storing the Start_date as Today date and Keeping End_date as Null
or any highest default date (12/31/9999). Here we are using the End date as
null.
If the existing records comes with the updated value in that
case first we are updated the Today date into the end_date column and then
inserting the new record in the dimension table with start_date as Today date
and End_date as null.
Those records are having End_date as null it means that
records are the Active records.
Below are the tables
CREATE TABLE dimcustomer_stg_scd_2 (
dimcustomer_stg_scd_2_id INT IDENTITY(1, 1) PRIMARY KEY, cus_id INT, cus_nm VARCHAR(50), cus_year VARCHAR(10), cus_add VARCHAR(500) ) CREATE TABLE dimcustomer_scd_2 ( dimcustomer_scd_2_id INT IDENTITY(1, 1) PRIMARY KEY, cus_id INT, cus_nm VARCHAR(50), cus_year VARCHAR(10), cus_add VARCHAR(500), start_date DATE, end_date DATE ) |
Inserting some records in to the stage and DimCustomer
table.
INSERT INTO dimcustomer_scd_2 |
Now see the records in both tables.
Using the below script we are loading the data from stage to
main table.
-- This inserts another record to the dimension for SCD Type
changes INSERT INTO dimcustomer_scd_2 (
cus_id,
cus_nm,
cus_year,
cus_add, start_date,
end_date ) SELECT cus_id, cus_nm, cus_year, cus_add, start_date, end_date FROM (MERGE dimcustomer_scd_2 TT using
dimcustomer_stg_scd_2 ST ON (
tt.cus_id= st.cus_id) -- This inserts new records in the dimension table WHEN NOT matched THEN INSERT (
cus_id,
cus_nm,
cus_year,
cus_add,
start_date,
end_date ) VALUES (
cus_id,
cus_nm,
cus_year,
cus_add,
Getdate(),
NULL ) -- This marks the older record to be outdated for SCD Type 2 WHEN matched AND tt.cus_add <> st.cus_add AND tt.end_date IS NULL THEN UPDATE SET tt.end_date = Getdate() output $action Action_Taken,
st.cus_id,
st.cus_nm,
st.cus_year,
st.cus_add,
Getdate() AS start_date,
NULL AS end_date ) AS merge_out WHERE merge_out.action_taken = 'UPDATE'; |
Now running this script.
Cus_id 1 is updated with new value and old record end date
with updated with current date.
Let’s see again cus_id= 1 is coming with updated address then current records will become the historical records and stage record will be the current record.
Running the above script.
Get the expected result.
No comments:
Post a Comment
If you have any doubt, please let me know.