In this demo we are using the Archival table.
In this
approach we are having two table main tables and the historical or Archive
table to maintain the history. Basically we are using the logic of SCD type 2 (Maintaining the history using start date
and end date) only the difference
between them in type two we are storing the historical record in to the same
dimension table but in the SCD Type 4 Approach we are storing the current
active record into the main table and the historical record into the another
Historical table or Archive table.
See the
example.
We have
DimCustomer_scd_4 (main table), DimCustomer_scd_4_Archive (Archive table) and
the DimCustomer_STG_scd_4 (Stage table).
In the
customer dimension below are the records
Table :
DimCustomer_scd_4
ID
|
CUS_ID
|
Name
|
Year
|
Address
|
Start_Date
|
End_date
|
1
|
1
|
Bagesh Kumar Singh
|
2018
|
Pune
|
4/20/2018
|
Null
|
2
|
2
|
Mahesh
|
2018
|
Chennai
|
4/20/2018
|
Null
|
Table : DimCustomer_scd_4_Archive
ID
|
CUS_ID
|
Name
|
Year
|
Address
|
Start_Date
|
End_date
|
|
|
|
|
|
|
|
Now are getting below records in the sating table and we are
updating the customer dimension
Table : DimCustomer_STG_scd_4
CUS_ID
|
Name
|
Year
|
Address
|
1
|
Bagesh Kumar Singh
|
2019
|
Greater Noida
|
3
|
Suresh
|
2020
|
Mumbai
|
Now our Customer dimension will be
Table :
DimCustomer_scd_4
ID
|
CUS_ID
|
Name
|
Year
|
Address
|
Start_date
|
End_date
|
2
|
2
|
Mahesh
|
2018
|
Chennai
|
4/20/2018
|
Null
|
3
|
3
|
Suresh
|
2020
|
Mumbai
|
06/27/2019
|
Null
|
4
|
1
|
Bagesh Kumar Singh
|
2019
|
Greater Noida
|
06/27/2019
|
Null
|
Table : DimCustomer_scd_4_Archive
ID
|
CUS_ID
|
Name
|
Year
|
Address
|
Start_date
|
End_date
|
1
|
1
|
Bagesh Kumar Singh
|
2018
|
Greater Noida
|
4/20/2018
|
06/27/2019
|
When a new record is created in the dimension table in that
case we are storing the Start_date as Today date and Keeping End_date as Null
or any highest default date (12/31/9999). Here we are using the End date as
null.
If the existing records comes with the updated value in that
case first we are we are inserting the record into the archive table with
filling End_date as today date and then deleting the record from the Main
table. In the main table (DimCustomer_scd_4) we always get latest records.
Let’s see the demo.
Here we are creating 3 tables.
CREATE TABLE dimcustomer_stg_scd_4
(
dimcustomer_stg_scd_2_id INT IDENTITY(1, 1) PRIMARY KEY,
cus_id INT,
cus_nm VARCHAR(50),
cus_year VARCHAR(10),
cus_add VARCHAR(500)
)
CREATE TABLE dimcustomer_scd_4
(
dimcustomer_scd_4_id INT IDENTITY(1, 1) PRIMARY KEY,
cus_id INT,
cus_nm VARCHAR(50),
cus_year VARCHAR(10),
cus_add VARCHAR(500),
start_date DATE,
end_date DATE
)
CREATE TABLE dimcustomer_scd_4_archive
(
dimcustomer_scd_4_archive_id INT IDENTITY(1, 1) PRIMARY KEY,
cus_id INT,
cus_nm VARCHAR(50),
cus_year VARCHAR(10),
cus_add VARCHAR(500),
start_date DATE,
end_date DATE
)
|
Inserting few records into the Stgaing and main table.
INSERT INTO dimcustomer_scd_4
(cus_id,
cus_nm,
cus_year,
cus_add,
start_date)
VALUES (1,
'Bagesh Kumar Singh',
'2018',
'Pune',
'4/20/2018');
INSERT INTO dimcustomer_scd_4
(cus_id,
cus_nm,
cus_year,
cus_add,
start_date)
VALUES (2,
'Mahesh',
'2018',
'Chennai',
'4/20/2018');
INSERT INTO dimcustomer_stg_scd_4
(cus_id,
cus_nm,
cus_year,
cus_add)
VALUES (1,
'Bagesh Kumar Singh',
'2022',
'Greater Noida');
INSERT INTO dimcustomer_stg_scd_4
(cus_id,
cus_nm,
cus_year,
cus_add)
VALUES (3,
'Suresh',
'2022',
'Mumbai');
|
See the records into the table.
Below script is used to implement the SCD type : 4
--insert the updated record into archive and delete it from Main
table
DELETE main
output deleted.cus_id,
deleted.cus_nm,
deleted.cus_year,
deleted.cus_add,
deleted.Start_date,
Getdate()
INTO dimcustomer_scd_4_archive
FROM
dimcustomer_scd_4 main
JOIN dimcustomer_stg_scd_4 stg
ON stg.cus_id = main.cus_id
WHERE stg.cus_add <> main.cus_add
AND main.end_date IS NULL;
-- Insert the records into the main table from stage table.
INSERT INTO dimcustomer_scd_4
(cus_id,
cus_nm,
cus_year,
cus_add,
start_date)
SELECT stg.cus_id,
stg.cus_nm,
stg.cus_year,
stg.cus_add,
Getdate()
FROM
dimcustomer_stg_scd_4 stg
LEFT JOIN dimcustomer_scd_4 main
ON stg.cus_id = main.cus_id
WHERE main.dimcustomer_scd_4_id IS NULL;
|
Now running this query.After run this query sees the records
into the table.
For the cus_id =1 we are getting the updated value in this
case previous record we are inserting this record into the archive table and
deleting that record into the main table and inserting the latest record into
the main table.