Saturday 21 May 2022

Implementation of slowly changing Dimension Type-1 Overwriting the old value using SQL Script

Using this approach we are updating the existing records with updated value without maintaining any history. Insert a new record into the dimension table if that record does not exist in the dimension table. It is very easy and saving huge amount of space because we are not maintaining any history.

For example

In the customer dimension below are the records

ID

Name

PAN_Number

Aadhar_Number

1

Bagesh Kumar Singh

12345

44444

2

Mahesh

22222

66666

Now are getting below records in the sating table to update the PAN  and Aadhar number in the customer dimension

ID

Name

PAN_Number

Aadhar_Number

1

Bagesh Kumar Singh

55555

33333

3

Suresh

77777

88888

Now our Customer dimension will be like below. PAN and Aadhar number are not updated for the customer ID 1.

ID

Name

PAN_Number

Aadhar_Number

1

Bagesh Kumar Singh

55555

33333

2

Mahesh

22222

66666

3

Suresh

77777

88888

In the above load we get one existing ID with updated value and one new record. Here is not mainlining any history. If anyone want to know, what was the Pan or Aadhar number of Bagesh (ID: 1) previously. We can’t get it. In this approach only we are getting latest value in the dimension table.

We are using such approach where we know that no historical information is required in the Data ware house and reports.

Advantage of Type 1

·         This is the easiest way to handle the SCD problem.

·         It is saving Hug amount of spaces in the database because we are not saving the Historical information of the dimension.

Disadvantage of Type 1

·         We are not having the historical information. In case someone updating the dimension records we may not able to recover the updated information.

 Let’s see the implementation of SCD Type 1 using SQL Script.

 There are multiple way to implement it but here we are using below method to load the data in Dimension table.

Ø  Using Merge statement

Ø  Using Inner and Left join

o   Inner join to update the existing records

o   Left join to identify the new records and inserting these records in the dimension table.

 

Using Merge statement

Update the record if cus_id is exist in destination table and any changes in PAN or Aadhar Number and insert the destination if no record is exits.

MERGE dimcustomer T

using dimcustomer_stg S

ON T.cus_id = s.cus_id

WHEN matched AND ( T.pan_num <> S.pan_num

                                                                OR T.aadhar_num<>s.aadhar_num) THEN

  UPDATE SET T.pan_num = S.pan_num,

             T.aadhar_num = s.aadhar_num

WHEN NOT matched BY target THEN

  INSERT

  VALUES (S.cus_id,

          S.cus_nm,

          S.pan_num,

          S.aadhar_num);

Records in the in Dimcustomer and staging table.

  

In stage we are getting one record to update and one new record. Now we are running this script load the data into the designation table.   


Two rows are affected. See the record of DimCustomer table.  


Pan and Aadhar both are updated successfully.

Using Inner and Left join

Ø  Using Inner and Left join

o   Inner join to update the existing records

o   Left join to identify the new records and inserting these records in the dimension table.

Query to update the record

UPDATE dim

SET    dim.pan_num = stg.pan_num,

       dim.aadhar_num = stg.aadhar_num

FROM   dimcustomer dim

       JOIN dimcustomer_stg stg

         ON dim.cus_id = stg.cus_id

WHERE  dim.pan_num <> stg.pan_num

        OR dim.aadhar_num <> stg.aadhar_num;

Using Left join to identify the new record to insert the new record into the destination table.

INSERT INTO dimcustomer

            (cus_id,

             cus_nm,

             pan_num,

             aadhar_num)

SELECT stg.cus_id,

       stg.cus_nm,

       stg.pan_num,

       stg.aadhar_num

FROM   dimcustomer_stg stg

       LEFT JOIN dimcustomer dim

              ON dim.cus_id = stg.cus_id

WHERE  dim.dimcustomer_id IS NULL;

Before running the script records in the table.

 

Now we are running the script

   

See the record in the destination table.

 

Get the expected result.


No comments:

Post a Comment

If you have any doubt, please let me know.

Popular Posts