Sunday, 14 July 2019

Use of staging database or table in SSIS


SSIS is the ETL tool. The main job of ETL is to extract the data from the heterogeneous sources and do the transformation (perform the data clean and apply the business rules) and finally loading these data to the destination (Data warehouse or Data marts or files etc.)

The source may be an SAP system or mainframe system or other systems. These systems are live in production. It is not possible to hit these servers during the business hours to pull the data. We are pulling the data from these systems during the overnight. System admin allows us some time window in that time window we need to extract the data from the source.

In such type of scenario, we only extract the data from the source and we are not applying and business rules only loading data from these systems and loading into the staging table. Another set of the job will run and pull the data from the staging table and do the data clean and apply the business run and load it into the destination. In such type of scenario, we are using the stage database or table.

Another use of the staging table, when we are working with large CSV or excel file. In this case, first of all, we load the file data into the staging table and then from the stage, we will use for the next step.
  

Popular Posts