Thursday 25 July 2019

Extract Data from the table into the CSV file in Chunks of 10000 using SSIS Package


We have a table and we need to extract the data from it, into the CSV file in chunks of 10000 records. Suppose we in the file if we have 50k records in this case 5 CSV file will be created.
Let’s see the demo.
Taking Execute SQL task   

With the help of this task, we will get the number of records in the table from which we are pulling the data. 
Taking result set as a
single row because it will return one row as count.
Storing this value into a variable.


Once we get the number of records we need to iterate and extract the records from the table and write it into the CSV files.
For iteration I am taking for loop container. Before it, I am taking some variables which are used in for loop container.

Now I am taking for loop container
                 
           
Here I am iterating until and unless our flag is true.
In this container I am taking script task to variables assign
           
Below are the variables.       
     
In the script I am assigning these variables
int ck = int.Parse(Dts.Variables["User::chunks"].Value.ToString()) + 1; ;
            Dts.Variables["User::chunks"].Value = ck;
            Dts.Variables["User::n"].Value = int.Parse(Dts.Variables["User::number_of_row_in_file"].Value.ToString())+
                int.Parse(Dts.Variables["User::n"].Value.ToString());

            Dts.Variables["User::Number_of_records"].Value = int.Parse(Dts.Variables["User::Number_of_records"].Value.ToString())
                - int.Parse(Dts.Variables["User::number_of_row_in_file"].Value.ToString());

            if(int.Parse(Dts.Variables["User::Number_of_records"].Value.ToString())>0)
            {
                Dts.Variables["User::Flag"].Value = true;
            }
            else
            {
                Dts.Variables["User::Flag"].Value = false;
            }
To pulling the records from the database I am using pagination concept. Here I am creating an SP in that I am setting the offset row values and next records.


In the above C# code I and setting the offset row and next records value which are pass in the next Execute SQL task to pull the data from the database.
SP code
CREATE PROCEDURE  P_page_split
 (
 @PageNo INT,
 @RowCountPerPage INT
 )
AS
Select
SalesOrderNumber,
SalesAmount,
UnitPrice,
ExtendedAmount,
TaxAmt
from FactInternetSales
order by SalesOrderNumber
OFFSET (@PageNo - 1) * @RowCountPerPage ROWS
FETCH NEXT @RowCountPerPage ROWS ONLY

Now I am taking Execute SQL task and calling this SP.                 
Creating a connection             
Here I am calling SP with Two parameter
Parameter 1 : Page number (User:: chunks – start from 1. it will be calculated based on the number of records in the table in the above script task)
Parameter 2: Row count per page (User::number_of_row_in_file -10000)         

Now we need to store this result set in the object. 

Once we get the result set we need to write these data into the CSV file.
For writing the data into CSV file I am using the script task.   
Variable we used                   


Below are the C# code to read the object variable and write it into the csv files.
string file_path = Dts.Variables["User::File_path"].Value.ToString() + Dts.Variables["User::chunks"].Value.ToString()+"_"+ DateTime.Now.ToString("yyyyMMddHHmmss") + ".csv";

            using (StreamWriter writer = new StreamWriter(@file_path, true))
            {
                writer.WriteLine("SalesOrderNumber,SalesAmount,UnitPrice,ExtendedAmount,TaxAmt");
                OleDbDataAdapter A = new OleDbDataAdapter();
                System.Data.DataTable dt = new System.Data.DataTable();
                A.Fill(dt, Dts.Variables["User::DataSet"].Value);
            
                foreach (DataRow row in dt.Rows)
                {
                    object[] array = row.ItemArray;

                   
                    string SalesOrderNumber;
                    string SalesAmount;
                    string UnitPrice;
                    string ExtendedAmount;
                    string TaxAmt;

                    SalesOrderNumber = array[0].ToString();
                    SalesAmount = array[1].ToString();
                    UnitPrice = array[2].ToString();
                    ExtendedAmount = array[3].ToString();
                    TaxAmt=array[4].ToString();

                    writer.WriteLine(SalesOrderNumber + "," + SalesAmount + "," + UnitPrice + "," + ExtendedAmount +","+ TaxAmt);
                }

                writer.WriteLine("Number of records written into this file : " + dt.Rows.Count.ToString());
            }
Build it and close C# window.
Now my package I ready to run.   

Before running this package. Records in the database.
   

And my chunk size is 10000. So it will create 7 files.
6*10000=60000 and 1 file having 398 records.
See the file destination folder.               
Now running this package.


Package executed successfully.
Now see the destination.
 

7 files created.
Number of records into file 1 to 6 are


And in the 7th file 398
    

Hope this will help to understand one use of the pagination.

1 comment:

If you have any doubt, please let me know.

Popular Posts