Monday 8 February 2016

Remove Duplicate Records in SSIS Package

Suppose you are having the flat file in that file there are duplicates records and you want to load only unique record on the destination and duplicate record on the other destination.
Here I am going to take a simple example where we have a flat file which is having some duplicate records and I want to load unique records in a table and duplicate records on other table.
Below is the flat file.


Now I am creating two tables one is the tblEmp (having unique records) and other tbldupEmp (having duplicate records).
CREATE TABLE [dbo].[tblEmp]
       (
       [EmpID] [nchar](10) NOT NULL PRIMARY KEY ,
       [FName] [nchar](50) NULL,
       [Lname] [nchar](30) NULL,
       [Mob] [nchar](10) NULL,
       [Address] [nchar](100) NULL,
        )
Inserting some records
  Insert into tblEmp values('1','Bagesh','Singh','888880XXXX','pune')
  Insert into tblEmp values('2','Rajesh','kumar Singh','888880XXXX','Delhi')
  Insert into tblEmp values('3','Mahesh','Kumar','888880XXXX','Chennai')
  Insert into tblEmp values('4','Ganesh','Panday','888880XXXX','Mumbai')
  Insert into tblEmp values('5','Naresh','yavad','888880XXXX','Buxar')
  Insert into tblEmp values('6','Ram','Kumar','888880XXXX','Patna')
--duplicate table
CREATE TABLE [dbo].[tbldupEmp ]
       (
       [EmpID] [nchar](10) NULL,
       [FName] [nchar](50) NULL,
       [Lname] [nchar](30) NULL,
       [Mob] [nchar](10) NULL,
       [Address] [nchar](100) NULL,
       [NumberOfRecords] int
        )


Let’s start
Taking data flow task


Double click on the Data Flow task
Take source as flat file
Create the file connection. See the flat file is comma (,) delimiter so here you select the comma delimiter.


Click ok. Now Map the columns.


Click on ok. Now file connection has been created.
Now I am taking Aggregate transformation.


Double click on this transformation.
Now checked the all columns for find the unique records. You can rename the Count Output name. Here I am renaming the Output column name is NumberOfRecords.


Click Ok.
Now I am taking conditional split transformation. Here I am checking if the NumberOfRecords = 1 then it means that record is unique otherwise records are duplicate.


Double click on Conditional split


Click ok.
Taking two destination one for unique record and second is for duplicate records.


Creating a connection and mapping the columns for both destinations.


Now package is ready to run.
Before run the package table value
 

Let’s run this package and see the result.


Oh!!! Package is completed successfully.
Flat file data


Let’s see the records in table


Hope this will help you.
Thanks!!!



No comments:

Post a Comment

If you have any doubt, please let me know.

Popular Posts