Monday 7 October 2019

Validate email IDs (data cleaning) before loading files into the database


We receive files that need to be loaded into the database. Before loading a file, we need to validate the email address in each record. Only records with a valid email ID are loaded; records with an invalid email ID are skipped.

See the example  
  
In this example, the email IDs for IDs 1 and 4 are correct, but the email IDs for IDs 2 and 3 are incorrect. The expectation is to load employees 1 and 4 and ignore IDs 2 and 3.
To load this file, I am using a Data Flow Task.
 

Use a Flat File Source as the source.


Next, add a Script Component transformation to validate the email.


Click OK.
Here I am selecting the email column as the input column.


Create an output column named Valid_Email with the Boolean data type.


Now click Edit Script.
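// Needed at the top of the script for the Regex class.
using System.Text.RegularExpressions;

// Built once and reused for every row. ^ and $ force the whole value to match,
// and \. matches a literal dot before the top-level domain.
private static readonly Regex EmailRegex =
    new Regex(@"^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,4}$");

public override void Input0_ProcessInputRow(Input0Buffer Row)
{
    // A NULL email is treated as invalid; otherwise the pattern decides.
    Row.ValidEmail = !Row.EmailID_IsNull && EmailRegex.IsMatch(Row.EmailID);
}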
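A pattern like this can be tried outside SSIS before it goes into the package. The following is a minimal console sketch, not the package code itself. It assumes an anchored variant of the pattern (^ and $, with the dot before the top-level domain escaped) so that the whole value has to match. The class name EmailCheck, the method IsValidEmail, and the sample addresses are hypothetical and are not taken from the original file.

```csharp
using System;
using System.Text.RegularExpressions;

public static class EmailCheck
{
    // Anchored with ^ and $ so the entire value must match; \. is a literal dot.
    private static readonly Regex EmailRegex =
        new Regex(@"^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,4}$");

    public static bool IsValidEmail(string value)
    {
        // Null or empty input is treated as invalid.
        return !string.IsNullOrEmpty(value) && EmailRegex.IsMatch(value);
    }

    public static void Main()
    {
        // Hypothetical sample values, made up for this sketch.
        string[] samples =
        {
            "user@example.com",
            "user@example",
            "user.example.com",
            "user@@example.com"
        };

        foreach (string s in samples)
        {
            Console.WriteLine("{0} -> {1}", s, IsValidEmail(s));
        }
    }
}
```

With these samples, only the first address is reported as valid. Note that the {2,4} limit rejects top-level domains longer than four letters, so it may need to be relaxed for real data.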
     
Close the script editor and click OK.
Now add a Conditional Split transformation and create a case with the condition Valid_Email == TRUE. Rows that satisfy the condition have a valid email ID; all other rows go to the default output as invalid.
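The routing done by the Conditional Split can be pictured in plain C#. This is only an illustrative sketch of the idea, not how SSIS implements it. The types EmailRow and SplitSketch and the sample rows are hypothetical; the flag values are set by hand here, whereas in the package they come from the script component.

```csharp
using System;
using System.Collections.Generic;

public class EmailRow
{
    public int Id;
    public string Email;
    public bool ValidEmail; // flag produced by the validation step
}

public static class SplitSketch
{
    // Sends each row to one of two lists, like the case output and the
    // default output of a Conditional Split.
    public static void Split(IEnumerable<EmailRow> rows,
                             List<EmailRow> valid,
                             List<EmailRow> invalid)
    {
        foreach (EmailRow row in rows)
        {
            if (row.ValidEmail)
            {
                valid.Add(row);
            }
            else
            {
                invalid.Add(row);
            }
        }
    }

    public static void Main()
    {
        // Hypothetical rows, made up for this sketch.
        var rows = new List<EmailRow>
        {
            new EmailRow { Id = 1, Email = "a@example.com", ValidEmail = true },
            new EmailRow { Id = 2, Email = "not-an-email", ValidEmail = false }
        };

        var valid = new List<EmailRow>();
        var invalid = new List<EmailRow>();
        Split(rows, valid, invalid);

        Console.WriteLine("valid: {0}, invalid: {1}", valid.Count, invalid.Count);
    }
}
```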
 

Now I am adding two Multicast transformations, one for each output of the Conditional Split, for testing.
 

Now run the package.
 

The package produces the expected result.

