Below are the major difference between Merge and Union All transformation.
Ø Merge Transformation always accepts sorted input data. For union all no need to sort input.
Ø Merge Transformation only accepts two inputs. But Union all accepts multiple inputs.
Ø Merge transformation will produce sorted output but union all give unsorted data.
Let’s see the example.
Merge Transformation
Taking Data flow task.
Double click on the data flow task.
In Data flow, I am taking two input source and configuring both sources.
Now I am taking Marge Transformation.
Now I am taking Merge Transformation and merging both sources into merge transformation
It is throwing an error.
It asks for sorting.
Now I am using sorting transformation. Let’s see.
Configure the sort transformation. And also configure the Merge transformation.
I am taking Multicast transformation to see the result.
Now I am executing this package.
See the data. It is in sorted order.
Let’s see if we use more than two inputs in Merge Transformation.
If both sources are the databases it will be better to sort the data in the database itself. It will be faster because all we know that Sort transformation is full blocking transformation. When we short data in the database also we getting the same error to sort data before Merge transformation. To overcome that issue we need to enable the IsSorted properties in the oledb source.
When we adding 3rd input in Merge transformation we are getting below error.
We can use only two inputs for the Merge transformation.
Union All Transformation
It will take multiple inputs. It accepts unsorted inputs.
Taking Data flow task
Double click on the data flow task. Configuring the inputs source.
No need to sort input data.
Now I am executing this packing. See the output.
Not sorted output.