It is similar to Row sampling but it gives the % output of
the Source records. Suppose in source having the 600 records if we use 10% of
row sampling then it will return the 60 rows but if we use row sampling then it
will return the fix number of the row.
Let’s see the example of Row sampling
Open SSDT and take Data flow task
Double click on the Data flow task and the Data flow pane
takes source as OLEDB
Now configure the required configuration of OLEDB Source
Click ok.
Now I am taking the percentage sampling
Double click on the transformation
Select the
Percentage of rows which we want for sampling.
We can give
the Sample output name as well as unselected rows (not sample).
Leave the
Random seed check is unchecked. If we checked it then always it will show the
same records.
Taking the Multicast to see the result
Select the sampling selected Output. Selected Unselected
output is optional.
To seeing the result I am enabling the Data viewer.
Now it will be ready to run
See the total records on the table is 606 and the 10%
sampling showing 69 (result will be variable so no worries)
Get the expected result.
Note
Percentage
Sampling is full blocking transformation. If your source is OLEDB then it will
be better to use sql and get the % rows from the backend .it will save the
time. Performance will be increase.
It will good
when you are using source as Flat files, CSV files.
No comments:
Post a Comment
If you have any doubt, please let me know.