The Row Sampling transformation is used to obtain a randomly
selected subset of an input dataset. We can specify the exact size of the
output sample and, optionally, a seed for the random number generator. For
example, if I want to select 100 rows from a table as a sample for analysis,
I can use the Row Sampling transformation.
Let's see an example of Row Sampling.
Open SSDT and add a Data Flow Task.
Double-click the Data Flow Task and, in the Data Flow pane,
add an OLE DB Source.
Now configure the OLE DB Source as required.
Click OK.
Now I add the Row Sampling transformation.
Double-click the Row Sampling transformation to open
the Row Sampling Transformation Editor.
Select the number of rows that we want in the sample.
We can also name the sample output and the unselected output (the rows
that are not in the sample).
Leave the random seed check box unchecked. If we check it and specify a
seed, the transformation returns the same records on every run.
Now click Columns to select the columns.
Click OK.
The Row Sampling transformation configuration is done.
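The effect of these settings can be illustrated outside SSIS with a small Python sketch. This is only an illustration of the behavior described above, not how SSIS implements the transformation internally; the function name `row_sample` and the sample data are hypothetical.

```python
import random

def row_sample(rows, sample_size, seed=None):
    """Split rows into (selected, unselected), like the two outputs of a row sampler."""
    # seed=None gives a different sample on each run; a fixed seed repeats the sample.
    rng = random.Random(seed)
    n = min(sample_size, len(rows))
    chosen = set(rng.sample(range(len(rows)), n))
    selected = [r for i, r in enumerate(rows) if i in chosen]
    unselected = [r for i, r in enumerate(rows) if i not in chosen]
    return selected, unselected

rows = list(range(1, 1001))            # stand-in for 1000 source rows
first, rest = row_sample(rows, 50, seed=42)
again, _ = row_sample(rows, 50, seed=42)

print(len(first), len(rest))           # 50 950
print(first == again)                  # True: same seed, same sample
```

With `seed=None` the two calls would normally return different samples, which matches leaving the random seed check box unchecked.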
Now I add a Multicast transformation so that I can view the Row
Sampling output.
When connecting it, select the sampling selected output as the output.
If we also want to see the unselected rows, we add
another Multicast transformation for that output.
Click OK.
Now the package is ready to run.
To see the result, I am using a data viewer.
For the sampling selected output, the result will be
50 rows.
The package executed successfully.
Note
Row Sampling is a fully blocking transformation. If your source
is OLE DB, it is better to use SQL and get the random rows from the
back end. This saves time and improves performance.
Row Sampling works well when the source is a flat file or a CSV
file.
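For the OLE DB case in the note, the idea of sampling in the back end can be sketched as follows. This is a minimal example that uses an in-memory SQLite database so that it runs anywhere; the table name `source_table` and its columns are hypothetical. On SQL Server, a commonly used equivalent is ordering by `NEWID()` with `TOP`, shown in the comment.

```python
import sqlite3

# In-memory database with a hypothetical source table of 1000 rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE source_table (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO source_table (id, name) VALUES (?, ?)",
    [(i, f"row {i}") for i in range(1, 1001)],
)

# SQLite syntax. On SQL Server a commonly used equivalent is:
#   SELECT TOP (100) * FROM dbo.SourceTable ORDER BY NEWID();
sample = conn.execute(
    "SELECT id, name FROM source_table ORDER BY RANDOM() LIMIT 100"
).fetchall()

print(len(sample))  # 100
conn.close()
```

Only the sampled rows leave the database this way, so the data flow does not have to read the whole table before sampling.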