Sponsor
In this tutorial we will learn a new SAP Data services transform, known as Table Comparison Transform and we will see how we may use this transform to implement "slowly changing dimension" (SCD) Type - I. Like before, we have added a video tutorial at the end to perform a little hands-on.
This is one of the SAP BusinessObjects Data Services native transform (For a full list of all transforms, you can see details of SAP Data services transforms) that helps to perform a fundamental ETL operation - comparing two data sets.
Comparing two data sets, this transform can generate the difference between them as a resultant dataset with each row of the result flagged as INSERT, UPDATE, or DELETE.
While loading data to a target table, this transform can be used to ensure rows are not duplicated in a target table and hence is very helpful to load a dimension table.
In the video below we have demonstrated how to implement SCD - I. We strongly encourage you to watch the video along with reading this text. The basic logic for implementing SCD-I is, we will compare each record coming from source to the records in target and if the record exists in target, we will update the target record with the values from the source record. However, if the record does not exist, we will insert the record to target.
"Table Comparison Transform" help to compare the source records with target. You have to provide the source data (input schema) and the columns based on which you want to perform the comparison. Generally we specify the primary key column (or all the columns forming the composite primary key) so that it can perform the comparison based on the primary key column. Further to this, we can also specify "Compare Columns" - which is a list of columns that we intend track the changes of. As an example, let's say we do not want to update a customer's record even if a certain column has changed in the source and only want to update the target record if some other columns (e.g. column X, Y and Z) have changed. In that case, we have to put column X, Y and Z in "Compare Columns" list.
There are three methods for accessing the comparison table namely Row-by-row select, Cached comparison table and Sorted input. Below is the brief on when to select which option.
Saurav Mitra is a business intelligence professional.
Connect with the author via Google+