Business Magazine

ETL Testing and Teradata Control Framework Widely Called as TCF Framework

Posted on the 28 March 2014 by Asik Ali @Asikali077

Hello All,

Hope you have tested yourself on ETL Testing skills @ theoretical level at least. In this post let me explain about the Teradata Control Framework and the significance of testing role in it.

I am thankful to all the viewers from 52 countries and I request you people to comment how my blog is useful to you guys and is it Good / Bad / Ugly ?

The basic difference between conventional ETL Framework and TCF framework is, TCF is fully automated. Everything that we wanted will be given by a macro written in Excel sheet. TCF is very much helpful for all the developers because it reduces the coding effort.

TCF has three patterns and based on our requirement we need to choose which pattern will fit for our solution.

  1.    Full Apply Patter
  2.    Delta Apply Pattern
  3.    Insert & Append patter

The solution architect will decide what pattern they want to follow and cascade them to Data designers.

  •   If your target tables are not time variant and you are going to have only inserts then Full Apply    pattern will be applicable.
  •   If your target table is time variant then Delta Apply pattern will be applicable
  •   Insert & Append will be applicable for the target tables where you will not do delta, just append the new values upon the old values and just inserts

We as tester’s needs to understand the patterns and test the patterns are working as expected. We have Oracle control tables to maintain and control the ETL process likewise we have many control tables in TCF to maintain and control the ETL Process.

We have around 6 columns will be loaded by this TCF control tables in the matter of maintain the TCF process. We as testers need to verify these column values are populated from corresponding control tables.

In conventional ETL architecture we have [File to Staging table to Intermediate table to Target table], and we will maintain VARCHAR in all Staging and Intermediate tables but in TCF we will have [File to Staging table-1 to Source Image table to Intermediate table to target table]. In staging table we will do the Type Casting – changing the Data Types as per the Target table. So in TCF except Staging tables we will not keep all the columns as VARCHAR. As testers we need to verify the data type conversion is achieved as per the specification. These Data typing SQL are part of the TCF and will be generated using the TCF macros.

Unknown records are now will be updated in the Source Image tables only, so from Source Image table to IM table transformation, we do not want to verify for Null, Blank, Space, 0s.

  1.  Source Image table DDLs are generated by the Excel macro, so we need to check all the data types and the length of any columns are as per specifications.
  2.  We need to ensure the ETL is working as per the Pattern chosen – like we chose Fully Apply and if  the records are loaded for Delta Apply then it’s a mistake.
  3.  In conventional staging tables we do not keep historical records, but if you would have chosen delta apply pattern then Source Image tables will have Expiry and Inserts.
  4.  In TCF, source Image tables the copy records are treated as Delta records in Delta Apply patterns
  5.  If you are testing Soft Delete (Logical Delete) then please refer a column called – Record Delete Flag = 1
  6. In conventional ETL, if you do logical delete then you will keep Expiry Date= ETL Date – 1, but in TCF we will have the record active based on  EFFT_D and EXPY_D by record deleted flag will be set to 1
  7.  Date and Time validations will be done in Source Image tables only, if you send invalid Date or Time the Type Casting SQLs will load as Null in source image tables.
  8. As the Source Image tables are not SET tables, we need to run queries to get the duplicates out of it.
  9.    We cannot use TCF to load the data to Existing tables

Cheers – Asik

 


Back to Featured Articles on Logo Paperblog