Chapter 12 – Know About Data Types and ETL Testing

Posted on the 13 December 2013 by Asik Ali @Asikali077

Hi All,

Have you created , Updated , Deleted Face book account to know about DWH concepts ? Today in this post let me explain you what is the necessity and importance of the Data types in ETL Testing.

We will start with our known examples :

Can you fill 10 liters of water into a 5 liters container?

“No, the container can have only 5 liters of water, if you fill more than its capacity then it will burst :-(“

Can you use Salt instead of sugar to make Tea?

“No, then every one will stop drinking Tea :-(“

Can we name a Kid using Numbers?

“No, if we keep numbers , then how many duplicate persons exists in this world ? just Imaging if  I was named as 10215 !!!!

Can anyone have their Bank balance as absolute number ?

“No, because every money that you spent is fractional amount ! you cant have $ 5 all the time , it would be $ 5.28 most of the time.

Like the above example

Our Source files, Source tables , Target tables are constructed with limitations. You cant have or keep the data that you want. You can keep or have the data that what system can accept.

In every programming we have this data types , most of them who reads this post knew about basics of Data types.

INTEGER, CHAR, VARCHAR, DECIMAL, FLOAT etc.,

Most of the time developers are testers encounters problems because of the data typing in Data warehouse world are ,

1. Correct Data type is not chosen in Source tables

2. Correct length of the data is not received from the source system in the source file

3. Source is not giving the values as per the data types mentioned

4. Data got truncated when loading it into Target tables.

5.The amount column precision is not populated correctly as Teradata changes it to make round off value.

6.Regarding Dates, source will send them as var-char but when we load it into target tables we keep as DATE and the format

The Data type and its length will be designed it its  DDL – Data Definition Language . If you want to know about the tables properties then please use the blow query

a) “ SHOW TABLE Database.Table_Name ” – this will give you all about data types, data length. Not Null, Null, Primary Key definitions

b) ” HELP TABLE Database.Table_Name” – this will give you all about the table. 

As a Tester what we need to verify ?

Again as I said,

Check the data is matching with the data type mentioned in the spec.

Check any data truncation happened when source data is loaded into Staging tables

Check the data is the staging tables are as per the Staging tables DDL

Check the target table columns are loaded as per the Target tables DDL.

If it a Varchar columns from source ,then please take care of the space , invalid characters etc., right from source till staging tables, because data stage will not accept special characters

If its a Decimal column then make sure the precision is carried out till the target load

If its a Integer column then make sure you should not get blanks ans spaces from source

If its is a CHAR then check the length of the character that fit into this column

Like above we can add as many as much scenarios for Data Type verification.

Hops this blog helps you to understand what is the importance of the Data Types in ETL testing,

See you in my next post

Cheers - Asik


Filed under: Data Warehouse Testing - Learners Guide Tagged: Data definition language, Data type, Data warehouse, Null, Tables, Target Corporation, Teradata, Varchar