Chapter 14 – A Sample ETL Project – Customer Management in a Hospital

Posted on the 15 December 2013 by Asik Ali @Asikali077

Hi All,

Have a please week ahead !!!

In this post let me explain a Project Structure.

I have a Multi-Specialty Hospital located across the world. My hospital is famous for Vaccination. Patients who comes to my hospital across globe will be given a User Card with which they can access any of my hospital in the world.

Business Need

Now we maintain all customers in one Database. There are heaps and heaps of customers related to my hospital.So I decided to split up the customers based on the country and load them into corresponding country tables. Elaborated details of this requirements will be mentioned in the Business Requirement Specification.

Design Need 

To pull the customers as per Country, a designer should know what are all the places the Customer Data is available. So the data extracting will be done by our Source System. They will pull the all the relevant customer data and will give us a Data file.

In Design Documents you will have

Source File Specification 

1)   File Name Specification – Name String , Extension of the files
2)   Date and Time format of the File – YYYYMMDD, HHMMSSTT or any other format
3)   Control File Specification - Name String , Extension of the files
5)   Header Records Layout – |H|Customer_Records|20131012|1245872|
6)   Details Record Layout – |D|Asik|123456|20101012|20121013|MVD|Sanjay|NSW|AU|06031987|A
7)   Trailer Record Layout - |T|10|

Detail Records will tell you what data you are getting from source, what data type, is it mandatory or not and the length of the column.

File Position Column_Name Filed Length Data Type Mandatory (Y/N) Key Column

1 Customer Name 255 VARCHAR Y Y

2 Customer ID 18 VARCHAR Y N

3 Customer Open Date 8 DATE Y N

4 Last Consulted Date 8 DATE N N

5 Vacination Type 5 CHAR N N

6 Doctor Consulted 255 CHAR N N

7 State 5 CHAR N N

8 Country 5 CHAR N N

9 Post Code 5 INT N N

10 Date of Birth 8 DATE N N

11 Active Customer 1 CHAR N N

Click below

Please refer my Post - Source File Validation to know how validate the above details.

The sample file format will be

|H|Customer_Records|20131012|1245872|
|D|Asik|123456|20101012|20121013|MVD|Sanjay|NSW|AU|06031987|A
|D|Alex|123457|20101012|20121013|MVD|Sanjay1|SA|USA|06031987|A
|D|John|123458|20101012|20121013|MVD|Sanjay|TN|IND|06031987|A
|D|Mathew|123459|20101012|20121013|MVD|Sanjay|WAS|PHIL|06031987|A
|D|Matt|12345|20101012|20121013|MVD|Sanjay|BOS|NYC|06031987|A
|D|Jacob|1256|20101012|20121013|MVD|Sanjay|VIC|AU|06031987|A
|D|Arun|1456|20101012|20121013|MVD|Sanjay|QA|AU|06031987|A
|D|Wague|23456|20101012|20121013|MVD|Sanjay|AOL|CAN|06031987|A
|D|Steve|126|20101012|20121013|MVD|Sanjay|NSW|AU|06031987|A
|D|Neil|156|20101012|20121013|MVD|Sanjay|NSW|AU|06031987|A
|T|10|

Now using the ETL process now we loaded the data into Staging Tables. Intermediate tables will look like below

Staging Table Load 

Click below

Please Check my ETL Process post how the data is loaded and what we need to verify from this Step :

Name Cust_I Open_Date Consul_Date VAC_ID DR_Name State County DOB Indicator

Asik 123456 20101012 20121013 MVD Sanjay NSW AU 6031987 A

Alex 123457 20101012 20121013 MVD Sanjay1 SA USA 6031987 A

John 123458 20101012 20121013 MVD Sanjay TN IND 6031987 A

Mathew 123459 20101012 20121013 MVD Sanjay WAS PHIL 6031987 A

Matt 12345 20101012 20121013 MVD Sanjay BOS NYC 6031987 A

Jacob 1256 20101012 20121013 MVD Sanjay VIC AU 6031987 A

Arun 1456 20101012 20121013 MVD Sanjay QA AU 6031987 A

Wague 23456 20101012 20121013 MVD Sanjay AOL CAN 6031987 A

Steve 126 20101012 20121013 MVD Sanjay NSW AU 6031987 A

Neil 156 20101012 20121013 MVD Sanjay NSW AU 6031987 A

Now my Staging Data is ready and we need to load them into corresponding Target Tables.

As a project we will test data load from file to Staging as Phase-I and Staging to Target tables to Phase II.

In this project we are not having any transformation rules , so we will insert the records from Staging tables to corresponding target tables.

Like

All customers related to India will go to Table_India and so on. In my next blog let me come up with few more complex examples for Target tables load.

The Source to Target Document will looks like

 Customers From India will be loaded by below logic

S.No Source Column Source Table Transformation Rule Target Table KEY / Value Target Column

1 Customer Name CUST_STG If Source value is Null Do not Load Else load CUST_IND Key CUST_NAME

2 Customer ID CUST_STG If Source value is Null Do not Load Else load CUST_IND Key CUST_ID

3 Customer Open Date CUST_STG Assign Source Value CUST_IND Value CUST_OP_DT

4 Last Consulted Date CUST_STG Assign Source Value CUST_IND Value CUST_CONS_DT

5 Vacination Type CUST_STG Assign Source Value CUST_IND CUST_VAC_TYPE

6 Doctor Consulted CUST_STG Assign Source Value CUST_IND CUST_DR_CONS

7 State CUST_STG If Source value is Null Do not Load Else load CUST_IND Value CUST_STATE

8 Country CUST_STG If Source value is Null Do not Load Else load CUST_IND Value CUST_COUNTRY

9 Post Code CUST_STG If Source value is Null Do not Load Else load CUST_IND CUST_POS_C

10 Date of Birth CUST_STG Assign Source Value CUST_IND CUST_DOB

11 Active Customer CUST_STG Assign Source Value CUST_IND CUST_STATUS

So based on above rule will load the data into IND_CUST tables.

Cheers

Asik


Filed under: Data Warehouse Testing - Learners Guide Tagged: Data file, Database, ETL, Extract transform load, Ministry of Internal Affairs (Russia), Target Corporation, Test data, United States