How to implement SCD type 2 in Abinitio



Lets take an example dataset to understand the SCD type 2 in Abinitio.

For example . we have an input dataset which is already loaded in as dimension table already.

vehicle dimension dataset

sk, veh_cd,     curr_driver,hist_driver,start_dt,         end_dt,             curr_flag
1, veh01, dr0,  dr0,            19000101000000,39991231235959,Y
2, veh02, dr1,  dr1,            19000101000000,39991231235959,Y
3, veh03, dr2,  dr2,           19000101000000,39991231235959,Y
4, veh04, dr3,  dr3,           19000101000000,39991231235959,Y
5, veh05, dr5,          dr4,           19000101000000,20210220235959,N
6, veh05, dr5,          dr5,           20210221000000,39991231235959,Y

Sk is Logical Surrogate Key created by Abinitio Process
veh_cd - vehicle code of the driven vehicle
curr_driver - current driver of the vehicle
hist_driver - driver who drove the vehicle in past between start_dt and end_dt
curr_flag - Y/N , based on the current record of the vehicle

curr_flag=Y meaning the vehicle is currently driven by the curr_driver
curr_flag=N, meaning the vehicle was driven by hist_driver between start_dt and end_dt

Now, 

lets say we received the current dataset from the source as below:
we will not have hist_driver and curr_flag in the incoming dataset.
Assumption is that we received the current dataset on 5th June 2021.

sk, veh_cd,curr_driver,start_dt,         end_dt,                     
1, veh01, dr0,    19000101000000,39991231235959      
2,  veh02, dr1,    19000101000000,39991231235959
3, veh03, dr2,    19000101000000,39991231235959
4, veh04, dr3,    19000101000000,39991231235959
5,     veh05,     dr6,                     19000101000000,39991231235959 

note: do not gain any meaning of the start_dt,end_dt and curr_flag columns in the current dataset, as 
this data set was prepared to match the layout of the dimension dataset.


Observation:
we see that we received a updated record for the veh05 with driver as dr6.
so we need to close the record #6 of the dimension dataset and insert an new record  with the dr6 as current driver of the vehicle.

Our Target dataset will look like as below:


sk, veh_cd,curr_driver,hist_driver,start_dt,         end_dt,             curr_flag
1, veh01, dr0, dr0,    19000101000000,39991231235959,Y
2, veh02, dr1, dr1,    19000101000000,39991231235959,Y
3, veh03, dr2, dr2,    19000101000000,39991231235959,Y
4, veh04, dr3, dr3,    19000101000000,39991231235959,Y
5, veh05, dr6,         dr4,    19000101000000,20210220235959,N
6, veh05, dr6,         dr5,    20210221000000,20210604235959,N
7,     veh05,     dr6,          dr6,     20210605000000,39991231235959,Y 

we also update all values of  curr_driver for historical records with the new driver.


the design of the Abinitio Graph will look like as below:


input port0: $AI_DML/vehicle.dml  ==>rec1
input port1: $AI_DML/vehicle_incoming.dml===>rec2

Output port of JOIN component:

PDL :  $[join_types(in0_metadata,in1_metadata)]

Join output will be passed through the REFORMAT component:


out:output-indexes(in)=
begin
let date(YYYYMMDD") HIGH_END_DATE ="39991231235959";
out:1: if(in.rec1.curr_driver!=in.rec2.curr_driver and in.rec1.end_dt!=HIGH_END_DATE) [vector 1];
out:2: if(in.rec1.curr_driver!=in.rec2.curr_driver and in.rec1.end_dt==HIGH_END_DATE) [vector 2,3]
out:: [vector 0];
end;


transform0 (no update, exact match )

out:reformat(in)=
begin
out.*::in.rec1.*;
end;


transform1 (update the historical records with curr_driver value)

out:reformat(in)=
begin
out.*::in.rec1.*;
out.curr_driver::in.rec2.curr_driver;
end;


transform12 (close the current record)

out:reformat(in)=
begin
out.*::in.rec2.*;
out.hist_driver::in.rec2.curr_driver;
out.end_date::now1();
out.curr_flag=N;

end;

transform13 (create the new record)

out:reformat(in)=
begin
out.*::in.rec1.*;
out.hist_driver::in.rec2.curr_driver;
out.start_dt::now1();
out.end_dt::HIGH_END_DATE;
out.curr_flag::Y
end;

Collect the output of transform0 to TRASH
Collect the output of transform1,2 to Update the Dimension TABLE.
Collect the output of transform3 to insert into the Dimension TABLE.


Please also view my follow-up video on SCD type 2 
SCD type 2 dimension video for more metaprogramming detail:








Comments

Popular posts from this blog

Abinitio Interview Question # 1 - Write Multiple Files in Abinitio

Next In Sequence in ABinitio | How next_in_sequence() works in MFS