Abinitio Interview Question 2 | parallel processing | Multiple processing
Watch my YouTube video for an explanation:
Abinitio Interview Question 2 | parallel processing | Multiple processing
The class notes are given below for your reference:
How can 100 multifiles or serial files be processed simultaneously
using Ab Initio?
a. We need to read 100 files/multifiles
b. We need to write 100 files/multifiles
c. Any other kind of multiprocessing
revenue_file_apac.dat
revenue_file_nam.dat
.
.
.
revenue_file_sa.dat
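The file names above share the prefix revenue_file and differ only in the region suffix, so one wildcard pattern can pick them all up. As a rough illustration outside Ab Initio, here is a minimal Python sketch of that idea. The helper region_of and the extra name notes.txt are hypothetical, added only for the example.

```python
import fnmatch
import re

def region_of(file_name):
    """Return the region suffix (e.g. 'apac') or None if the name does not fit."""
    m = re.fullmatch(r"revenue_file_(\w+)\.dat", file_name)
    return m.group(1) if m else None

# Sample listing; notes.txt is a hypothetical non-matching file.
names = [
    "revenue_file_apac.dat",
    "revenue_file_nam.dat",
    "revenue_file_sa.dat",
    "notes.txt",
]

# Keep only the revenue files, in listing order.
matched = fnmatch.filter(names, "revenue_file*.dat")

# Derive the region of each matched file.
regions = [region_of(n) for n in matched]
```

The region extracted this way is what later lets a single generic job pick a per-file record format and output name.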
Approach -
1. Create a plan
a. Write a generic graph
INPUT --> PROCESSING LOGIC --> OUTPUT
Create a pset with these parameters -
DML_NAME, INPUT_FILE_NAME, OUTPUT_FILE_NAME
b. Create a plan
Build a vector of files -
FILE_VEC=directory_listing("$INPUT_FILE_PATH","revenue_file*.dat");
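Add a For Each loop over the vector -
LOOP_VALUE_VECTOR=FILE_VEC
LOOP_CONCURRENT=false/true (false runs the iterations serially, true runs them concurrently)
Use AB_PLAN_LOOP_CURRENT_VALUE (the current file in the loop) to set -
DML_NAME, INPUT_FILE_NAME, OUTPUT_FILE_NAME
Example DML_NAME: revenue_apac_rec_format.dml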
OUTPUT_FILE_NAME=string_concat($OUTPUT_FILE_NAME,"_",".dat")
2. Create n psets and call those psets from a job scheduler
3. Read and write multiple files/multiple multifiles
Read Multifile
Write Multifile
4. Parallel processing - looping plan - serial/concurrent
While Loop
For Loop
For Each Loop
Subscriber Loop
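The looping approach above (one generic job, one parameter set per file, run serially or concurrently) can be sketched outside Ab Initio as well. The Python below is only an analogy under stated assumptions, not Ab Initio plan syntax: run_generic_graph is a hypothetical stand-in for the generic graph, the output name pattern revenue_out_<region>.dat is invented for illustration, and the concurrent flag mimics the serial/concurrent choice of the loop.

```python
from concurrent.futures import ThreadPoolExecutor

def build_params(input_file_name):
    """Build one parameter set (like a pset) for a single input file."""
    # Strip the common prefix and the extension to get the region, e.g. 'apac'.
    region = input_file_name[len("revenue_file_"):-len(".dat")]
    return {
        "DML_NAME": f"revenue_{region}_rec_format.dml",
        "INPUT_FILE_NAME": input_file_name,
        # Hypothetical output naming, assumed for this sketch only.
        "OUTPUT_FILE_NAME": f"revenue_out_{region}.dat",
    }

def run_generic_graph(params):
    """Hypothetical stand-in for INPUT --> PROCESSING LOGIC --> OUTPUT."""
    return f"{params['INPUT_FILE_NAME']} -> {params['OUTPUT_FILE_NAME']}"

def run_loop(file_vec, concurrent, max_workers=4):
    """Run the generic job once per file, serially or concurrently."""
    param_sets = [build_params(f) for f in file_vec]
    if not concurrent:
        # Serial: one iteration finishes before the next starts.
        return [run_generic_graph(p) for p in param_sets]
    # Concurrent: iterations overlap; map() still returns results in input order.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(run_generic_graph, param_sets))

file_vec = ["revenue_file_apac.dat", "revenue_file_nam.dat", "revenue_file_sa.dat"]
serial_results = run_loop(file_vec, concurrent=False)
concurrent_results = run_loop(file_vec, concurrent=True)
```

Both modes produce the same results here; the difference is only in how much work is in flight at once, which is the trade-off to weigh when deciding how many of the 100 files to run in parallel.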
For more Abinitio, AWS and data engineering videos, please subscribe to, view, like and share my YouTube channel
Click DataPundit