Posts

Key Change Function in Abinitio Roll Up

Image
Watch my YouTube video for explanation : Please look into the class notes here for your references: How Key Change Function works in Rollup in_date|amt 01012023|49 03012023|32 08012023|90 17012023|91 21012023|92 02022023|93 27022023|93 21022023|94 12042023|95 14042023|96 19042023|97 21042023|98 18062023|99 26062023|81 12062023|82 13072023|83 16072023|84 31072023|85 accululate the amount for the same week. out:key_change(in1,in2)= begin let date("YYYY-MM-DD") converted_dt1= (date("YYYY-MM-DD"))(string_concat(string_substring(in1.in_date,5,4),'-',string_substring(in1.in_date,3,2),'-',string_substring(in1.in_date,1,2)); let date("YYYY-MM-DD") converted_dt2= (date("YYYY-MM-DD"))(string_concat(string_substring(in1.in_date,5,4),'-',string_substring(in1.in_date,3,2),'-',string_substring(in1.in_date,1,2)); out::if(date_week_of_year(converted_dt1) != date_week_of_year(converted_dt2) or l_year1!=l_year2) 1 else 0; end; out:r...

Expanded Rollup in Abinitio

Image
Watch my YouTube video for explanation : Please look into the class notes here for your references: Expanded Rollup Example out::input_select()= begin out::if(in.regeion=='EMEA'); end; out::key_change(in1,in2)= begin out::if(in1.work_date>in2.work_date) end; type temporary_type= record decimal("") count; end; temp::initialize(in)= begin temp.count::0; end; out::rollup(temp,in)= begin out.count::temp.count+1; end; out::finalize(temp,in)= begin out.id::in.id; out.work_date::in.work_date; out.count::temp.count; end; out::output_select(in)= begin out::if(in.count>4); end; out::output_without_input()= begin out.count::0; end; parameter called output_without_input  [At component Shutdown, At each checkpoint, At each computepoint,Never] output_at_event(); output_for_error(), make_error(); log_error(); final_log_output(); ---------------------------------------------------------------------- Fore more Abinitio, AWS and data engineering videos please subscribe , view , l...

AWS Integration with Abinitio | How to access AWS S3 using Abinitio

Image
Watch my YouTube video for explanation : Please look into the class notes here for your references: AWS Integration with AI Storage - S3, DB (RDS , Redhsift) How CO>operating System connects to the cloud object storage S3 Configuring access to Amazon S3 1. JAVA , java 8+ 2. Configure access credential for the respective cloud store       a. default credential provider chain b. aws credential profile file with named profile  c. Multiplexer file Multiplexer      .  to interact with apachee hadoop copatible file system .  to calculate the resource required for containerized jobs  AB_JOB_MUX_CONFIGURATION AB_MUX_URI_SCHEMES a. default credentials In Linux ~/.aws/credentials In Windows C:\Users\Username\.aws\credentials ---------------------------------------------- AWS Credential profile file with named profiles [default] aws_access_key_id=<default-access-key> aws_secret_access_key=<default-secret-key> [custom] aws...

How to get 2nd Highest Salary in Abinitio | get nth highest salary

Image
Watch my YouTube video for explanation : How to get 2nd Highest Salary in Abinitio | get nth highest salary | group processing in abinitio Please look into the class notes here for your references: how to get 2nd highest value in group by customer id? I/P -> SORT within Group {Customer id and minor-key Amount desc)->     -> SCAN and create one more field as sequence id(output-select)->O/P INPUT ==>  SORT ==> SCAN (Key=Customer id) ==> Output type temporary_type= begin decimal("") sequnce_id; end; temp::initialize(in)= begin temp.sequnce_id::0; end; out::scan(temp,in)= begin out.sequnce_id::temp.sequnce_id +1; end; out::finalize(temp,in)= begin out.*::in.*; out.sequnce_id::temp.sequnce_id; end; sequence_id 1  - for the max trasaction value 2  - 2nd highest 3  -  4  - out::output_select(in)= begin out.*::if(in.sequnce_id==2)in.*; end; Fore more Abinitio, AWS and data engineering videos please subscribe , view , like and sha...

AWS Project 1

Image
Watch my YouTube video for explanation : Please look into the class notes here for your references: AWS Redshift KInesis Redshift  - Done video as named "AWS Project 1" https://aws.amazon.com/blogs/big-data/stream-transform-and-analyze-xml-data-in-real-time-with-amazon-kinesis-aws-lambda-and-amazon-redshift/ Lambda Function ====================================================== from __future__ import print_function import base64 import json import boto3 import os import time import csv  import sys from xml.etree.ElementTree import XML, fromstring import xml.etree.ElementTree as ET print('Loading function') def lambda_handler(event, context):     output = []          print(event)          for record in event['Records']:         payload = base64.b64decode(record["kinesis"]["data"])         parsedRecords = parseXML(payload)         print(parsedRecords.decode("ut...

Abinitio Question #13 | Rank() function implementation in Abinitio

Image
Watch my YouTube video for explanation : Please look into the class notes here for your references: SELECT deptno,        ename,        sal,        RANK() OVER (PARTITION BY deptno ORDER BY sal) "rank" FROM   emp WHERE  deptno = 30;     DEPTNO ENAME             SAL       rank ---------- ---------- ---------- ----------         30 JAMES             950          1         30 WARD             1250          2         30 MARTIN           1250          2         30 TURNER           1500          4         30 ALLEN    ...

Abinitio Interview Question # 17 | Dynamic Parameter in Conduct It

Image
Watch my YouTube video for explanation : Please look into the class notes here for your references: I have 100 records in input and I want o/p like in  first run first 25 records will go to out port and  2nd run 26 to 50 recs will go to output and 3rd run 51 to 75 and 4th 76 to 100 recs shall go to o/p.   How to achieve this? a. plan approach  plan   plan parameters       IN_DATA  =0  (make it dynamic parameter)       INCREMENT_VALUE=25  Looping plan       loop_type=while loop loop_condition=$IN_DATA<100     Use plan dynamic parameter so that the parameter value can be referenced  as well as altered. Amend it to make  graph task-> Input ----> FBE-----> Output[(file:$AI_SERIAL/${LOOP_VALUE_COUNTER}.dat] use package calculate let int value= next_in_sequence();  out.id::if(value>$IN_DATA and value<=$IN_DATA+INCREMENT_VALUE}) 1 else 0; program t...

Abinitio Interview Question #15 | Lowest and Highest value implementatio...

Image
Watch my YouTube video for explanation : Please look into the class notes here for your references: SELECT deptno,        ename,        sal,        MIN(sal) KEEP (DENSE_RANK FIRST ORDER BY sal) OVER (PARTITION BY deptno) "Lowest",        MAX(sal) KEEP (DENSE_RANK LAST ORDER BY sal) OVER (PARTITION BY deptno) "Highest" FROM   emp ORDER BY deptno, sal;     DEPTNO ENAME             SAL     Lowest    Highest ---------- ---------- ---------- ---------- ----------         10 MILLER           1300       1300       5000         10 CLARK            2450       1300       5000         10 KING             5000...

Abinitio Interview Question #12 | Calculate Min Max values of the dataset

Image
Watch my YouTube video for explanation : Please look into the class notes here for your references:  I have a data file (deptid,empid,salary) I need to geneate o/p  deptid, empid, maximum salary,min salary Design the graph  input -> rollup [key-deptid] (max,min) -> output Fore more Abinitio, AWS and data engineering videos please subscribe , view , like and share my YouTube channel  Click  DataPundit

Abinitio Interview Question #16

Image
Watch my YouTube video for explanation : Please look into the class notes here for your references: field1 field2 field3 2 a 10200 4 b 10200300400 2 c 10200 Normalize output field1 field2 a 1 a 2 b 1 b 2 b 3 b 4 c 1 c 2 out::length(in)= begin length::length(string_filter_out(in.field3,'0')); end; out::nomalize(index,in)= begin out.field2::in.field2; out.field3::string_split(in.field3,'0')[index]; end; Fore more Abinitio, AWS and data engineering videos please subscribe , view , like and share my YouTube channel  Click  DataPundit

Abinitio Interview Question # 14 | Dense_Rank() implementation in abinitio

Image
Watch my YouTube video for explanation : Please look into the class notes here for your references: SELECT deptno,        ename,        sal,        Dense_rank() OVER (PARTITION BY deptno ORDER BY sal) "rank" FROM   emp WHERE  deptno = 30;     DEPTNO ENAME             SAL       rank(Dense Rank) ---------- ---------- ---------- ----------         30 JAMES             950          1         30 WARD             1250          2         30 MARTIN           1250          2         30 TURNER           1500          3         30 ALLEN...

Abinitio Interview Question # 10 | output_indexes versus output_index

Image
Watch my YouTube video for explanation : Please look into the class notes here for your references: Reformat ,If I do output_indexes and output_index at the same time then  what will happen? out:output_index(in): begin out:1: if(in.region=='emea') 1; out:2: if(in.region=='apac') 2; out:3: if(in.region=='na') 3; out:3: 0; end out:output_indexes(in): begin out:1: if(in.region=='emea') [vector 1,0]; out:2: if(in.region=='apac') [vector 2,0]; out:3: if(in.region=='na')   [vector 3,0]; out:3: [vector 0]; end Is it possible? If not when and what error would be displayed? yes , no Error as such ,but output_indexes takes precedence over the other Fore more Abinitio, AWS and data engineering videos please subscribe , view , like and share my YouTube channel  Click  DataPundit

Abinitio Interview Question #11 | Abinitio Optimal Design

Image
Watch my YouTube video for explanation : Please look into the class notes here for your references: I have a scenario having 3 tables,  A(country_id,c_name),  -  country   B(country_id,state_name,state_id),  - state C(population,city_name,citi_id,state_id) - city C>>>>B>>A The outpput should contain C_name,total-population, Please design the optimal graph Ans1. 3rd as INPUT FILE-->REFORMAT(2nd as lookup)------->Reformat(1st as lookup)--->Rollup---->Output Ans2. 3rd as INPUT FILE-->Reformat(2nd and 1st lookup one by one)----->Rollup---->Output Ans3. 3rd as INPUT FILE(driving port)--->    -->Join with 2nd--->reformat(lookup 1st) --->Rollup---->Output 2nd as INPUT FILE 2     ----------> Answer4---> 3rd as input--->Rollup---->Reformat(lookup 2 and 1st)---->Rollup---->Output Summary: Many a times more no of components solve the problem in an optimal ways if used ...

Abinitio Interview Question #9

Image
Watch my YouTube video for explanation : Please look into the class notes here for your references: Select= next_in_sequence()<10   out:reformat(in): begin out.id::next_in_sequence(); end; If Input is 20 records then how many records will go out? id = 2,4,6,8,10      Only 5 records will be displaying 1 2 3 4 . . 20 Fore more Abinitio, AWS and data engineering videos please subscribe , view , like and share my YouTube channel  Click  DataPundit

3 Tier Architecture In AWS | Demo of 3-tier web app in AWS with VPC, ALB...

Image
What is  3-tier architecture in AWS: The display layer, the business logic layer, and the data storage layer are the three logical tiers that are divided up into the application in a three-tier design. A client-server application with a front end, a back end, and a database, like a web application, uses this design. Each of these tiers or layers can be handled independently of the others and performs a particular function. This is a departure from the monolithic approach of application development, in which the frontend, the backend, and the database are all housed in the same location. AWS is a cloud computing platform that offers consumers a variety of cloud computing services. Click on this link to see a list of all AWS services and goods. In this post, we'll use the Elastic Compute Cloud (EC2), Auto Scaling Group, Virtual Private Cloud (VPC), Elastic Load Balancer (ELB), Security Groups, and Internet Gateway AWS services to plan and create a three-tier cloud infrastructure. Our...

Hybrid Connectivity in AWS

Image
Watch my Youtube video while explaining the hybrid connectivity i.e. the connectivity on-premises (remote) desktop machine to the AWS network. The Steps and Class notes to configure the AWS and practical hands on lab. Step1. Provision AWS Workspace remote windows desktop Open AWS management console and search for AWS workspace  a. Choose Quick Setup a.1 create user username firstname lastname email  b. standard windows 10 option (free tier eligible)  c. Copy the link and paste in Firefox browser that will take you to        Install Client  for example windows d. down load workspace software client for windows e. Open workspace software add registration code in workspace software f. login to workspace software using username and password g. This will open your windows 10 desktop as a client desktop h. create certificates    https://docs.aws.amazon.com/vpn/latest/clientvpn-admin/client-authentication.html Go to mutual authentication i. Ope...

LIMIT and Ramp in AbInitio Transform Components

Image
Watch my youtube video for the explanation of the reject and limit/ramp explanation in REFORMAT component. Please check here my class notes  The Processing Algorithm for records processing and reject events:

$,${},PDL, Shell, DML Resolutions in Abinitio

Image
The Class Notes: $ and ${} -  PDL  -  1. can reference parameters $, ${} 2. $()  3. $[] in-line DML - can reference functions AB_DML_DEFS   $AI_XFR/custom_date_fns.xfr param_test    $[let string("") x= date_to_str(now1());] DML Resolution: When we need to use metaprogramming functions Shell param1 =    $()  ==> PDL

EME Commands in Abinitio

Image
Watch the YouTube video for explanation on EME and EME set up and Commands to interact with EME The Class Notes as below: Useful EME Commands technical repository  EME  tasks  :           check outs check ins tagging promotions Admin Activities: creating repository adjusting user permissions breaking locks Pre-requisite for running air commands export     AB_HOME= export PATH=$AB_HOME/bin:$APTH m_env -v export AB_AIR_ROOT= export AB_AIR_HOME=   air ls directory / 6 elements       directory: Data Profiler         direcotry:abinitio

Write Excel Data in Abinitio or Write Excel Flow in Abinitio

Image
Watch the explanation on how to write excel data in Abinitio Class notes: DML : utf8 record= begin string(‘\t’) sheet; string(‘\0’) line; end  Create Data num_records=2 out.sheet::if(index==0) 'customer' else 'revenue'; out.line:: if(index==0) 'customerid\tzip\taddress' else 'revenueID\trevenueAmt\trevenueDate' Customer Reformat out.sheet::'customer'; out.line::"'"+ "\t" + in.customerid              + "\t" + in.zip + "\t" + in.address + "'"; Revenue Reformat out.sheet::'Revenue'; out.line::"'"+ "\t" + in.revenueID              + "\t" + in.revenueAmt + "\t" + in.revenueDate + "'";   Parameters: xls:$AI_SERIAL/excels/relationsheep.xls write-mode:newworkbook/apend-records utf8:True date-format:yyyy-mm-dd autosizelumn:True record-delimiter:\0 use-format:True format-attribute:BLD,NUM format-attribute-error:True ...

Read Excel Data in Abinitio | Read Excel Spreadsheet

Image
View the explanation for the Read Excel Data in AbInitio. a. READ EXCEL SPREADSHEET .xls,.xlsx,.xlsm Parameters xls : $AI_SERIAL/abc.xls sheet_selection :by index/by_name/all sheet_index:2 sheet_header_row:1 sheet_name: utf-8:True record_delimiter:\0   java_home:$AB_JAVA_HOME skip-empty-rows:True last_row: last_column: date_fromat:YYYY-MM-DD H24:MI:SS:NNN java-options: record string(‘\t’) sheet; string(‘\t') customer_id; string(‘\t') revenue_date; string(‘\t') revenue_amt; string(‘\t') business_unit; string(‘\t') isin; string(‘\t') sedol; string(‘\t') account_id; string(‘\0') rest_of_line; end; Reformat: include '~$AB_COMPONETS/Interchange/Excel/excel_to_dml_date.xfr out::reformat(in)= begin out.*::in.*; out.revenue_date::excel_to_dml_date(in.revenue_date); out.revenue_amt::(decimal("\t"))in.revenue_amt; end; How to create Output Record Format: a. Manually b. use generate TAB c. Use excel_gendml.ksh in built Utility b. Read Excel Flow Why ?...

Component Folding in Abinitio

Image
Why? . for resource conservation specially memory . streamlining processes . improve startup time Pre-requisites of foldable components? 1. Same phase 2. Same layout 3. Must be fed from same origin component 4. Must be foldable unlike unsorted join 5. Connected by straight flows Why? . for resource conservation specially memory . streamlining processes . improve startup time Pre-requisites of foldable components? 1. Same phase 2. Same layout 3. Must be fed from same origin component 4. Must be foldable unlike unsorted join 5. Connected by straight flows examples: micrographs, transaction group How to set it up? Settings->Grapg Settings->Script->Fold Components Benefits? . less no of processes save overall starup time, memory consumption, time,scheduling overhead etc . less interprocess communication Drawbacks? . loss of pipeline parallelism . internal buffering Where to use: . micrograph . continous graph . transaction group

API Gateway KDS KDF S3 Redshift

Image

XML SPLIT | Reading XML Data | Working with XML data in Abinitio

Image
Steps Class Notes: XML processing in AI  1. Parse the XML    2. Generate the XML data   1. Parse the XML    a. Get the XSD or Sample XML file(XML exemplar) having full xml schema        b. xml-to-dml utility/command to generate the DML             OR       use import XML dialog using the GDE component's port TAB        c. Read the XML - you will See Records and Vector Structures         d. Reformat /Normalize the data based on parent node (base element)    e.  Apply different AI components to solve the use case  Components used to read the xml data       xml split

Abinitio Interview Question # 1 - Write Multiple Files in Abinitio

Image
 Question :  I get records of various region in huge volume ,  we want to write the records in region specific files.   How to achieve that? View the explanation in the below video: Answer: -WMF WMF: filename::get_filename(in)= begin let string("") l_filename=string_substring(in.data.region); filename::string_concat($AI_SERIAL,"/",l_filename.dat); end; out::reformat(in)= begin out::in.data; end; OUTPUT DML-: DML record string(int) filename; string(integer(4)) file_contents; end; RMF: filename::get_filename(in)= begin filename::string_concat($AI_SERIAL,"/",in.filename); end; out::reformat(read,filename,in)= begin out.file_contents::read; out.filename::filename; end; Please view my youtube channel  DataPundit

API gateway integration to get method and post method to retrieve/insert data in DynamoDB using Lambda

Image
API gateway integration to get method and post method to  retrieve/insert data in DynamoDB using Lambda Watch the following video for hands on using AWS management console. API Gateway (Get/Post)----> Lambda ----> DynamoDB Integration 1 Get method integration in API Gateway to retrieve information from the DynamoDB table   a.       Create an IAM role Create an assumable IAM role. Following trust policy attached to it {   "Version": "2012-10-17",   "Statement": [     {       "Sid": "",       "Effect": "Allow",       "Principal": {         "Service": "apigateway.amazonaws.com"       },       "Action": "sts:AssumeRole"     }   ] } The role must have invokefunction permission   {   ...

Check In(Import in TR) and Check Out(Export from TR) in Abinitio

Image
Click Here

Abinitio Phases and Checkpoints

Image
Abinitio Phases and Checkpoints | Why Phases | Abinitio Checkpoints | Recovery and Resource Control Phases  and Checkpoints mechanism in Abinitio works on below tradeoffs: Speed/control  - More control makes graph slower  Speed /safety  - more safety (recovery point of view) makes graph slover primary objective of creating phases in graph is to efficient use of available resources such as CPU, main memory and disk space. phase break :  the boundary between 2 phases in called phase break. first phases does complete and then only the 2nd phase can be started. When the first phases completes, the Abinitio component immediately before the phase break  writes down all the data passing through it to the temporary files  under the layout of the component immediately after the phase break, the component after the phase break reads all these files to begin the next phase. check point: a point where Co operating system saves all the information it would need to...