Posts

AWS Managed Streaming for Kafka

AWS Managed Streaming for Kafka (Govern how your clients interact with Apache Kafka using API Gateway). Watch my YouTube video on this. Follow the steps below to create the data pipeline:
STEP 1: Networking. Create the VPC, public and private subnets, Internet Gateway, NAT Gateway, S3 endpoint, route tables, and security groups.
STEP 2: Launch the MSK cluster in the same VPC created in STEP 1, with unauthenticated access allowed and plaintext encryption. Leave the security group as it is.
STEP 3: Launch a Linux EC2 instance. Choose the same VPC and create the instance in a public subnet. In the Auto-assign Public IP list, choose Enable.
STEP 4: Once the client for Amazon MSK has been created, configure the security group rules to allow the connection between the cluster and the client EC2 machine just created. To do that, add the security group ID of the EC2 instance to the MSK cluster security group, allowing all traffic. Repeat these steps to add an inbound rule in the securi...

Primary Keys, Foreign Keys, Identity Column

Primary Keys, Foreign Keys, Identity column. Watch the YouTube video.

CREATE TABLE auto.Departments (
    Id INT NOT NULL IDENTITY,
    Name VARCHAR(25) NOT NULL,
    PRIMARY KEY (Id)
);

INSERT INTO auto.Departments (Name) VALUES ('HR'), ('Sales'), ('Tech');

SELECT * FROM auto.Departments;

CREATE TABLE auto.Employees (
    Id INT NOT NULL IDENTITY,
    FName VARCHAR(35) NOT NULL,
    LName VARCHAR(35) NOT NULL,
    PhoneNumber VARCHAR(11),
    Managerid INT,
    Departmentid INT NOT NULL,
    Salary INT NOT NULL,
    HireDate DATETIME NOT NULL,
    PRIMARY KEY (Id)
);

INSERT INTO auto.Employees (FName, LName, PhoneNumber, Managerid, Departmentid, Salary, HireDate) VALUES ('James','Smith','1234567890',NULL,1,1000,CONVERT(D...

Normal Forms in Relational Database Management System

Normal Forms in Relational Database Management System. To watch the YouTube video, see the post.

Raw data:
name    phone                       city       state
Ram     919876543210, 918987654327  Bangalore  KA
Dave    919876543212, 918987654328  Bangalore  KA
Ali     919876543213                Jaipur     RJ
Milkha  919876543219, 918987654329  Bathinda   PA

1st Normal Form
1. Each table cell contains a single (atomic) value
2. each record ne...

Catalog Management in Abinitio, Sharing Lookups in Abinitio

Please watch my video on catalog management in Abinitio. For class notes: Catalog Management in Abinitio
Problem statement: how to share the same lookup, or set of lookups, across different graphs.
1. Share the lookup file with catalogs
   a. Graph settings -> Catalog -> create lookup catalog; specify the catalog path $AI_SERIAL_LOOKUP/shared_catalog.cat
   b. Graph settings -> Catalog -> uses lookups from catalog; specify the catalog path $AI_SERIAL_LOOKUP/catalog.cat
   c. Graph settings -> Catalog -> does not use catalog (the default)
2. Use a dynamic subgraph (recommended)
Catalog management commands. Make sure you do either of the following:
   set AB_CATALOG=$AI_SERIAL_LOOKUP/mycatalog.cat
   specify -catalog <catalogurl> in the commands
1. ls -l $AI_SERIAL_LOOKUP .cat
2. m_lscatalog -catalog myca...

MIME types in abinitio | why my object is not allowed to check In

Watch my YouTube video below. For class notes: MIME types - datapundit
Why can some objects developed in GDE be checked in as is?
Extension list at project level:
   *.dml    text/x-abinitio-dml
   *.job    ignore
   *.mfctl  ignore
   *.mp     application/x-abinitio-mp
   air project show <eme-project-path>
   air project files <eme-prj-path> -versions -all -basedir <sandbox-path>
How to modify the extension list:
   air project modify <Project-path-eme> -extention "extention-pattern" <mime-type>
   air project modify <Project-path-eme> -remove -extention <extention-pattern>
How to know the MIME type of an object? Apply the MIME type to a specific project:
   air project set-type <eme-path-of-object> <extension-type>
GDE - while checking in
For more Abinitio, AWS and data engineering videos please subscribe, view, like and share my YouTube channel. Click DataPundit...

Episode 1 Why SQL is Important

Watch my YouTube episode. For class notes please visit: SQL Operations by dataPundit
Episode 1. Introduction: why SQL operations matter, and the major constituents
   SQL (Structured Query Language)
   HiveQL
   Spark SQL
   NewSQL: RDBMS (ACID) + NoSQL (scalability)
   KQL (Kusto Query Language): Azure
   PartiQL: DynamoDB (AWS)
   Federated Query
a. Who uses SQL? SQL Developer, DBA, Database Developer, Data Analyst, Business Analyst, Data Engineer, Data Scientist, Business Operations Analyst, Operations Engineer, Sales Manager, Product Manager, Pricing Manager
b. Huge demand: Spark SQL, NoSQL databases, Hive SQL
c. It helps understand the logic an...

ACID Properties in RDBMS | Atomicity | Consistency | Isolation | Durability

For a video explanation please watch my YouTube video. For class notes please see below:
ACID
Atomicity: either all or none
   T1: transfer 5000 from account a to account b
   read(a)        op1
   a = a - 5000   op2
   write(a)       op3
   read(b)
   b = b + 5000
   write(b)       system crashes here (power failure, technical or non-technical reason)
   commit;
Consistency
   T1: transfer 5000 from account a to account b
   The total money before the transaction starts must equal the total money after the transaction takes place.
   Before the transaction: A - 2000, B - 3000
   T1 needs to transfer 1000 rupees from A to B
   After the transaction: A - 1000, B - 4000
Isolation
   T1 and T2 are parallel transactions
   T1: transfer 1000 rupees from A to B
   T2: transfer 2000 rupees from B to C
   T1  T2
Durability
   T1: transfer 1000 rupees from A to B
   read(a)  -o...
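The atomicity and consistency notes above can be tried out with a small Python sketch. It is only an illustration, assuming SQLite (via the standard `sqlite3` module) as the database; the `accounts` table and the `transfer` helper are hypothetical names, not part of the class notes. The balances reuse the A = 2000, B = 3000 example.

```python
import sqlite3


def transfer(conn, src, dst, amount):
    """Move `amount` from src to dst as one all-or-nothing transaction."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
            (balance,) = conn.execute(
                "SELECT balance FROM accounts WHERE name = ?", (src,)
            ).fetchone()
            if balance < 0:
                # Simulates a failure in the middle of the transaction.
                raise ValueError("insufficient funds")
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
        return True
    except ValueError:
        return False


# Hypothetical in-memory table holding the balances from the notes.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER NOT NULL)")
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("A", 2000), ("B", 3000)])
conn.commit()

ok = transfer(conn, "A", "B", 1000)      # succeeds: A = 1000, B = 4000
failed = transfer(conn, "A", "B", 5000)  # fails midway and is rolled back
balances = dict(conn.execute("SELECT name, balance FROM accounts"))
```

The second transfer debits A before it fails, yet the rollback leaves both balances untouched, so the total stays at 5000 before and after.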

SCAN Parameters in abinitio

SCAN Parameters in Abinitio. See my YouTube video explanation. Class notes as below:
Scan functionality: cumulative summary.
   key: 1) key specifier 2) key method - key_change
   sorted input: false/true
   max_core: 64 MB
   reject threshold
   maintain order: true
   grouped input: true/false
   major key: region id, minor key: dept id
   check sort: sorted input, key method - key specifier
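To make "cumulative summary" concrete, here is a rough Python sketch of what a scan keyed on region id produces: one output record per input record, carrying a running total that resets when the key changes. It does not use Abinitio; the field names and the `scan_running_total` helper are assumptions made for illustration, and the input is assumed to arrive already grouped by key.

```python
from itertools import groupby
from operator import itemgetter


def scan_running_total(records, key, value):
    """Return one output record per input record with a per-key running total.

    Like a scan on grouped input, this relies on records with the same key
    being adjacent; groupby only merges consecutive equal keys.
    """
    out = []
    for _, group in groupby(records, key=itemgetter(key)):
        total = 0  # reset the accumulator at each key change
        for rec in group:
            total += rec[value]
            out.append({**rec, "running_total": total})
    return out


# Hypothetical input: major key region_id, minor key dept_id.
rows = [
    {"region_id": 1, "dept_id": 10, "amount": 100},
    {"region_id": 1, "dept_id": 11, "amount": 50},
    {"region_id": 2, "dept_id": 10, "amount": 70},
    {"region_id": 2, "dept_id": 12, "amount": 30},
]
result = scan_running_total(rows, "region_id", "amount")
running_totals = [r["running_total"] for r in result]
```

Unlike a rollup, which would emit one record per region, the scan keeps every input record and only adds the cumulative value.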

Abinitio Interview Questions 34

Abinitio Interview Questions 34. For class notes please visit the post.
Input: a,a,b,c,d,a,b,c,d
   record
     string("\n") input_str;
   end
   out::length(in) =
   begin
     out :: length_of(string_split(input_str, ','));
   end;
After Normalize, field: a a b c d a b c d
Rollup output: a,a,a  b,b  c,c  d,d
Flow: INPUT -> NORMALIZE -> ROLLUP(field)
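The INPUT -> NORMALIZE -> ROLLUP flow above can be mimicked in a few lines of Python to check the expected output. This is only a sketch of the logic, not Abinitio code; `normalize` and `rollup_by_field` are hypothetical helper names.

```python
def normalize(line, sep=","):
    """Split one delimited record into one value per output record."""
    return [value.strip() for value in line.split(sep)]


def rollup_by_field(values, sep=","):
    """Group equal values together and join each group back into a string."""
    groups = {}
    for value in values:
        groups.setdefault(value, []).append(value)
    # dicts keep insertion order, so groups appear in first-seen order
    return [sep.join(group) for group in groups.values()]


fields = normalize("a,a,b,c,d,a,b,c,d")
grouped = rollup_by_field(fields)
```

Here `fields` holds the nine normalized values and `grouped` holds one joined string per distinct value.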

Abinitio Interview Questions 32

Abinitio Interview Questions 32. Watch my YouTube video for explanation.
Solve using Abinitio
Input
   DeptID  teacher   IsAssigned
   D1      Teacher1  1
   D1      Teacher2  1
   D1      Teacher3  0
   D2      Teacher1  0
   D2      Teacher2  1
   D2      Teacher3  0
Output
   DeptID  Teacher1  Teacher2  Teacher3
   D1      1         1         0
   D2      0         1         0
Input ---> Rollup(DeptID)
   type temporary_type =
   record
     decimal("") Teacher1;
     decimal("") Teacher2;
     decimal("") Teacher3;
   end;
   temp::initialize(in) =
   begin
     temp.Teacher1 :: 0;
     temp.Teacher2 :: 0;
     temp.Teacher3 :: 0;
   end;
   temp::rollup(temp,in) =
   begin
     temp.Teacher1::if(in.Teacher=='Teacher1' and in.IsAssigned ==...
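For readers who want to verify the expected output of this question outside Abinitio, the pivot can be sketched in plain Python. The `pivot_assignments` helper is a hypothetical name; it follows the same idea as the rollup, initializing every teacher column to 0 per department and setting it to 1 when an assigned row is seen.

```python
def pivot_assignments(rows, teachers):
    """Turn (dept, teacher, is_assigned) rows into one record per department."""
    out = {}
    for dept, teacher, is_assigned in rows:
        # initialize step: every teacher column starts at 0 for a new department
        rec = out.setdefault(dept, {t: 0 for t in teachers})
        # rollup step: flag the teacher column when the row is assigned
        if is_assigned == 1 and teacher in rec:
            rec[teacher] = 1
    return out


rows = [
    ("D1", "Teacher1", 1),
    ("D1", "Teacher2", 1),
    ("D1", "Teacher3", 0),
    ("D2", "Teacher1", 0),
    ("D2", "Teacher2", 1),
    ("D2", "Teacher3", 0),
]
pivot = pivot_assignments(rows, ["Teacher1", "Teacher2", "Teacher3"])
```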

Abinitio Interview Questions 31 m_eval commands

Abinitio Interview Questions 31 m_eval commands. Watch my YouTube video for explanation.
Quick Commands in Abinitio
35.1 Concatenation of strings
   m_eval 'string_concat("abc","cde")'
   abccde
35.2 m_eval -print-type -no-print-value 3.14159
   double
35.3 Quick testing of functions
   m_eval -include $AI_XFR/myfunctions.xfr 'getRate(1890,"JAN")'
35.4 In the context of a pset
   cat .project.pset
   DML|En|||sandbox/dml
   m_eval -context .project.pset "'\$DML is ' +  \$DML"
   "$DML is sandbox/dml"
35.5 m_eval 'lookup("ProductList","908").description'
   SSD Drive 987
35.6 m_eval '(date("YYYYMMDD")) (today() -10)'
   for example, if today=20230520
   20230510

Abinitio Interview Questions 30 M_dump commands

Abinitio Interview Questions 30 M_dump commands. Watch my YouTube video for explanation.
   m_dump file.dml mfs -start 1 -end 10
   m_dump loader.dml mfile:mfs8/tempfile.dat -partition 3
   m_dump loader.dml mfile:mfs16/tempfile.dat -select 'empid==90'
   m_dump loader.dml mfile:mfs32/tempfile.dat -print-no-data print-n-records
   m_dump loader.dml mfile:mfs32/tempfile.dat print-n-records
   m_dump loader.dml mfile:mfs32/tempfile.dat -record 12

Abinitio Interview Questions 29 Dynamic Layout

Abinitio Interview Questions 29 Dynamic Layout. Watch my YouTube video for explanation.
What is a dynamic MFS layout?
What it is:
   fixed depth
   variable depth
   dynamic mfs - build-mfs -dynamic -fixed depth
   dynamic layout
How it is done:
   fixed depth: mfile:dynamic:n OR mfile:dynamic:$DEPTH
   variable depth: mfile:dynamic:-1:data-path[:MB-per-partition[:max-depth]]
   -1 means the Co>Operating System (COP), not the user, decides the depth of parallelism at runtime
Creating a dynamic single-directory MFS:
   build-mfs -dynamic -singledir -mfs-depth 64 -mfs-mount s3://my-bucket/mfs-64way
   layout: s3://my-bucket/mfs-64way OR mfile:dynamic:64
What a...

Abinitio Interview Questions 28 Abinitiorc Files

Abinitio Interview Questions 28 Abinitiorc Files. Watch my YouTube video for explanation.
What are the different abinitiorc files?
30.1 Global abinitiorc file: /etc/abinitio/abinitiorc
   It sits at host level and is accessible to all installations on the same host.
   a. It is an optional abinitiorc file.
   b. It is global in nature: its declarations affect all installations on the host.
   c. Its variables cannot be overridden.
   d. The include statement can be used in this config file.
   Use cases: AB_CHARSET, AB_OUTPUT_FILE_UMASK
   UNIX Windows Super User (Admin)
30.2 Server installation specific abinitiorc file: $AB_HOME/config/abinitiorc   ...

Abinitio Interview Questions 27 Date Algorithms

Abinitio Interview Questions 27 Date Algorithms. Watch my YouTube video for explanation.
Take 2 dates and derive the quarter boundaries: the first date of the quarter for the first date, and the last date of the quarter for the second date. There will be 2 output fields in the output file.
   DATE1 = 25102022 / 24022022
   DATE2 = 24082023 / 11072023
   fqdt = 25102022 -> 01102022
   lqdt = 24082023 -> 30092023
   Yr_part1 = string_substring(DATE1,5,4);   /* 2022 */
   Yr_part2 = string_substring(DATE2,5,4);   /* 2023 */
   let string(",")[int] quarter_dates1=["0101"+Yr_part1,'0104'+Yr_part1,'0107'+Yr_part1,'0110'+Yr_part1];
   let string(",")[int] quarter_dates2=["3112"+Yr_part2,'3009'+Yr_part2,'3006'+Yr_part2,'3103'+Yr_part2];
   for (i,i<4) begin
     qstartdt = if((date("DDMMYYYY"))quarter_dates1[i]<(date("DDMMYYYY"))fqdt) (date("DDMMYYYY"))quarter_dates1[i];
     qenddt = if((date("DDMMYYYY"))quarter_dates2[i]>(date("DDMMYYYY"))lqdt) (date...
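As a cross-check on the expected values in this question, the quarter boundaries can also be computed with plain date arithmetic in Python. Note that this sketch swaps in a different technique from the notes: instead of comparing against a list of candidate quarter dates, it derives the quarter's first month directly. The helper names are hypothetical.

```python
from datetime import date, datetime, timedelta

FMT = "%d%m%Y"  # matches the DDMMYYYY format used in the question


def parse(text):
    """Parse a DDMMYYYY string into a date."""
    return datetime.strptime(text, FMT).date()


def quarter_start(d):
    """First day of the quarter containing d."""
    first_month = 3 * ((d.month - 1) // 3) + 1  # 1, 4, 7 or 10
    return date(d.year, first_month, 1)


def quarter_end(d):
    """Last day of the quarter containing d."""
    start = quarter_start(d)
    if start.month == 10:
        next_start = date(start.year + 1, 1, 1)
    else:
        next_start = date(start.year, start.month + 3, 1)
    # one day before the next quarter begins
    return next_start - timedelta(days=1)


fqdt = quarter_start(parse("25102022")).strftime(FMT)
lqdt = quarter_end(parse("24082023")).strftime(FMT)
fqdt_alt = quarter_start(parse("24022022")).strftime(FMT)
lqdt_alt = quarter_end(parse("11072023")).strftime(FMT)
```

For the sample dates this gives 01102022 and 30092023, the same values listed in the notes.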

Abinitio Interview Questions 26

Watch my YouTube video for explanation.
Get the unique values from the input data
Input
   code, country
   1, India
   2, Mumbai
   1, USA
   2, Newyork
   1, UK
   2, Edinburgh
   2, London
Output
   code, countries
   2, India,Mumbai
   2, USA,Newyork
   2, UK,Edinburgh,London
   type temporary_type =
   record
     string(",") l_countries;
     decimal("") cnt;
   end;
   temp::initialize(in) =
   begin
     temp.l_countries :: in.country;
     temp.cnt :: 0;
   end;
   out::key_change(in1,in2) =
   begin
     out :: in1.code == 1;
   end;
   temp::rollup(temp,in) =
   begin
     temp.l_countries :: if (temp.cnt != 0) string_concat(temp.l_countries, ',', in.country) else temp.l_countries;
     temp.cnt :: temp.cnt + 1;
   end;
   out::finalize(temp,in) =
   begin
     out.code :: in.code;
     out.countries :: temp.l_countries;
   end;
Output
   code, countries
   2, India,Mumbai
   2, USA,Newyork
   2, UK,Edinburgh
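The grouping rule in this question (start a new group whenever a record with code 1 arrives, then append the following records to it) can be checked with a short Python sketch. It is only an illustration of the logic, with a hypothetical `rollup_on_key_change` helper; like the finalize step in the notes, it reports the code of the last record in each group.

```python
def rollup_on_key_change(rows):
    """Group consecutive rows, opening a new group at each row with code 1."""
    groups = []
    for code, country in rows:
        if code == 1 or not groups:
            groups.append({"code": code, "countries": []})
        group = groups[-1]
        group["code"] = code  # finalize keeps the last record's code
        group["countries"].append(country)
    return [(g["code"], ",".join(g["countries"])) for g in groups]


rows = [
    (1, "India"),
    (2, "Mumbai"),
    (1, "USA"),
    (2, "Newyork"),
    (1, "UK"),
    (2, "Edinburgh"),
    (2, "London"),
]
output = rollup_on_key_change(rows)
```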

Abinitio Interview Questions 25 Day to Day Abinitio Commands

Abinitio Interview Questions 25 Day to Day Abinitio Commands. Watch my YouTube video for explanation.
a. How to know the depth of an MFS
   m_expand -n <mfs>
b. air sandbox diff ==> display the differences between 2 graphs
   Comparing 2 files in different sandboxes:
   air sandbox diff <sandbox1-path>/a.dml <sandbox2-path>/a.dml
   Comparing a sandbox file with the TR file:
   air sandbox diff my-sandbox/dml/b.dml
   Comparing a sandbox graph with the current EME version of the graph:
   air sandbox diff -version current my-sand/mp/p.mp
   Comparing a TR version with the sandbox graph:
   air sandbox diff <version of eme> my-sand/mp/p.mp
c. Count the number of fields in a file
   head -1 file-name | sed 's/|/\n/g' | wc -l
d. plan-admin set parameter-name new-value
e. m-password -prompt/-password   -restrict restriction | -unre...

Abinitio Interview Questions 24 Abinitio Parallelism Advanced

Abinitio Interview Questions 24 Abinitio Parallelism Advanced. Watch my YouTube video for explanation.
Abinitio Parallelism
Component parallelism
   Abinitio components that increase component parallelism: replicate, input file, reformat, FBE, PBE, dedup sorted, split
   Abinitio components that decrease component parallelism: gather, join, fuse, combine, concatenate
Pipeline parallelism
   SORT, ROLLUP, JOIN, Dedup Sorted
   Phases / checkpoints
   Component folding
   Continuous graphs: Checkpoint --> Compute points
Data parallelism
   Types of layouts: data layout, processing layout
   file:serial mfile:mfs...

Abinitio Interview Questions 23 Multifile System Part 2

Abinitio Interview Questions 23 Multifile System Part 2. Watch my YouTube video for explanation. Please look into the class notes here for your reference:
b. Use the m_partition command
   m_partition <source-url-path> <destination-url-path> <DML> [KEY]
   Partitions a serial file into a multifile
   Repartitions a multifile into another multifile
c. m_mv, m_cp, m_gzip, m_gunzip
d. If the depth is the same, copy the serial locations one by one (manual workaround)
   Target server: m_touch mfs-file-name
   src:
   cp <partition#0 of source> <partition#0 of destination>
   cp <partition#1 of source> <partition#1 of destination>
   cp <partition#n-1 of source> <partition#n-1 of destination>

Abinitio Interview Questions 22 Multifile System Part 1

Watch my YouTube video for explanation.
a. Create a graph and use an all-to-all flow:
   Input -> Partitioning Component -> <all-to-all> -> Gather/Merge -> Output

Unix Miscellaneous Commands | 10 Useful UNIX commands

Unix Miscellaneous Commands | 10 Useful UNIX commands. For class notes: Unix commands and use cases
1. tar command: what it is and why we use it
   tar <options> <archivename> <files to be archived>
   tar -xzvf myfiles.tar.gz
   tar -xzvf myfiles.tar.gz -C /dir/test
   tar -tvf myfiles.tar.gz
   tar -czvf myfiles.tar.gz /dir/test2
2. source command
   source abc.ksh
3. mount command
4. split
   split -l 200 myfile datapunit
   datapunita  datapunitb  datapunitc  datapunitd  datapunite
   split -b 200k myfile datapundit
   datapunita  datapunitb  datapunitc  datapunitd  datapunite
5. df / du
   df: mounted on, size in blocks, free, used
   df --total
   du -d 1 /home/mandeep/test
   du -c -h /home/mandeep/test
   du -a -h /home/mandeep/test
6. sed '/exit/d' filename.txt
7. sed '2,$ s/unix/linux/' geekfile.txt
8. awk '/UUID/ {print $0}' /etc/fstab
9. finger David
   cat emailist.txt
   a@gmai.com b@datapunit.org.in
10. echo -e "Hi There, \n Please Find the attached ...

Different ways to create Abinitio MFS or How to create MFS in Abinitio

Different ways to create Abinitio MFS | How to create MFS in Abinitio | multifile system in Abinitio. For class notes as below:
1. build_mfs -env <path-to-stdenv>/AB_ENV_ROOT data_areas disk1/p1 disk2/p2 -mfs-depth 4
   Or a single-directory MFS: hdfs, s3a, gs, Azure Blob (wasb, wasbs), Azure Data Lake (abfss), mvs
   s3a:////bucket/pdir/
   gs://hotname/pdir/
   wasb://containeraccount/dir/path
   build-mfs -basedir /abinitio/apps/ste-enable/dataPundit/stdenv -mfs-depth 8 -mfs-mount /abinitio/data/papaEarth/dataPundit/mfs -data-areas /abinitio/data/papaEarth/dataPundit/mfs/parts
   build-mfs -dynamic -singledir -basedir /abinitio/apps/sandboxes/papaEarth/Projects/sand/stdenv -mfs-depth 64 -mfs-mount /abinitio/data/papaEarth/dataPundit/mfs
2. m_mkfs
3. install_environement mfs-depth, data_areas
4. m_touch - //datapundit.in/user/nd/mfs/mfs-4way/dust/a.dat //datapundit.in/user/nd/mfs/mfs-4way/dus...

DynamoDb Schema Design as an Expert Way

Watch my YouTube video for explanation. Please look into the class notes here for your reference: DynamoDB Schema Design
** Limit the access patterns
** Throttling: reads exceed the defined RCU
** Data latency, storage
Use NoSQL Workbench
Create directly in DynamoDB
Use PartiQL
   [Partition Key]            {Primary Key}
   [Partition Key, Sort Key]  {Composite primary key}
   Partition Key - Storage Index
   Sort Key - Ordering
   dept, opening_date
   1, 12-03-2022
   2, 22-09-2020
   2, 21-12-2020
   3, 17-06-2021
   4, 16-05-2021
Partitions --->
   partition 1
   1, 12-03-2022
   3, 17-06-2021
   partition 2
   2, 22-09-2020
   2, 21-12-2020
Case I: when we look up a known globally unique key
   empid, joining_date
   1, 16-05-2021
   2, 21-12-2020
   3, 17-06-2021
   use the partition key
Case II when the key is non unique OR when the...