Using awk for data manipulation - Unix & Linux Stack Exchange

   Important AWK commands:

This blog is useful for beginner and intermediate data engineers who want to advance in the field of data engineering and data administration.

   





AWK A few More Cases Day 2 Day usage

The following 5 examples can be used in day-to-day work to solve real-world problems.

1.

We often deal with configuration files such as .config and .YAML files (the latter are mostly used in Java projects). These files store data as key-value pairs, with the key and the value separated by a separator such as a colon (:) or an equals sign (=). An example file, weather_detail.config, is shown further below.

We need to retrieve the value for the key "Country".


Solution A.

awk -F : '{ if ($1 == "Country") print $2 }' weather_detail.config


Solution B.

grep 'Country' weather_detail.config | awk -F : '{print $2}'


weather_detail.config

Region:APAC

Country:India
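
Since the separator can be either a colon or an equals sign, the lookup above can be sketched as one small helper that accepts both. This is only a minimal sketch: the function name get_config_value is hypothetical (it is not part of awk), and it assumes one key-value pair per line and that the key itself contains no ':' or '='.

```shell
# Hypothetical helper: print the value stored under KEY in a key-value file
# whose separator is either ':' or '='.
get_config_value() {
    key=$1
    file=$2
    awk -F'[:=]' -v key="$key" '
        {
            k = $1
            gsub(/^[ \t]+|[ \t]+$/, "", k)     # trim spaces around the key
        }
        k == key {
            v = substr($0, length($1) + 2)     # everything after the first separator
            gsub(/^[ \t]+|[ \t]+$/, "", v)     # trim spaces around the value
            print v
            exit                               # stop at the first match
        }
    ' "$file"
}

# Sample data copied from weather_detail.config
tmp_config=$(mktemp)
printf 'Region:APAC\nCountry:India\n' > "$tmp_config"

country=$(get_config_value Country "$tmp_config")
echo "$country"
rm -f "$tmp_config"
```

Comparing the whole first field with the key avoids the false matches that a plain grep 'Country' can give when the word also appears inside a value.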


2. 

Similarly, the requirement may be to get the size or the name of the most recently modified file in a Unix directory. ls -lrt sorts by modification time with the newest entry last, so tail -1 picks the latest file.

ls -lrt | tail -1 | awk '{print $5}'   # file size in bytes

ls -lrt | tail -1 | awk '{print $9}'   # file name
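
Both values can also be read in a single pass, without tail. The sketch below is an illustration only: the directory and the file sample.txt are hypothetical, and it assumes the usual 9-column ls -l layout (LC_ALL=C is set to keep the date columns predictable) and file names without spaces.

```shell
# Work in a throwaway directory so the result is predictable
demo_dir=$(mktemp -d)
printf 'hello' > "$demo_dir/sample.txt"    # hypothetical 5-byte file

# Remember size and name from every row; after the last row,
# the variables hold the values of the newest file.
latest=$(LC_ALL=C ls -lrt "$demo_dir" |
    awk '{ size = $5; name = $9 } END { print size, name }')
echo "$latest"
rm -rf "$demo_dir"
```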


3.

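Get the latest start_ts entry from a huge log file. For the sample log lines shown below, the third colon-separated field is the name of the generated report file.

grep "Data:start_ts" mygraph_20220202.log | tail -1 | awk -F : '{print $3}'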


cat mygraphlog_file_20220803.log

.....

.....

.....

Japan Data:start_ts 20220802 has been generated:japan_clg_report.20220802120908888.pdf

.........

.........

.........

Egypt Data:start_ts 20220804 has been generated:japan_clg_report.20220804120909880.pdf
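
If the date itself is needed rather than the report file name, the second colon-separated field can be split on spaces. The following is a minimal sketch under the assumption that every matching line has the same layout as the two sample lines above; the temporary file stands in for the real log.

```shell
# Two sample lines copied from the log shown above
log_file=$(mktemp)
cat > "$log_file" <<'EOF'
Japan Data:start_ts 20220802 has been generated:japan_clg_report.20220802120908888.pdf
Egypt Data:start_ts 20220804 has been generated:japan_clg_report.20220804120909880.pdf
EOF

latest_ts=$(awk -F: '
    $2 ~ /^start_ts / {
        split($2, words, " ")   # words[1] = "start_ts", words[2] = the date
        ts = words[2]           # keep overwriting; the last match wins
    }
    END { print ts }
' "$log_file")
echo "$latest_ts"
rm -f "$log_file"
```

Here awk does the work of grep and tail -1 as well: it filters the lines, and only the value from the last matching line survives to the END block.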



4. Report generation - well-formatted logs/issue logs


for dir in "$dir1" "$dir2" "$dir3"; do
      for file in "$dir"/*; do
        grep "generated on $revenuedt" "$file" | awk -F : '{print $2}'
      done
done

   

cat aum_data_file_20220803.rpt

.....

.....

.....

aum data file has been generated on 20220802 file_name:japan_clg_report.20220802120908888.rpt

.........

.........

.........

aum data file has been generated on 20220802 file_name:japan_clg_report.20220802120908888.rpt
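
The filtering step can also be done by awk alone by passing the date in with -v. This is a sketch under stated assumptions: the value of revenuedt and the temporary directory are made up for the demonstration, and the report line is copied from the sample file above.

```shell
revenuedt=20220802                     # hypothetical report date
dir1=$(mktemp -d)
printf '%s\n' \
    'some other line' \
    'aum data file has been generated on 20220802 file_name:japan_clg_report.20220802120908888.rpt' \
    > "$dir1/aum_data_file_20220803.rpt"

# index() does a plain substring search, so the date is not treated as a regex
report=$(awk -F: -v dt="$revenuedt" '
    index($0, "generated on " dt) { print $2 }
' "$dir1"/*.rpt)
echo "$report"
rm -rf "$dir1"
```

Because awk accepts several files at once, the inner loop over files is not needed in this form; the outer loop over directories can stay as it is.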




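5. Generating a string by combining fields

touch does not read file names from standard input, so xargs is used to pass the hyphen-free names to it as arguments. This creates empty files with the new names.

ls -lart | awk '{print $9}' | sed 's/-//g' | xargs touch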

 

a-abc-dfh-lop.pdf  ---> aabcdfhlop.pdf

actual-physical-path.log --->actualphysicalpath.log 
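
The same mapping can be previewed with awk alone before any file is created. This is a minimal sketch that uses the two file names above as input instead of real ls output.

```shell
# gsub() removes every hyphen from a copy of the name; $0 keeps the original
new_names=$(printf '%s\n' 'a-abc-dfh-lop.pdf' 'actual-physical-path.log' |
    awk '{ name = $0; gsub(/-/, "", name); print $0 " -> " name }')
echo "$new_names"
```

Printing the old and new names side by side makes it easy to check the result before passing the new names to touch or mv.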



                              Authored by datapunditeducation@gmail.com 















