Friday, 6 February 2015

Extracting named columns of data with awk

Sometimes when a vendor updates a diagnostic tool, they change the order of the fields, so extracting columns of data by index is not a good idea. This simple awk script gets data by the name of the column rather than the index.

The samples contain the same data but the columns are in different orders. The results from both files should be the same.

samp0.txt
id;width;rank;age;height
1;900;4;20;500
2;1900;11;32;200
3;70;8;43;50

samp1.txt

id;age;height;width;rank
1;20;500;900;4
2;32;200;1900;11
3;43;50;70;8
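
To see why a fixed index is fragile, here is a minimal sketch that pulls field 4 from the header and first data row of each sample layout. The variable names `a` and `b` are only illustrative.

```shell
# Same data in two column orders; field 4 means something different in each
a=$(printf 'id;width;rank;age;height\n1;900;4;20;500\n' | awk -F';' 'NR > 1 { print $4 }')
b=$(printf 'id;age;height;width;rank\n1;20;500;900;4\n' | awk -F';' 'NR > 1 { print $4 }')
echo "layout 0, field 4: $a"   # this is the age value
echo "layout 1, field 4: $b"   # this is the width value
```

The same index returns the age from one layout and the width from the other, which is the problem a lookup by column name avoids.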

GetField.awk

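BEGIN {
 FS=";"
 C=0              # index of the requested column; 0 until the header is found
 name=ARGV[2]     # column name given after the input file argument
 delete ARGV[2]   # stop awk from trying to open the column name as a file
}
C==0 {
 # Header line: find the field whose name matches the requested column
 for(i=1; i<=NF; i++) {
  if($i == name) {
   C=i
  }
 }
 next
}
$C!="" {
 print $C
}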

Run the commands
]$ cat samp0.txt | awk -f GetField.awk - age 2>/dev/null
20
32
43
]$ cat samp1.txt | awk -f GetField.awk - age 2>/dev/null
20
32
43
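
A column name given as a trailing argument is also seen by awk as an input file name unless it is removed from ARGV. One possible alternative, sketched here as an assumption rather than as part of the original script, passes the name with `-v` instead. The function name `get_column` and the awk variable `col` are hypothetical.

```shell
# get_column NAME [FILE...]: print the values of the column called NAME.
# Reads standard input when no FILE is given. Names here are illustrative.
get_column() {
  col_name=$1
  shift
  awk -F';' -v col="$col_name" '
    FNR == 1 {
      # Header line of each input: locate the requested column by name
      c = 0
      for (i = 1; i <= NF; i++) if ($i == col) c = i
      if (!c) { print "no such column: " col > "/dev/stderr"; exit 1 }
      next
    }
    $c != "" { print $c }
  ' "$@"
}

# Same data, different column order
out0=$(printf 'id;width;rank;age;height\n1;900;4;20;500\n2;1900;11;32;200\n' | get_column age)
out1=$(printf 'id;age;height;width;rank\n1;20;500;900;4\n2;32;200;1900;11\n' | get_column age)
[ "$out0" = "$out1" ] && echo "same result from both layouts"
```

Because the name no longer travels in the argument list, awk has no extra file to open, and an unknown column name is reported with a non-zero exit status instead of silently printing nothing.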
