Thursday, 27 July 2017

Parsing a moving target.

I was presented with a problem where the raw data was expected to have columns added, removed and moved around at some point, but a report was still needed that could cope with this. I chose to use Perl named capture containers (named capture groups) to solve it.


#!/usr/bin/perl -w

# This is an example of using named capture containers to parse data from lines
# when the columns of data can move around, new columns can be added and
# existing columns can be removed.

# Matching one pattern with multiple parts cannot deal with columns that move or are missing.
# Each field must be matched separately, with its own pattern.
# Every time a match is made it is put into the hash for use later.

# sample3.data as input
#######################
# 01:00:00 httpd=on sshd=on crond=on ntpd=on winbind=off cups=off
# 01:01:00 [567] httpd=on sshd=on crond=on ntpd=on winbind=off cups=off
# 01:02:00 [567] crond=on ntpd=on winbind=off httpd=off cups=off sshd=on
# 01:03:00 [PID:567] ntpd=on winbind=off cups=off sshd=on named=on httpd=on crond=on 
# 01:04:00 [PID:567] named=on httpd=on crond=on

# results from output ("-" marks a value missing from that line)
#######################
# Time            httpd   ntpd    sshd
# 01:00:00        on      on      on
# 01:01:00        on      on      on
# 01:02:00        off     on      on
# 01:03:00        on      on      on
# 01:04:00        on      -       -


my %DataSet;
# The data set is gathered dynamically, so to change
# the report just add or remove column names here.
my @ReportFields = ( "httpd", "ntpd", "sshd" );

# Store one value under its timestamp and field name.
sub Pack {
 my ($Time, $FieldName, $DataValue) = @_;
 $DataSet{$Time}->{$FieldName} = $DataValue;
}

while (my $Line = <>) {
 chomp $Line;
 # Skip lines without a leading timestamp; there is nothing to key the values on.
 next unless $Line =~ m/^(?<time>\d\d:\d\d:\d\d)/;
 my $NewTime = $+{time};

 foreach my $R (@ReportFields) {
  # The field may appear anywhere on the line, including as the last column,
  # so require whitespace or a line boundary on each side rather than a space.
  if ($Line =~ m/(?<!\S)\Q$R\E=(?<value>\w*)(?!\S)/) { Pack($NewTime, $R, $+{value}) }
 }
}

print "Time\t\t";
foreach my $F ( sort @ReportFields ) {
 print "$F\t";
}
print "\n";

foreach my $Time (sort keys %DataSet) {
 print "$Time\t";
 foreach my $F ( sort @ReportFields ) {
  # Print a placeholder for a missing value so the columns stay aligned.
  my $Value = $DataSet{$Time}->{$F} // "-";
  print "$Value\t";
 }
 print "\n";
}

Monday, 3 July 2017

Stop eating my pipe!

We love to use while loops in our scripts and they are a great way to read a file one line at a time to get a job done.

For example:

cat myFile.txt | while read LINE; do
   echo "$LINE"
   sleep 1
done


Now along comes SSH, and let's say that your file contains a list of hostnames you want to get the uptime from:

cat myHostsFile.txt | while read LINE; do
   echo -n "Host = $LINE "
   ssh "$LINE" "uptime"
done


How disappointed you are when your loop stops after the first host. This happens because every child process inherits its first three file descriptors (STDIN, STDOUT and STDERR) from its parent, so SSH shares the loop's STDIN and takes everything left in the pipe for itself.
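
The effect can be reproduced without any remote hosts. The following is only a minimal sketch: the function name eaten_loop is made up for illustration, and cat stands in for SSH as "a child that reads STDIN".

```shell
# eaten_loop: hypothetical demo, not part of the original script.
eaten_loop() {
   printf 'a\nb\nc\n' | while read -r LINE; do
      echo "Host = $LINE"
      # Stand-in for ssh: this child inherits the loop's STDIN
      # and drains whatever is still waiting in the pipe.
      cat >/dev/null
   done
}

eaten_loop
```

Only the first line gets a "Host =" line, because the next read finds the pipe already empty.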

Sometimes you may want that but this time you don't. What can you do?

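The simple solution here is to disassociate SSH from the loop's STDIN, and you do this using a simple redirection.

cat myHostsFile.txt | while read LINE; do
   echo -n "Host = $LINE "
   ssh "$LINE" "uptime" </dev/null
done


Something to keep in mind is that STDIN supplies a data stream to the process, it does not take one, so it is redirected with < and not >. For this reason we attach it to /dev/null, which returns end-of-file straight away and leaves SSH nothing to consume; /dev/zero would instead feed it an endless stream of zero bytes. Running ssh with the -n option has the same effect.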
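
Another way to keep a child away from the loop's input, sketched here as one possible variation rather than the method used above, is to feed the loop on a separate file descriptor so that STDIN is never shared in the first place. The function name read_on_fd3 and the temporary file are assumptions for the demo, and cat again stands in for SSH.

```shell
# read_on_fd3 FILE: hypothetical helper that reads the list on descriptor 3.
read_on_fd3() {
   while read -r LINE <&3; do
      echo "Host = $LINE"
      # Stand-in for ssh: it may drain STDIN, but never touches descriptor 3.
      cat >/dev/null
   done 3<"$1"
}

HOSTS=$(mktemp)
printf 'a\nb\nc\n' >"$HOSTS"
# Whatever arrives on STDIN is swallowed by the child, yet the loop still
# visits every line of the list.
echo "unrelated input" | read_on_fd3 "$HOSTS"
rm -f "$HOSTS"
```

The trade-off is that the child still sees the script's real STDIN, which is useful when the remote command is meant to read from it.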