#!/usr/bin/perl -w

# This is an example of using named capture containers to parse data from lines
# when the columns of data could move around, new columns are added and even if
# a column is removed.
# Matching a pattern with multiple parts cannot deal with columns that move or are missing.
# Each part must be matched on its own line.
# Every time a match is made it is put into the hash for use later.

# sample3.data as input
#######################
# 01:00:00 httpd=on sshd=on crond=on ntpd=on winbind=off cups=off
# 01:01:00 [567] httpd=on sshd=on crond=on ntpd=on winbind=off cups=off
# 01:02:00 [567] crond=on ntpd=on winbind=off httpd=off cups=off sshd=on
# 01:03:00 [PID:567] ntpd=on winbind=off cups=off sshd=on named=on httpd=on crond=on
# 01:04:00 [PID:567] named=on httpd=on crond=on

# results from output
#######################
# Time       httpd   ntpd    sshd
# 01:00:00   on      on      on
# 01:01:00   on      on      on
# 01:02:00   off     on
# 01:03:00   on      on      on
# 01:04:00   on

my %DataSet;

# The data set is dynamically gathered so to change
# the report just add or remove column names here.
my @ReportFields = ( "httpd", "ntpd", "sshd" );

# Store one value in the hash, keyed by time and then by field name.
sub Pack {
    my $Time      = shift;
    my $FieldName = shift;
    my $DataValue = shift;
    $DataSet{$Time}->{$FieldName} = $DataValue;
}

while (<>) {
    my $Line = $_;
    chomp $Line;

    # Skip lines without a leading timestamp so a stale capture is never reused.
    next unless $Line =~ m/^(?<time>\d\d:\d\d:\d\d).*/;
    my $NewTime = $+{time};

    foreach my $R ( sort @ReportFields ) {
        # Note: the pattern requires a space after the value, so a field
        # at the very end of a line is not matched.
        if ($Line =~ m/.* \Q$R\E=(?<value>\w*) .*/) {
            Pack($NewTime, $R, $+{value});
        }
    }
}

# Header row.
print "Time\t\t";
foreach my $F ( sort @ReportFields ) {
    print $F."\t";
}
print "\n";

# One row per timestamp; fields that were not captured are skipped.
foreach my $Time (sort keys %DataSet) {
    print $Time."\t";
    foreach my $F ( sort @ReportFields ) {
        print " ".$DataSet{$Time}->{$F}."\t" if (defined $DataSet{$Time}->{$F});
    }
    print "\n";
}
Thursday, 27 July 2017
Parsing a moving target.
I was presented with a problem where the raw data was expected to have columns added and moved around at some point, but a report was needed that could cope with this. I chose to use Perl named capture containers to solve it.
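The field pattern in the script above expects a space after the value, so a field that is the last thing on a line (such as sshd=on at 01:02:00 in the sample data) is not picked up. Here is a minimal sketch of one possible adjustment, assuming fields are always separated by whitespace. The helper name field_value is hypothetical and not part of the original script.

```perl
use strict;
use warnings;

# Hypothetical helper: return the value of "name=value" in a line, or undef.
sub field_value {
    my ($line, $name) = @_;
    # (?:^|\s) : the field starts the line or follows whitespace
    # (?=\s|$) : the value ends at whitespace or at the end of the line
    if ($line =~ m/(?:^|\s)\Q$name\E=(?<value>\w*)(?=\s|$)/) {
        return $+{value};
    }
    return;    # field not present on this line
}

# Two sample lines with the columns in different orders.
my @lines = (
    "01:00:00 httpd=on sshd=on crond=on ntpd=on winbind=off cups=off",
    "01:02:00 [567] crond=on ntpd=on winbind=off httpd=off cups=off sshd=on",
);

for my $line (@lines) {
    for my $name (qw(httpd ntpd sshd named)) {
        my $value = field_value($line, $name);
        printf "%-6s %s\n", $name, defined $value ? $value : "(missing)";
    }
    print "\n";
}
```

Because each field is looked up by name on its own, the order of the columns does not matter, and a field that is absent simply comes back undefined.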
Labels:
Perl
One more thing to note is that my example uses data fields that all share the same format, so I was able to use a for loop to match the fields I wanted. A typical log file records a variety of data, so several different regular expressions, each matched on its own line, will be needed.
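As a rough sketch of that situation, the lines of a mixed log could be tried against a list of patterns, each with its own named captures. The two line formats and the names @patterns and parse_line are invented for illustration; a real log would need its own patterns.

```perl
use strict;
use warnings;

# Hypothetical line formats, one compiled pattern per format.
my @patterns = (
    qr/^(?<time>\d\d:\d\d:\d\d) load=(?<load>\d+\.\d+)$/,
    qr/^(?<time>\d\d:\d\d:\d\d) user (?<user>\w+) logged (?<action>in|out)$/,
);

# Try each pattern in turn; return a hash ref of the named captures
# from the first one that matches, or undef if none does.
sub parse_line {
    my ($line) = @_;
    for my $re (@patterns) {
        if ($line =~ $re) {
            my %fields = %+;    # copy the captures before the next match
            return \%fields;
        }
    }
    return;
}

for my $line ("01:00:00 load=0.42", "01:01:00 user alice logged in", "garbage") {
    my $fields = parse_line($line);
    if ($fields) {
        print join(", ", map { "$_=$fields->{$_}" } sort keys %$fields), "\n";
    } else {
        print "unparsed: $line\n";
    }
}
```

Copying %+ into an ordinary hash matters here, because the captures are replaced by the next successful match.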
Happy hacking.