Processing rolling logfiles backwards with Perl

I needed a script that runs on the server, collects daily statistics from our rolling log files, and sends them by email. The log is split into 64 MB files, and its total size is limited to 1 GB. I didn't want to parse the whole gigabyte of logs, so I decided to start from the most recent file and read backwards until I reached the previous day's entries. I found two ways to read a file backwards in Perl. The first is slower but uses a bounded amount of memory (in my case, about 2 minutes and up to 16 MB of RAM):

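use Fcntl 'O_RDONLY';
use Tie::File;

# Tie the file to an array; Tie::File fetches lines on demand instead of loading the whole file.
tie(@lines, "Tie::File", $fname, mode => O_RDONLY) or die "Can't tie $fname: $!";
# Walk from the last line down to the first one (index 0 included).
for ($i = $#lines; $i >= 0; $i--) {
	if (not &$apply($lines[$i])) {
		untie @lines;
		return 0;
	}
}
untie @lines;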
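
The loop above assumes it sits inside a subroutine and that $apply is a code reference returning false when the scan should stop. A self-contained sketch of the same idea follows; the sub name scan_file_backwards, the temporary file, and the sample lines are made up for illustration and are not taken from the package.

```perl
use strict;
use warnings;
use Fcntl 'O_RDONLY';
use File::Temp 'tempfile';
use Tie::File;

# Walk a file from its last line to its first, calling $apply on each line.
# Returns 0 as soon as $apply returns false, 1 if every line was visited.
sub scan_file_backwards {
    my ($fname, $apply) = @_;
    tie(my @lines, 'Tie::File', $fname, mode => O_RDONLY)
        or die "Can't tie $fname: $!";
    my $completed = 1;
    for (my $i = $#lines; $i >= 0; $i--) {
        if (not $apply->($lines[$i])) {
            $completed = 0;
            last;
        }
    }
    untie @lines;
    return $completed;
}

# Hypothetical sample data: one line from the previous day, two from "today".
my ($fh, $fname) = tempfile(UNLINK => 1);
print {$fh} "2024-01-01 a\n2024-01-02 b\n2024-01-02 c\n";
close($fh);

my @seen;
my $completed = scan_file_backwards($fname, sub {
    my ($line) = @_;
    return 0 unless $line =~ /^2024-01-02 /;   # reached the previous day: stop
    push @seen, $line;
    return 1;
});
print "$_\n" for @seen;
```

Tie::File strips the line terminator by default, so the callback receives lines without a trailing newline and no chomp is needed.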

The second approach is faster but needs at least as much free memory as the size of the log file, and in practice more than twice as much when several files are processed one after another (in my case, 1:02 min and up to 150 MB of RAM):

open(my $log, '<', $fname) or die "Can't open $fname: $!";
# Slurp every line into memory, then reverse the list.
@lines = reverse <$log>;
close($log);
foreach $line (@lines) {
	chomp $line;
	if (not &$apply($line)) {
		return 0;
	}
}

So I chose the first approach, because I considered a stable memory footprint on the server more important than the time the script takes to complete. Here is the full code of the package I wrote, which provides handy functions for scanning rolling log files forwards and backwards: logscan_pl.txt
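
The package itself is not reproduced here. As a rough illustration of the overall idea only, and not the actual logscan.pl code, the sketch below walks several files, newest first, and stops the whole scan as soon as the callback returns false. The sub name scan_rolling_backwards, the file names app.log and app.log.1, and the sample lines are all assumptions.

```perl
use strict;
use warnings;
use Fcntl 'O_RDONLY';
use File::Temp 'tempdir';
use Tie::File;

# Scan a list of log files (given newest first), each from its last line up.
# $apply gets one line at a time and returns false to stop the whole scan.
# Returns 1 if every line of every file was visited, 0 if stopped early.
sub scan_rolling_backwards {
    my ($files, $apply) = @_;
    for my $fname (@$files) {
        tie(my @lines, 'Tie::File', $fname, mode => O_RDONLY)
            or die "Can't tie $fname: $!";
        my $stopped = 0;
        for (my $i = $#lines; $i >= 0; $i--) {
            if (not $apply->($lines[$i])) { $stopped = 1; last; }
        }
        untie @lines;
        return 0 if $stopped;   # older files are never opened
    }
    return 1;
}

# Hypothetical rolled files: app.log is the newest, app.log.1 the older one.
my $dir = tempdir(CLEANUP => 1);
my %content = (
    "$dir/app.log.1" => "day1 x\nday2 a\n",
    "$dir/app.log"   => "day2 b\nday2 c\n",
);
for my $fname (keys %content) {
    open(my $fh, '>', $fname) or die "Can't write $fname: $!";
    print {$fh} $content{$fname};
    close($fh);
}

my @seen;
my $completed = scan_rolling_backwards(
    ["$dir/app.log", "$dir/app.log.1"],
    sub {
        my ($line) = @_;
        return 0 unless $line =~ /^day2 /;   # previous day reached
        push @seen, $line;
        return 1;
    },
);
print "$_\n" for @seen;
```

On a real server the newest-first list could probably be built by sorting on modification age, for example sort { -M $a <=> -M $b } glob("$dir/app.log*"), assuming the rolled files share a common prefix.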

And here is a usage sample:

require 'logscan.pl';
$errors = 0;
logscan::scan("Joanna", sub {
	$_ = shift;
	# Capture the timestamp, then the bracketed URI, user and roles values.
	if (/(.*) INFO.*Returning error by \[([^\]]*)\].*user=\[([^\]]*)\].*roles=\[([^\]]*)\]/) {
		my ($time, $uri, $user, $role) = ($1, $2, $3, $4);
		$errors++;
		print "$time $uri $user $role\n";
	}
	return 1;
});
print "\n\nTotal errors: $errors\n";
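
The pattern in the callback can be tried in isolation. The sketch below assumes that the square brackets in the pattern are meant literally, and the log line is a hypothetical one invented only to exercise the match; the real log format may differ.

```perl
use strict;
use warnings;

# Hypothetical log line, invented only to exercise the pattern.
my $line = '2009-03-12 10:15:42 INFO Returning error by [/api/orders] user=[jsmith] roles=[admin]';

my @fields;
if ($line =~ /(.*) INFO.*Returning error by \[([^\]]*)\].*user=\[([^\]]*)\].*roles=\[([^\]]*)\]/) {
    # $1 = text before " INFO", $2..$4 = contents of the three bracket pairs.
    @fields = ($1, $2, $3, $4);
    my ($time, $uri, $user, $role) = @fields;
    print "$time $uri $user $role\n";
}
```

Each [^\]]* group stops at the first closing bracket, so a value cannot run into the next field.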
