<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Ilya Boyandin &#187; perl</title>
	<atom:link href="http://blog.boyandi.net/category/perl/feed/" rel="self" type="application/rss+xml" />
	<link>http://blog.boyandi.net</link>
	<description>notes about ui, web development, visualization. links, tips and tricks</description>
	<lastBuildDate>Sat, 30 Jul 2011 22:04:35 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.2.1</generator>
		<item>
		<title>Processing rolling logfiles backwards with Perl</title>
		<link>http://blog.boyandi.net/2007/11/05/processing-rolling-logfiles-with-perl/</link>
		<comments>http://blog.boyandi.net/2007/11/05/processing-rolling-logfiles-with-perl/#comments</comments>
		<pubDate>Mon, 05 Nov 2007 10:14:07 +0000</pubDate>
		<dc:creator>Ilya Boyandin</dc:creator>
				<category><![CDATA[perl]]></category>
		<category><![CDATA[web]]></category>

		<guid isPermaLink="false">http://blog.boyandi.net/?p=10</guid>
		<description><![CDATA[<p>I needed to write a script that runs on the server, gathers daily statistics from our rolling log files and sends them by email. The log is split into 64 MB files and the total size is limited to 1 GB. I didn&#8217;t want to parse the whole gigabyte of logs, so I decided to start from [...]]]></description>
			<content:encoded><![CDATA[<p>I needed to write a script that runs on the server, gathers daily statistics from our rolling log files and sends them by email. The log is split into 64 MB files and the total size is limited to 1 GB. I didn&#8217;t want to parse the whole gigabyte of logs, so I decided to start from the most recent file and read backwards until I reached an entry from the previous date. I found two approaches to reading a file backwards with Perl. The first approach is slower but requires a fixed amount of memory (in my case about 2 minutes and up to 16 MB of RAM):</p>

<div class="wp_syntax"><div class="code"><pre class="perl" style="font-family:monospace;"><span style="color: #000066;">tie</span><span style="color: #009900;">&#40;</span><span style="color: #0000ff;">@lines</span><span style="color: #339933;">,</span> <span style="color: #ff0000;">&quot;Tie::File&quot;</span><span style="color: #339933;">,</span> <span style="color: #0000ff;">$fname</span><span style="color: #339933;">,</span> mode <span style="color: #339933;">=&gt;</span> <span style="color: #cc66cc;">0</span><span style="color: #009900;">&#41;</span> <span style="color: #b1b100;">or</span> <span style="color: #000066;">die</span> <span style="color: #ff0000;">&quot;Can't tie $fname: $!&quot;</span><span style="color: #339933;">;</span>
<span style="color: #0000ff;">$max_lines</span> <span style="color: #339933;">=</span> <span style="color: #0000ff;">$#lines</span><span style="color: #339933;">;</span>
<span style="color: #b1b100;">for</span> <span style="color: #009900;">&#40;</span><span style="color: #0000ff;">$i</span> <span style="color: #339933;">=</span> <span style="color: #0000ff;">$max_lines</span><span style="color: #339933;">;</span> <span style="color: #0000ff;">$i</span> <span style="color: #339933;">&gt;=</span> <span style="color: #cc66cc;">0</span><span style="color: #339933;">;</span> <span style="color: #0000ff;">$i</span><span style="color: #339933;">--</span><span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
	<span style="color: #b1b100;">if</span> <span style="color: #009900;">&#40;</span><span style="color: #b1b100;">not</span> <span style="color: #339933;">&amp;</span><span style="color: #0000ff;">$apply</span><span style="color: #009900;">&#40;</span><span style="color: #0000ff;">$lines</span><span style="color: #009900;">&#91;</span><span style="color: #0000ff;">$i</span><span style="color: #009900;">&#93;</span><span style="color: #009900;">&#41;</span><span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
		<span style="color: #000066;">return</span> <span style="color: #cc66cc;">0</span><span style="color: #339933;">;</span>
	<span style="color: #009900;">&#125;</span>
<span style="color: #009900;">&#125;</span></pre></div></div>

<p>The second approach is faster but needs at least as much free memory as the size of the log file, and in practice more than twice as much when processing several files one after another (in my case, 1:02 min and up to 150 MB of RAM):</p>

<div class="wp_syntax"><div class="code"><pre class="perl" style="font-family:monospace;"><span style="color: #000066;">open</span><span style="color: #009900;">&#40;</span>LOG<span style="color: #339933;">,</span> <span style="color: #0000ff;">$fname</span><span style="color: #009900;">&#41;</span> <span style="color: #b1b100;">or</span> <span style="color: #000066;">die</span> <span style="color: #ff0000;">&quot;Can't open $fname: $!&quot;</span><span style="color: #339933;">;</span>
<span style="color: #0000ff;">@lines</span> <span style="color: #339933;">=</span> <span style="color: #000066;">reverse</span> <span style="color: #339933;">&lt;</span>LOG<span style="color: #339933;">&gt;;</span>
<span style="color: #b1b100;">foreach</span> <span style="color: #0000ff;">$line</span> <span style="color: #009900;">&#40;</span><span style="color: #0000ff;">@lines</span><span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
	<span style="color: #000066;">chomp</span> <span style="color: #0000ff;">$line</span><span style="color: #339933;">;</span>
	<span style="color: #b1b100;">if</span> <span style="color: #009900;">&#40;</span><span style="color: #b1b100;">not</span> <span style="color: #339933;">&amp;</span><span style="color: #0000ff;">$apply</span><span style="color: #009900;">&#40;</span><span style="color: #0000ff;">$line</span><span style="color: #009900;">&#41;</span><span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
		<span style="color: #000066;">close</span><span style="color: #009900;">&#40;</span>LOG<span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
		<span style="color: #000066;">return</span> <span style="color: #cc66cc;">0</span><span style="color: #339933;">;</span>
	<span style="color: #009900;">&#125;</span>
<span style="color: #009900;">&#125;</span>
<span style="color: #000066;">close</span><span style="color: #009900;">&#40;</span>LOG<span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span></pre></div></div>

<p>So I chose the first approach, because I decided that a stable memory footprint on the server matters more than the time the script takes to complete. Here is the full code for the package I wrote, which provides handy functions for scanning rolling logfiles forwards and backwards: <a href="http://blog.boyandi.net/wp-content/uploads/2007/10/logscan_pl.txt" title="logscan_pl.txt">logscan_pl.txt</a></p>
<p>And here is a usage sample:</p>

<div class="wp_syntax"><div class="code"><pre class="perl" style="font-family:monospace;"><span style="color: #000066;">require</span> <span style="color: #ff0000;">'logscan.pl'</span><span style="color: #339933;">;</span>
<span style="color: #0000ff;">$errors</span> <span style="color: #339933;">=</span> <span style="color: #cc66cc;">0</span><span style="color: #339933;">;</span>
logscan<span style="color: #339933;">::</span><span style="color: #006600;">scan</span><span style="color: #009900;">&#40;</span><span style="color: #ff0000;">&quot;Joanna&quot;</span><span style="color: #339933;">,</span> <span style="color: #000000; font-weight: bold;">sub</span> <span style="color: #009900;">&#123;</span>
	<span style="color: #0000ff;">$_</span> <span style="color: #339933;">=</span> <span style="color: #000066;">shift</span><span style="color: #339933;">;</span>
	<span style="color: #b1b100;">if</span> <span style="color: #009900;">&#40;</span><span style="color: #009966; font-style: italic;">/(.*) INFO.*Returning error by \[([^\]]*)\].*user=\[([^\]]*)\].*roles=\[([^\]]*)\]/</span><span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
		<span style="color: #b1b100;">my</span> <span style="color: #009900;">&#40;</span><span style="color: #0000ff;">$time</span><span style="color: #339933;">,</span> <span style="color: #0000ff;">$uri</span><span style="color: #339933;">,</span> <span style="color: #0000ff;">$user</span><span style="color: #339933;">,</span> <span style="color: #0000ff;">$role</span><span style="color: #009900;">&#41;</span> <span style="color: #339933;">=</span> <span style="color: #009900;">&#40;</span><span style="color: #0000ff;">$1</span><span style="color: #339933;">,</span> <span style="color: #0000ff;">$2</span><span style="color: #339933;">,</span> <span style="color: #0000ff;">$3</span><span style="color: #339933;">,</span> <span style="color: #0000ff;">$4</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
		<span style="color: #0000ff;">$errors</span><span style="color: #339933;">++;</span>
		<span style="color: #000066;">print</span> <span style="color: #ff0000;">&quot;$time $uri $user $role<span style="color: #000099; font-weight: bold;">\n</span>&quot;</span><span style="color: #339933;">;</span>
	<span style="color: #009900;">&#125;</span>
	<span style="color: #000066;">return</span> <span style="color: #cc66cc;">1</span><span style="color: #339933;">;</span>
<span style="color: #009900;">&#125;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
<span style="color: #000066;">print</span> <span style="color: #ff0000;">&quot;<span style="color: #000099; font-weight: bold;">\n</span><span style="color: #000099; font-weight: bold;">\n</span>Total errors: $errors<span style="color: #000099; font-weight: bold;">\n</span>&quot;</span><span style="color: #339933;">;</span></pre></div></div>
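
<p>The &quot;start from the most recent file&quot; step mentioned at the top needs the rolled files in newest-first order. One possible way to get that order, sketched here under assumptions, is to sort the files by modification time. The helper name <code>newest_first</code> and the <code>app.log</code>, <code>app.log.1</code>, <code>app.log.2</code> file names are made up for the demo and may not match how logscan.pl or your logger names its files:</p>

<div class="wp_syntax"><div class="code"><pre class="perl" style="font-family:monospace;">use strict;
use warnings;
use File::Temp 'tempdir';

# Hypothetical helper: returns the given file names sorted by
# modification time, most recently written file first.
sub newest_first {
    my @files = @_;
    my %mtime;
    for my $f (@files) {
        my @st = stat $f or die "Can't stat $f: $!";
        $mtime{$f} = $st[9];    # element 9 of stat() is the mtime
    }
    return sort { $mtime{$b} &lt;=&gt; $mtime{$a} } @files;
}

# Demo: three empty files whose mtimes are set one hour apart.
my $dir = tempdir(CLEANUP =&gt; 1);
my %age = ('app.log' =&gt; 0, 'app.log.1' =&gt; 3600, 'app.log.2' =&gt; 7200);
my @paths;
for my $name (sort keys %age) {
    my $path = "$dir/$name";
    open my $out, '&gt;', $path or die "Can't create $path: $!";
    close $out;
    my $t = time - $age{$name};
    utime $t, $t, $path or die "Can't set mtime on $path: $!";
    push @paths, $path;
}

# Reverse the input first to show that the result does not depend on it.
my @ordered = newest_first(reverse @paths);
print "$_\n" for @ordered;</pre></div></div>

<p>Each file in the resulting list can then be scanned backwards in turn, stopping as soon as the callback returns false.</p>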

]]></content:encoded>
			<wfw:commentRss>http://blog.boyandi.net/2007/11/05/processing-rolling-logfiles-with-perl/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>

