ktmatu
 

One-liners

This page is a collection of one-liners I have found useful. These short command-line recipes have been tested in the tcsh and bash shells on Solaris, but many of them can also be used on Windows.

One-liners to Handle Web Server Log Files

Here it is assumed that the web server log files (access_log*) are in Combined Format.

How to view log files?

Less can easily handle huge files.

Without line wrapping
% less -S access_log
Long lines wrapped
% less access_log

How to view compressed log files?

To save space, log files are usually compressed.

Gzip (GNU zip) compressed file (.gz)
% gzip -dc access_log.gz | less
Bzip2 compressed file (.bz2)
% bzip2 -dc access_log.bz2 | less
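
The compressed and uncompressed variants on this page differ only in how the file is read. As a minimal sketch for Bourne-style shells such as bash (tcsh has no functions), a helper can pick the decompressor from the file extension so that one pipeline serves every case. The name `logcat` and the sample entries are made up for illustration.

```shell
# logcat: write a log file to stdout, decompressing by file extension.
# The name "logcat" is hypothetical, not an existing tool.
logcat() {
    case "$1" in
        *.gz)  gzip -dc "$1" ;;
        *.bz2) bzip2 -dc "$1" ;;
        *)     cat "$1" ;;
    esac
}

# Demo on a throwaway two-line sample log (invented entries).
tmp=$(mktemp -d)
trap 'rm -rf "$tmp"' EXIT
printf '%s\n' \
  'foo.example - - [01/Jan/2001:10:00:00 +0200] "GET /a.html HTTP/1.0" 200 100 "-" "Mozilla"' \
  'bar.example - - [01/Jan/2001:10:00:05 +0200] "GET /b.html HTTP/1.0" 200 200 "-" "Mozilla"' \
  > "$tmp/access_log"
gzip -c "$tmp/access_log" > "$tmp/access_log.gz"

logcat "$tmp/access_log"    | wc -l
logcat "$tmp/access_log.gz" | wc -l
```

Any of the pipelines on this page can then start with `logcat FILE |` regardless of how the file is stored.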

How many lines (hits) are there in the log file?

Uncompressed log file
% wc -l access_log
33894
Gzip compressed log file
% gzip -dc access_log.gz | wc -l 
33894

How many page views?

Uncompressed log file
% egrep -vc '(\.gif |\.jpg |\.png )' access_log
2569
Compressed log file
% gzip -dc access_log.gz | egrep -vc '(\.gif |\.jpg |\.png )'
2569
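
The pattern above expects a space right after the extension, so an image requested with a query string or an upper-case extension is still counted as a page view. A possible refinement, sketched here with awk, looks only at the request path (field 7 in Combined Format), drops any query string and ignores case. The function name `count_page_views` and the sample entries are assumptions for illustration; the extension list is the same as above.

```shell
# Count requests whose path does not end in an image extension.
# With no file arguments the function reads standard input.
count_page_views() {
    awk '{
        path = $7
        sub(/\?.*/, "", path)                      # drop any query string
        if (tolower(path) !~ /\.(gif|jpg|png)$/) n++
    } END { print n + 0 }' "$@"
}

# Demo on a throwaway sample log (invented entries).
tmp=$(mktemp -d)
trap 'rm -rf "$tmp"' EXIT
printf '%s\n' \
  'foo.example - - [01/Jan/2001:10:00:00 +0200] "GET /index.html HTTP/1.0" 200 100 "-" "Mozilla"' \
  'foo.example - - [01/Jan/2001:10:00:01 +0200] "GET /logo.gif HTTP/1.0" 200 200 "-" "Mozilla"' \
  'bar.example - - [01/Jan/2001:10:00:02 +0200] "GET /photo.JPG?size=big HTTP/1.0" 200 300 "-" "Mozilla"' \
  'bar.example - - [01/Jan/2001:10:00:03 +0200] "GET /rates.html HTTP/1.0" 200 400 "-" "Mozilla"' \
  > "$tmp/access_log"

count_page_views "$tmp/access_log"
gzip -c "$tmp/access_log" | gzip -dc | count_page_views
```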

How many hits today?

% grep -c `date '+%d/%b/%Y'` access_log
2569

How many unique visitors today?

% grep `date '+%d/%b/%Y'` access_log | cut -d" " -f1 | sort -u | wc -l
1196
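
The same idea can be extended to every day in the log at once. The sketch below counts distinct hosts (field 1) per date and prints the days in the order they first appear; as above, a "visitor" simply means a distinct host name or IP address. The function name `unique_visitors_per_day` and the sample entries are hypothetical.

```shell
# Print "DD/Mon/YYYY count" for each day, counting distinct hosts.
unique_visitors_per_day() {
    awk '{
        day = substr($4, 2, 11)                 # "[01/Jan/2001:..." -> "01/Jan/2001"
        if (!(day in count)) days[++n] = day    # remember first-seen order
        if (!seen[day, $1]++) count[day]++      # count each host once per day
    } END {
        for (i = 1; i <= n; i++) print days[i], count[days[i]]
    }' "$@"
}

# Demo on a throwaway sample log (invented entries).
tmp=$(mktemp -d)
trap 'rm -rf "$tmp"' EXIT
printf '%s\n' \
  'foo.example - - [01/Jan/2001:10:00:00 +0200] "GET /a.html HTTP/1.0" 200 100 "-" "Mozilla"' \
  'bar.example - - [01/Jan/2001:10:05:00 +0200] "GET /a.html HTTP/1.0" 200 100 "-" "Mozilla"' \
  'foo.example - - [01/Jan/2001:11:00:00 +0200] "GET /b.html HTTP/1.0" 200 100 "-" "Mozilla"' \
  'foo.example - - [02/Jan/2001:09:00:00 +0200] "GET /a.html HTTP/1.0" 200 100 "-" "Mozilla"' \
  > "$tmp/access_log"

unique_visitors_per_day "$tmp/access_log"
```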

How many hits on a particular day?

Uncompressed log file, e.g. January 1, 2001
% grep -c 01/Jan/2001 access_log
2569
Compressed log file, e.g. January 1, 2001
% gzip -dc access_log.gz | grep -c 01/Jan/2001
2569

What period is covered in the log?

Uncompressed log file
% head -1 access_log; tail -1 access_log
foo.example - - [30/Dec/2000:23:55:25 +0200] "GET /~ktmatu/ ...
bar.example - - [06/Jan/2001:23:53:37 +0200] "GET /~ktmatu/rates.html ...
Compressed log file
% gzip -dc access_log.gz | head -1 ; gzip -dc access_log.gz | tail -1
foo.example - - [30/Dec/2000:23:55:25 +0200] "GET /~ktmatu/ ...
bar.example - - [06/Jan/2001:23:53:37 +0200] "GET /~ktmatu/rates.html ...
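
For a large compressed log, decompressing twice can be slow. One alternative, sketched below, is to let sed print the first and last line of a single stream. The function name `period_covered` and the sample entries are made up; note that for a one-line log the line is printed twice, just as with head and tail.

```shell
# Print the first and the last line of the input in one pass.
period_covered() {
    sed -n -e '1p' -e '$p' "$@"
}

# Demo on a throwaway compressed sample log (invented entries).
tmp=$(mktemp -d)
trap 'rm -rf "$tmp"' EXIT
printf '%s\n' \
  'foo.example - - [30/Dec/2000:23:55:25 +0200] "GET /index.html HTTP/1.0" 200 100 "-" "Mozilla"' \
  'baz.example - - [02/Jan/2001:12:00:00 +0200] "GET /index.html HTTP/1.0" 200 100 "-" "Mozilla"' \
  'bar.example - - [06/Jan/2001:23:53:37 +0200] "GET /rates.html HTTP/1.0" 200 100 "-" "Mozilla"' \
  | gzip -c > "$tmp/access_log.gz"

gzip -dc "$tmp/access_log.gz" | period_covered
```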

Are there missing dates?

Uncompressed log file
% cut -d" " -f4 access_log | cut -d"/" -f1 | uniq
[30
[31
[01
[03
[04
[05
[06
Compressed log file
% gzip -dc access_log.gz | cut -d" " -f4 | cut -d"/" -f1 | uniq
[30
[31
[01
[03
[04
[05
[06
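
In the output above the 2nd is missing. A variation that may be easier to read keeps the full date and adds a hit count per day, so that suspiciously thin days stand out as well as missing ones. This is only a sketch; the function name `hits_per_day` and the sample entries are hypothetical.

```shell
# Print the number of hits for each date, in log order.
# With no file arguments the function reads standard input.
hits_per_day() {
    cut -d' ' -f4 "$@" | cut -d: -f1 | tr -d '[' | uniq -c
}

# Demo on a throwaway sample log (invented entries, 02/Jan missing).
tmp=$(mktemp -d)
trap 'rm -rf "$tmp"' EXIT
printf '%s\n' \
  'foo.example - - [30/Dec/2000:23:55:25 +0200] "GET /a.html HTTP/1.0" 200 100 "-" "Mozilla"' \
  'bar.example - - [30/Dec/2000:23:58:00 +0200] "GET /a.html HTTP/1.0" 200 100 "-" "Mozilla"' \
  'foo.example - - [31/Dec/2000:08:00:00 +0200] "GET /a.html HTTP/1.0" 200 100 "-" "Mozilla"' \
  'foo.example - - [01/Jan/2001:08:00:00 +0200] "GET /a.html HTTP/1.0" 200 100 "-" "Mozilla"' \
  'foo.example - - [03/Jan/2001:08:00:00 +0200] "GET /a.html HTTP/1.0" 200 100 "-" "Mozilla"' \
  > "$tmp/access_log"

hits_per_day "$tmp/access_log"
```

Because the month and year are kept, the same day number in two different months is no longer merged into one line.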

How many corrupted log entries?

This is just a very quick and dirty way to check the log.

Uncompressed log file
% perl -ane 'print $_ if (scalar (split /\"/)) != 7' access_log | wc -l
       7
Compressed log file
% gzip -dc access_log.gz | perl -ane 'print $_ if (scalar (split /\"/)) != 7' | wc -l
       7
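
Where Perl is not installed, awk can apply the same quick test: a well-formed Combined Format line splits into seven pieces at the double quotes. The sketch below counts the lines that do not; it should flag the same entries as the Perl version in the common case, but it is equally rough. The function name `count_malformed` and the sample entries are made up.

```shell
# Count lines that do not split into exactly 7 fields at double quotes.
# With no file arguments the function reads standard input.
count_malformed() {
    awk -F'"' 'NF != 7 { n++ } END { print n + 0 }' "$@"
}

# Demo on a throwaway sample log (invented entries, third one truncated).
tmp=$(mktemp -d)
trap 'rm -rf "$tmp"' EXIT
printf '%s\n' \
  'foo.example - - [01/Jan/2001:10:00:00 +0200] "GET /a.html HTTP/1.0" 200 100 "-" "Mozilla"' \
  'bar.example - - [01/Jan/2001:10:00:05 +0200] "GET /b.html HTTP/1.0" 200 200 "-" "Mozilla"' \
  'baz.example - - [01/Jan/2001:10:00:10 +0200] "GET /c.html' \
  > "$tmp/access_log"

count_malformed "$tmp/access_log"
```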

What does line number 15927, or lines 15920 - 15929, look like?

Uncompressed log file
% grep -n '.*' access_log | grep '^15927\:'
15927:foo.example.com - - [20/Jan/2002:11:23:45 +0200] "GET ...
% grep -n '.*' access_log | grep '^1592.\:'
15920:foo.example.com - - [20/Jan/2002:11:23:40 +0200] "GET ...
15921:foo.example.com - - [20/Jan/2002:11:23:41 +0200] "GET ...
15922:foo.example.com - - [20/Jan/2002:11:23:41 +0200] "GET ...
...
Compressed log file
% gzip -dc access_log.gz | grep -n '.*' | grep '^15927\:'
15927:foo.example.com - - [20/Jan/2002:11:23:45 +0200] "GET ...
% gzip -dc access_log.gz | grep -n '.*' | grep '^1592.\:'
15920:foo.example.com - - [20/Jan/2002:11:23:40 +0200] "GET ...
15921:foo.example.com - - [20/Jan/2002:11:23:41 +0200] "GET ...
15922:foo.example.com - - [20/Jan/2002:11:23:41 +0200] "GET ...
...
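
The grep approach numbers every line and can only select ranges that happen to share a prefix. As an alternative sketch, sed can print an arbitrary range by line number and stop reading once the range has been printed (the output then has no line-number prefix). The function name `show_lines` is hypothetical, and the demo uses a generated 20-line file instead of a real log.

```shell
# show_lines FIRST LAST [FILE...]: print lines FIRST..LAST, then quit.
# With no file arguments the function reads standard input.
show_lines() {
    first=$1
    last=$2
    shift 2
    sed -n -e "${first},${last}p" -e "${last}q" "$@"
}

# Demo on a generated 20-line file standing in for a log.
tmp=$(mktemp -d)
trap 'rm -rf "$tmp"' EXIT
awk 'BEGIN { for (i = 1; i <= 20; i++) print "entry " i }' > "$tmp/access_log"
gzip -c "$tmp/access_log" > "$tmp/access_log.gz"

show_lines 7 7 "$tmp/access_log"                 # a single line
gzip -dc "$tmp/access_log.gz" | show_lines 5 9   # a range from a compressed log
```

On the real log the calls would be `show_lines 15927 15927 access_log` and `show_lines 15920 15929 access_log`.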

How to figure out the bandwidth consumption (in bytes)?

Today:
% grep `date '+%d/%b/%Y'` access_log | awk '{ s += $10 } END {print s}'
13113756
This month:
% grep `date '+../%b/%Y'` access_log | awk '{ s += $10 } END {print s}'
569477018
Used by Googlebot:
% grep -i googlebot access_log | awk '{ s += $10 } END {print s}'
29832233
Used by some rogue user from IP-address 169.254.22.12:
% grep '^169\.254\.22\.12 ' access_log | awk '{ s += $10 } END {print s}'
46760880
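
When it is not yet known which host is the heavy user, the per-host totals can be computed in one pass and sorted. The sketch below sums the byte count (field 10) per host (field 1) and lists the ten largest consumers. A "-" in the byte field counts as zero in awk. The function name `top_bandwidth` and the sample entries are assumptions, and the field numbers assume a request line with no extra spaces.

```shell
# Print "bytes host" for the ten hosts that transferred the most data.
# With no file arguments the function reads standard input.
top_bandwidth() {
    awk '{ bytes[$1] += $10 } END { for (h in bytes) print bytes[h], h }' "$@" |
        sort -rn | head -10
}

# Demo on a throwaway sample log (invented entries).
tmp=$(mktemp -d)
trap 'rm -rf "$tmp"' EXIT
printf '%s\n' \
  'foo.example - - [01/Jan/2001:10:00:00 +0200] "GET /a.html HTTP/1.0" 200 100 "-" "Mozilla"' \
  'bar.example - - [01/Jan/2001:10:00:05 +0200] "GET /b.html HTTP/1.0" 200 200 "-" "Mozilla"' \
  'foo.example - - [01/Jan/2001:10:00:09 +0200] "GET /c.html HTTP/1.0" 200 300 "-" "Mozilla"' \
  'baz.example - - [01/Jan/2001:10:00:12 +0200] "GET /a.html HTTP/1.0" 304 - "-" "Mozilla"' \
  > "$tmp/access_log"

top_bandwidth "$tmp/access_log"
```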

How to delete partial GET requests from the log?

Partial content requests (status code 206) are usually generated by download managers, to speed up the downloading of big files, and by Adobe Acrobat Reader, to fetch PDF documents page by page. In this example the 206 requests generated by Acrobat Reader are deleted so that they don't inflate the hit count.

% grep -v '\.pdf .* 206 ' access_log > new_log
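
The grep pattern above also removes a complete (200) PDF response whose byte count happens to be 206, because " 206 " can match the size field too. A stricter sketch compares the status field (field 9) and the request path (field 7) directly. The function name `drop_partial_pdf` and the sample entries are hypothetical, and the field numbers assume a request line with no extra spaces.

```shell
# Print every line except 206 responses for paths ending in .pdf.
# With no file arguments the function reads standard input.
drop_partial_pdf() {
    awk '!($9 == 206 && $7 ~ /\.pdf(\?|$)/)' "$@"
}

# Demo on a throwaway sample log (invented entries).
tmp=$(mktemp -d)
trap 'rm -rf "$tmp"' EXIT
printf '%s\n' \
  'foo.example - - [01/Jan/2001:10:00:00 +0200] "GET /doc.pdf HTTP/1.1" 200 50000 "-" "Mozilla"' \
  'foo.example - - [01/Jan/2001:10:00:01 +0200] "GET /doc.pdf HTTP/1.1" 206 4096 "-" "Mozilla"' \
  'bar.example - - [01/Jan/2001:10:00:02 +0200] "GET /big.zip HTTP/1.1" 206 8192 "-" "Mozilla"' \
  'baz.example - - [01/Jan/2001:10:00:03 +0200] "GET /tiny.pdf HTTP/1.1" 200 206 "-" "Mozilla"' \
  > "$tmp/access_log"

drop_partial_pdf "$tmp/access_log" > "$tmp/new_log"
wc -l < "$tmp/new_log"
```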

How to compress a selected portion from a log?

Use gzip to compress log entries in May 2002
% grep ' \[../May/2002\:' access_log | gzip -9c > access_log-2002-05.gz
Use bzip2 to compress log entries in May 2002
% grep ' \[../May/2002\:' access_log | bzip2 > access_log-2002-05.bz2

How to see in real time how the log file grows?

Using tail
% tail -f access_log
With less, press "F" to follow the file (Ctrl-C stops following, then "q" quits)
% less access_log
 