Let’s assume you have a web application that runs on a cluster of
Apache nodes. Each node generates its own Apache access log from which
you can generate page view statistics with tools such as Webalizer or AWStats.
Obviously you do not want to have page view statistics for each Apache
node, but overall page view statistics. To achieve this, we must merge
the access logs from each node into one overall access log that we can
then feed into Webalizer or AWStats. There is a Perl script called logresolvemerge.pl (part of the AWStats package) that can do this for us.
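To give a rough idea of what the merge step looks like, here is a minimal sketch. It only assumes that logresolvemerge.pl takes the per-node log files as arguments and writes the merged log to standard output. The script path, the helper name merge_logs and the file names in the usage comment are assumptions for illustration; check where your distribution installs the script.

```shell
#!/bin/sh
# Assumed location of the script; it differs between distributions
# and AWStats versions, so adjust it (or set MERGE_TOOL beforehand).
MERGE_TOOL="${MERGE_TOOL:-/usr/share/awstats/tools/logresolvemerge.pl}"

# merge_logs is a hypothetical helper, not part of AWStats.
# $1 = merged output file, remaining arguments = per-node access logs.
merge_logs() {
    out="$1"
    shift
    # The tool must be executable; otherwise call it through perl.
    "$MERGE_TOOL" "$@" > "$out"
}

# Example call (file names are placeholders):
# merge_logs /var/log/apache2/access_merged.log \
#     /var/log/webcluster/node1.log /var/log/webcluster/node2.log
```

The merged file can then be handed to Webalizer or AWStats like any single access log.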
I cannot guarantee that this will work for you!
1 Preliminary Note
I have tested this on a Debian system, but the procedure is the same
on every other distribution except for the package installation. Use
your distribution’s package manager (e.g. apt, yum, yast, urpmi) to
install the packages.
I’m assuming that you use a single host to collect the access logs from
the Apache nodes; typically this is the host where you run Webalizer or
AWStats to generate the statistics. I don’t cover how to transfer the
access logs from the Apache nodes to that host. You could do that with
rsync, for example, as shown in this tutorial: Mirror Your Web Site With
rsync. I’m using the directory /var/log/webcluster here to store the
access logs of the Apache nodes.
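Although the transfer itself is outside the scope of this tutorial, the rsync approach mentioned above could look roughly like the following sketch. The node names, the remote log path and the helper name collect_logs are assumptions for illustration; only /var/log/webcluster comes from this setup. It also assumes that the collecting host can log in to the nodes over SSH.

```shell
#!/bin/sh
# Hypothetical node names; replace them with your own Apache nodes.
NODES="${NODES:-node1.example.com node2.example.com}"
# Directory on the collecting host that holds the per-node logs.
DEST="${DEST:-/var/log/webcluster}"
# Assumed access log path on each node (Debian default for Apache 2).
REMOTE_LOG="${REMOTE_LOG:-/var/log/apache2/access.log}"
# Set RSYNC="echo rsync" to print the commands instead of running them.
RSYNC="${RSYNC:-rsync}"

# collect_logs is a hypothetical helper: it copies each node's access
# log into $DEST under a per-node file name so that nothing is overwritten.
collect_logs() {
    mkdir -p "$DEST" || return 1
    for node in $NODES; do
        $RSYNC -az "${node}:${REMOTE_LOG}" "${DEST}/${node}.access.log"
    done
}

# Example call:
# collect_logs
```

Storing each log under a name derived from its node keeps the files apart until they are merged.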