Perhaps you know the situation: You’re happy not having to add the same six lines to your httpd.conf because you got mod_vhost_alias to work. You feel fine, the sun’s shining, birds are singing, the full monty. Everything seems quite usual except one thing: your awstats are going to burst your system on each run. The more often you cronjob them, the sooner you’ll know about it. Now you realise that letting awstats_updateall parse your 300 MB daily all-in-one-access.log several dozen times per day might not be such a good idea. :-) So you’re looking for a suitable solution. Great. This is why you’re here.
You’ll find the script at the end of this little howto, but I suggest reading the howto before running the script.
The situation
We have a mixed environment consisting of mass vhosts, SSL hosts and Tomcat hosts. Therefore, we have two different kinds of logfiles: a huge all-in-one file for the mass vhosts and several per-vhost files. So far, every awstats host belonging to a mass vhost parses the same big log file, so we’ll have to change every awstats.$HOST.conf to fit into our new situation.
Solution I
Let awstats parse the all-in-one logfile n times, once per host. Possible, but somewhat brutal.
Solution II
You could just use split-logfile. Great! Just change your CustomLog’s format so that every line starts with the vhost name (%v) and adapt the awstats configs accordingly. But if you do the nasty tweaky stuff offered by mod_rewrite, you’ll miss all the host aliases (formerly noted via ServerAlias).
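To make the idea behind Solution II concrete, here is a minimal sketch of what splitting a vhost-prefixed log amounts to. It is not split-logfile itself, just a hypothetical stand-in written with awk; the function name split_by_vhost and the output layout (one <vhost>.log per host in a directory of your choice) are assumptions made for illustration.

```shell
#!/bin/sh
# Hypothetical stand-in for split-logfile: writes each line of a
# vhost-prefixed access log to <outdir>/<vhost>.log, minus the vhost field.
split_by_vhost() {
    logfile=$1
    outdir=$2
    mkdir -p "$outdir" || return 1
    awk -v dir="$outdir" '
        {
            host = tolower($1)
            # skip anything that does not look like a host name
            if (host !~ /^[a-z0-9._-]+$/) next
            line = $0
            sub(/^[^ ]+ /, "", line)   # strip the leading vhost field
            out = dir "/" host ".log"
            print line >> (out)
            close(out)                 # keep the number of open files low
        }
    ' "$logfile"
}
```

Calling split_by_vhost /path/to/access.log /tmp/split would leave one logfile per host name in /tmp/split. Note that every alias a request came in under gets its own file, which is exactly the problem described above.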
Solution III
Use my script. :-)
Here is what you’ll have to do:
- Download the script (I suggest saving it in one of the sbin folders)
- chmod +x awstats_manager.sh
- Edit it with your preferred editor and change the variables to your needs. Keep $TEMPDIR’s value in mind
- Back up your awstats configs (cp -a $awstatsdir $awstatsdir.orig)
- Now fill in SiteDomain, HostAliases and LogFile in each config
The latter has to have the following syntax: $TEMPDIR/merged/$SiteDomain.log, as awstats will find its freshly merged logfile there
If you already have a bunch of mass hosts and don’t want to change them manually, change awstats.model.conf and run the following:
newlog=/tmp/apache2/merged/; for file in awstats.www*; do cp awstats.model.conf "$file"; echo "SiteDomain=\"$(echo "$file" | sed -e 's/^awstats\.//;s/\.conf$//')\"" >> "$file"; echo "HostAliases=\"REGEX[$(echo "$file" | sed -e 's/^awstats\.www\.//;s/\.conf$//')]\"" >> "$file"; echo "LogFile=\"$newlog$(echo "$file" | sed -e 's/^awstats\.//;s/conf$/log/')\"" >> "$file"; done
This will do the following: newlog is the directory your new logfiles live in, SiteDomain will be the host name contained in the awstats file name, and HostAliases will be just the domain without the leading www.
Here’s an example: awstats.www.example.com.conf will set SiteDomain to www.example.com, HostAliases to REGEX[example.com] and LogFile to /tmp/apache2/merged/www.example.com.log
- Done? Great. Now add awstats_manager.sh to your cron and you’re done: ln -s /path/to/awstats_manager.sh /etc/cron.daily/ (on Debian-based systems run-parts skips file names containing a dot, so drop the .sh from the link name there)
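If you prefer something more readable than the one-liner from the steps above, the same config generation could be sketched as a small script. This is only a sketch under the assumptions of this howto: MODEL and NEWLOG default to the example paths used above, and the function names make_config and make_all_configs are made up for illustration. Run it inside your awstats config directory.

```shell
#!/bin/sh
# Sketch: regenerate awstats.www.*.conf files from a model config.
# MODEL and NEWLOG are assumptions taken from the example paths above.
MODEL=${MODEL:-awstats.model.conf}
NEWLOG=${NEWLOG:-/tmp/apache2/merged}

make_config() {
    file=$1
    host=${file#awstats.}     # awstats.www.example.com.conf -> www.example.com.conf
    host=${host%.conf}        # -> www.example.com
    domain=${host#www.}       # -> example.com
    cp "$MODEL" "$file" || return 1
    {
        printf 'SiteDomain="%s"\n' "$host"
        printf 'HostAliases="REGEX[%s]"\n' "$domain"
        printf 'LogFile="%s/%s.log"\n' "$NEWLOG" "$host"
    } >> "$file"
}

make_all_configs() {
    for file in awstats.www.*.conf; do
        [ -e "$file" ] || continue   # the glob matched nothing
        make_config "$file"
    done
}
```

Calling make_all_configs overwrites every existing awstats.www.*.conf with a copy of the model plus the three generated lines, so do the backup first. Like the one-liner, it leaves the dots in REGEX[...] unescaped, which means they match any character; escape them by hand if that bothers you.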
Now that you know what you have to do, you should know what the script does:
- Create the temporary directories and chmod them 0700
- Copy the big log file and split it
- Create a list of awstats configs which get their input from logfiles in this temporary directory
- Fetch SiteDomain and HostAliases for each of these configs
- Check for empty entries and duplicates
- Extract the patterns from the REGEX[...] entries
- Find the corresponding log files (if they exist)
- Run logrotatemerge.pl with all of the found log files
- Run awstats -update for the corresponding host
- Afterwards find all the other awstats hosts and update them as well
- Clean up
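To illustrate the lookup steps in the middle of that list (fetch SiteDomain and HostAliases, pull out the REGEX[...] patterns, find the matching logfiles), here is a rough sketch. It is not the code of awstats_manager.sh; the function names are hypothetical, it assumes config lines of the form Key="value", and it matches patterns with grep -E, which only approximates the Perl regexes awstats itself understands.

```shell
#!/bin/sh
# Sketch of the lookup steps only; this is not the code of awstats_manager.sh.

# Print the value of KEY="value" from an awstats config (last match wins).
conf_value() {
    sed -n 's/^'"$1"'="\(.*\)".*$/\1/p' "$2" | tail -n 1
}

# Print the pattern inside each REGEX[...] entry of a HostAliases value,
# one per line. Plain aliases are ignored here.
alias_patterns() {
    printf '%s\n' "$1" | tr ' ' '\n' | sed -n 's/^REGEX\[\(.*\)]$/\1/p'
}

# List the split logfiles in DIR whose host name (file name minus .log)
# equals SITE or matches one of the newline-separated PATTERNS.
matching_logs() {
    dir=$1
    site=$2
    patterns=$3
    for log in "$dir"/*.log; do
        [ -e "$log" ] || continue
        host=$(basename "$log" .log)
        if [ "$host" = "$site" ] ||
           { [ -n "$patterns" ] && printf '%s\n' "$host" | grep -Eq -- "$patterns"; }; then
            printf '%s\n' "$log"
        fi
    done
}
```

For one config this would be used roughly like site=$(conf_value SiteDomain "$conf"), then pats=$(alias_patterns "$(conf_value HostAliases "$conf")"), then matching_logs "$TEMPDIR/split" "$site" "$pats", where the split directory name is again an assumption. The resulting list of files is what gets handed to the merge step.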
That’s all. Have a lot of fun! Comments are welcome.
Download: awstats_manager.sh