As we mentioned earlier, there are times when the web server goes wild and starts to rapidly log lots of messages to the error_log file. If no one monitors this, it is possible that in a few minutes all free disk space will be consumed and no process will be able to work normally. When this happens, the faulty server process may cause so much I/O that its sibling processes cannot serve requests.

Although this rarely happens, you should try to reduce the risk of it occurring on your server. Run a monitoring program that checks the log file size and, if it detects that the file has grown too large, attempts to restart the server and trim the log file.

Back when we were using quite an old version of mod_perl, we sometimes had bursts of "Callback called exit" errors showing up in our error_log. The file could grow to 300 MB in a few minutes.

Example 5-11 shows a script that should be executed from crontab to handle situations like this. This is an emergency solution, not to be used for routine log rotation. The cron job should run every few minutes or even every minute, because if the site experiences this problem, the log files will grow very rapidly. The example script will rotate when error_log grows over 100K. Note that this script is still useful when the normal scheduled log-rotation facility is working.

Example 5-11.

S=`perl -e 'print -s "/home/httpd/httpd_perl/logs/error_log"'`;
if [ "$S" -gt 100000 ] ; then
    mv /home/httpd/httpd_perl/logs/error_log \
    /etc/rc.d/init.d/httpd restart
    date | /bin/mail -s "error_log $S kB"

Of course, a more advanced script could be written using timestamps and other bells and whistles. This example is just a start, to illustrate a basic solution to the problem in question.

Another solution is to use ready-made tools that are written for this purpose. The daemontools package includes a utility called multilog that saves the STDINstream to one or more log files. It optionally timestamps each line and, for each log, includes or excludes lines matching specified patterns. It automatically rotates logs to limit the amount of disk space used. If the disk fills up, it pauses and tries again, without losing any data.

The obvious caveat is that it does not restart the server, so while it tries to solve the log file-handling issue, it does not deal with the problem's real cause. However, because of the heavy I/O induced by the log writing, other server processes will work very slowly if at all. A normal watchdog is still needed to detect this situation and restart the Apache server.