Now let's get back to this section's main issue, safe resource locking. If you don't make a habit of closing all files that you open, you may encounter many problems (unless you use the Apache::PerlRun handler, which does the cleanup for you). An open file that isn't closed can cause file-descriptor leakage. Since the number of file descriptors available is finite, at some point you will run out of them and your service will fail. This will happen quite fast on a heavily used server.

You can use system utilities to observe the opened and locked files, as well as the processes that have opened (and locked) the files. On FreeBSD, use the fstat utility. On many other Unix flavors, use lsof. On systems with a /proc filesystem, you can see the opened file descriptors under /proc/PID/fd/, where PID is the actual process ID.
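When lsof or fstat is not at hand, the /proc approach can also be scripted. The following is a minimal sketch, assuming a Linux-style /proc layout; the open_descriptors( ) helper is a hypothetical name introduced here for illustration, and on systems without /proc/PID/fd/ it simply returns nothing.

```perl
use strict;
use warnings;

# Return a hash mapping descriptor number => target for one process.
# Assumes a Linux-style /proc layout; returns an empty hash elsewhere.
sub open_descriptors {
    my ($pid) = @_;
    my $dir = "/proc/$pid/fd";
    my %fds;
    opendir my $dh, $dir or return %fds;
    for my $fd (grep { /^\d+$/ } readdir $dh) {
        # Each entry is a symlink to the file, pipe, or socket in use.
        my $target = readlink "$dir/$fd";
        $fds{$fd} = defined $target ? $target : '(unreadable)';
    }
    closedir $dh;
    return %fds;
}

# List the descriptors of the current process ($$ is its PID).
my %fds = open_descriptors($$);
printf "%3d -> %s\n", $_, $fds{$_} for sort { $a <=> $b } keys %fds;
```

Run periodically against a server child's PID, a listing like this that keeps growing is a hint that some code path opens files without closing them.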

However, file-descriptor leakage is nothing compared to the trouble you will give yourself if the code terminates and the file remains locked. Any other process requesting a lock on the same file (or resource) will wait indefinitely for it to become unlocked. Since the lock is released only when the process holding it closes the file or exits, which under mod_perl may not happen until the server is restarted, all processes trying to use this resource will hang.

Example 6-34 shows such a mistake.

Example 6-34. flock.pl

use Fcntl qw(:flock);
open IN, "+>>filename" or die "$!";
flock IN, LOCK_EX;
# do something
# quit without closing and unlocking the file

Is this safe code? No—we forgot to close the file. So let's add the close( ), as in Example 6-35.

Example 6-35. flock2.pl

use Fcntl qw(:flock);
open IN, "+>>filename" or die "$!";
flock IN, LOCK_EX;
# do something
close IN;

Is it safe code now? Unfortunately, it is not. If the user aborts the request (for example, by pressing the browser's Stop or Reload buttons) during the critical section, the script will be aborted before it has had a chance to close( ) the file, which is just as bad as if we forgot to close it.

In fact, if the same process runs the same code again, an open( ) call will close( ) the file first, which will unlock the resource. This is because IN is a global variable. But it's quite possible that the process that created the lock will not serve the same request for a while, since it might be busy serving other requests. During that time, the file will be locked for other processes, making them hang. So relying on the same process to reopen the file is a bad idea.

This problem happens only if you use global variables as file handles. Example 6-36 has the same problem.

Example 6-36. flock3.pl

use Fcntl qw(:flock);
use Symbol ( );
use vars qw($fh);
$fh = Symbol::gensym( );
open $fh, "+>>filename" or die "$!";
flock $fh, LOCK_EX;
# do something
close $fh;

$fh is still a global variable, and therefore the code using it suffers from the same problem.
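The effect can be reproduced outside the server. The sketch below is an illustration under stated assumptions, not code from the examples above: die( ) inside an eval stands in for an aborted request, a second handle on the same file stands in for another process, and LOCK_NB is used so that the failed lock attempt returns immediately instead of hanging. It also assumes a platform with a native flock(2); where Perl emulates flock( ) through fcntl( ), locks belong to the process rather than to the handle, and the first attempt may succeed.

```perl
use strict;
use warnings;
use Fcntl qw(:flock);
use File::Temp qw(tempfile);
use vars qw($fh);            # a global handle, as in Example 6-36

# A scratch file to lock; the handle tempfile() returns is discarded.
my (undef, $path) = tempfile(UNLINK => 1);

# Simulate an aborted request: die() fires inside the critical section,
# so close() is never reached.
eval {
    open $fh, "+>>", $path or die "open: $!";
    flock $fh, LOCK_EX or die "flock: $!";
    die "request aborted\n";
    close $fh;               # never reached
};

# A second, independent handle stands in for another process.
open my $other, "+>>", $path or die "open: $!";
my $got_lock = flock $other, LOCK_EX | LOCK_NB;
print $got_lock ? "lock acquired\n" : "still locked by the global handle\n";

# Closing the global handle explicitly releases the lock.
close $fh;
my $got_after_close = flock $other, LOCK_EX | LOCK_NB;
print $got_after_close ? "lock acquired after close\n" : "still locked\n";
close $other;
```

Because $fh is a package variable, it survives the die( ) and keeps the file open and locked until something closes it or the process exits.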

The simplest solution to this problem is to always use lexically scoped variables (created with my( )). The lexically scoped variable will always go out of scope (assuming that it's not used in a closure, as explained in the beginning of this chapter), whether the script gets aborted before close( ) is called or you simply forgot to close( ) the file. Therefore, if the file was locked, it will be closed and unlocked. Example 6-37 is a good version of the code.

Example 6-37. flock4.pl

use Fcntl qw(:flock);
use Symbol ( );
my $fh = Symbol::gensym( );
open $fh, "+>>filename" or die "$!";
flock $fh, LOCK_EX;
# do something
close $fh;
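To see the lexical handle release its lock, the abort can again be simulated outside the server. This is a minimal sketch under the same assumptions as before: die( ) inside an eval plays the role of the aborted request, and a second handle with LOCK_NB plays the role of another process. The critical_section( ) name and the temporary file are hypothetical and exist only for the demonstration.

```perl
use strict;
use warnings;
use Fcntl qw(:flock);
use File::Temp qw(tempfile);

# A scratch file to lock; the handle tempfile() returns is discarded.
my (undef, $path) = tempfile(UNLINK => 1);

sub critical_section {
    my ($file) = @_;
    open my $fh, "+>>", $file or die "open: $!";
    flock $fh, LOCK_EX or die "flock: $!";
    die "request aborted\n";   # simulate an abort before close()
    close $fh;                 # never reached
}

# $fh goes out of scope as die() unwinds, which closes the file
# and releases the lock.
eval { critical_section($path) };

# A second handle can now take the lock without waiting.
open my $other, "+>>", $path or die "open: $!";
my $got_lock = flock $other, LOCK_EX | LOCK_NB;
print $got_lock ? "lock is free\n" : "lock is still held\n";
close $other;
```

The file name is passed in as an argument rather than picked up from the enclosing scope, so the subroutine does not become a closure over a file-scoped lexical.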

If you use this approach, please don't conclude that you don't have to close files anymore because they are automatically closed for you. Not closing files is bad style and should be avoided.

Note also that Perl 5.6 and later provide Symbol.pm-like functionality as a built-in feature, so you can write:

open my $fh, ">/tmp/foo" or die $!;

and $fh will be automatically vivified as a valid filehandle. If backward compatibility is not a requirement, you no longer need Symbol::gensym or Apache::gensym.

You can also use IO::* modules, such as IO::File or IO::Dir. These are much bigger than the Symbol module (as a matter of fact, these modules use the Symbol module themselves) and are worth using for files or directories only if you are already using them for the other features they provide. Here is an example of their usage:

use IO::File;
use IO::Dir;
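# new() returns undef on failure, so check the result
my $fh = IO::File->new(">filename") or die "Cannot open filename: $!";
my $dh = IO::Dir->new("dirname")    or die "Cannot open dirname: $!";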

Alternatively, there are also the lighter FileHandle and DirHandle modules.

If you still have to use global filehandles, there are a few approaches you can take to clean up in the case of abnormal script termination.

If you are running under Apache::Registry and friends, the END block will perform the cleanup work for you. You can use END in the same way for scripts running under mod_cgi, or in plain Perl scripts. Just add the cleanup code to this block, and you are safe.
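In a plain Perl script, the behavior is easy to check. The sketch below is only an illustration: it runs a tiny child script in a separate interpreter (the child's code and the "cleanup ran" message are made up for the demonstration) and shows that the END block still runs when the script dies. Note that this holds for die( ) and exit( ); a process killed by an uncaught signal does not run its END blocks.

```perl
use strict;
use warnings;

# A tiny child script: it registers an END block, then dies.
my $child = <<'PERL';
END { print "cleanup ran\n" }
die "aborted\n";
PERL

# Run it in a separate perl ($^X is the current interpreter) and
# capture its standard output. The child's die() message goes to STDERR.
open my $pipe, '-|', $^X, '-e', $child or die "cannot run child: $!";
my @output = <$pipe>;
close $pipe;                   # the child exits non-zero because it died

print @output;                 # prints "cleanup ran"
```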

For example, if you work with DBM files, it's important to flush the DBM buffers by calling a sync( ) method:

END {
    # make sure that the DB is flushed
    $dbh->sync( );
}

Under mod_perl, the above code will work only for Apache::Registry and Apache::PerlRun scripts. Otherwise, execution of the END block is postponed until the process terminates. If you write a handler in the mod_perl API, use the register_cleanup( ) method instead. It accepts a reference to a subroutine as an argument. You can rewrite the DBM synchronization code in this way:

$r->register_cleanup(sub { $dbh->sync( ) });

This will work under Apache::Registry as well.

Even better would be to check whether the client connection has been aborted. Otherwise, the cleanup code will always be executed, and for normally terminated scripts, this may not be what you want. To perform this check, use:

$r->register_cleanup(
  # make sure that the DB is flushed
  sub {
      $dbh->sync( ) if Apache->request->connection->aborted( );
  }
);

Or, if using an END block, use:

END {
    # make sure that the DB is flushed
    $dbh->sync( ) if Apache->request->connection->aborted( );
}

Note that if you use register_cleanup( ), it should be called at the beginning of the script or as soon as the variables you want to use in this code become available. If you use it at the end of the script, and the script happens to be aborted before this code is reached, no cleanup will be performed.

For example, CGI.pm registers a cleanup subroutine in its new( ) method:

sub new {
  # code snipped
  if ($MOD_PERL) {
      Apache->request->register_cleanup(\&CGI::_reset_globals);
      undef $NPH;
  }
  # more code snipped
}

Another way to register a section of cleanup code for mod_perl API handlers is to use PerlCleanupHandler in the configuration file:

<Location /foo>
    SetHandler perl-script
    PerlHandler        Apache::MyModule
    PerlCleanupHandler Apache::MyModule::cleanup
    Options ExecCGI
</Location>

Apache::MyModule::cleanup performs the cleanup.
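What such a module might look like is sketched below. This is a hypothetical skeleton, not the actual Apache::MyModule: the response body and the contents of cleanup( ) are placeholders, and OK is hard-coded to 0 (its value in Apache::Constants) only so that the file compiles outside mod_perl. In a real handler you would import OK from Apache::Constants instead.

```perl
package Apache::MyModule;
use strict;
use warnings;

# 0 is the value of OK in Apache::Constants; it is hard-coded here
# only so that this sketch loads outside mod_perl.
use constant OK => 0;

# The response handler named by the PerlHandler directive.
sub handler {
    my $r = shift;
    $r->send_http_header('text/plain');
    $r->print("Hello\n");
    return OK;
}

# Named by PerlCleanupHandler; runs after the response phase,
# whether or not the client aborted the connection.
sub cleanup {
    my $r = shift;
    if ($r->connection->aborted) {
        # placeholder: release locks, flush DBM buffers, and so on
    }
    return OK;
}

1;
```

Because the cleanup phase is configured separately from the response handler, it does not depend on how far handler( ) got before the request ended.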