We'll start by describing various approaches to writing configuration files, and their strengths and weaknesses.

If your configuration file contains only a few variables, it doesn't matter how you write the file. In practice, however, configuration files often grow as a project develops. This is especially true for projects that generate HTML files, since they tend to demand many easily configurable settings, such as the location of headers, footers, templates, colors, and so on.

A common approach used by CGI programmers is to define all configuration variables in a separate file. For example:

$cgi_dir  = '/home/httpd/perl';
$cgi_url  = '/perl';
$docs_dir = '/home/httpd/docs';
$docs_url = '/';
$img_dir  = '/home/httpd/docs/images';
$img_url  = '/images';
# ... many more config params here ...
$color_hint   = '#777777';
$color_warn   = '#990066';
$color_normal = '#000000';

The use strict; pragma demands that all variables be declared. When using these variables in a mod_perl script, we must declare them with use vars in the script, so we start the script with:

use strict;
use vars qw($cgi_dir $cgi_url $docs_dir $docs_url 
            # ... many more config params here ....
            $color_hint  $color_warn $color_normal
           );

It is a nightmare to maintain such a script, especially if not all features have been coded yet—we have to keep adding and removing variable names. Since we're writing clean code, we also start the configuration file with use strict;, so we have to list the variables with use vars here as well—a second list of variables to maintain. Then, as we write many different scripts, we may get name collisions between configuration files.

The solution is to use the power of Perl's packages and assign a unique package name to each configuration file. For example, we might declare the following package name:

package Book::Config0;

Now each configuration file is isolated into its own namespace. But how does the script use these variables? We can no longer just require( ) the file and use the variables, since they now belong to a different package. Instead, we must modify all our scripts to use the configuration variables' fully qualified names (e.g., referring to $Book::Config0::cgi_url instead of just $cgi_url).

You may find typing fully qualified names tedious, or you may have a large repository of legacy scripts that would take a while to update. If so, you'll want to import the required variables into any script that is going to use them. First, the configuration package has to export those variables. This entails listing the names of all the variables in the @EXPORT_OK hash. See Example 6-21.

Example 6-21. Book/Config0.pm

package Book::Config0;
use strict;

BEGIN {
  use Exporter ( );

  @Book::HTML::ISA       = qw(Exporter);
  @Book::HTML::EXPORT    = qw( );
  @Book::HTML::EXPORT_OK = qw($cgi_dir $cgi_url $docs_dir $docs_url
                              # ... many more config params here ....
                              $color_hint $color_warn $color_normal);
}

use vars qw($cgi_dir $cgi_url $docs_dir $docs_url 
            # ... many more config params here ....
            $color_hint  $color_warn $color_normal
           );

$cgi_dir  = '/home/httpd/perl';
$cgi_url  = '/perl';
$docs_dir = '/home/httpd/docs';
$docs_url = '/';
$img_dir  = '/home/httpd/docs/images';
$img_url  = '/images';
# ... many more config params here ...
$color_hint   = "#777777';
$color_warn   = "#990066';
$color_normal = "#000000';

A script that uses this package will start with this code:

use strict;
use Book::Config0 qw($cgi_dir $cgi_url $docs_dir $docs_url 
                     # ... many more config params here ....
                     $color_hint  $color_warn $color_normal
                  );
use vars          qw($cgi_dir $cgi_url $docs_dir $docs_url 
                     # ... many more config params here ....
                     $color_hint  $color_warn $color_normal
                  );

Whoa! We now have to update at least three variable lists when we make a change in naming of the configuration variables. And we have only one script using the configuration file, whereas a real-life application often contains many different scripts.

There's also a performance drawback: exported variables add some memory overhead, and in the context of mod_perl this overhead is multiplied by the number of server processes running.

There are a number of techniques we can use to get rid of these problems. First, variables can be grouped in named groups called tags. The tags are later used as arguments to the import( ) or use( ) calls. You are probably familiar with:

use CGI qw(:standard :html);

We can implement this quite easily, with the help of export_ok_tags( ) from Exporter. For example:

BEGIN {
  use Exporter ( );
  use vars qw( @ISA @EXPORT @EXPORT_OK %EXPORT_TAGS );
  @ISA         = qw(Exporter);
  @EXPORT      = ( );
  @EXPORT_OK   = ( );

  %EXPORT_TAGS = (
      vars => [qw($firstname $surname)],
      subs => [qw(reread_conf untaint_path)],
  );
  Exporter::export_ok_tags('vars');
  Exporter::export_ok_tags('subs');
}

In the script using this configuration, we write:

use Book::Config0 qw(:subs :vars);

Subroutines are exported exactly like variables, since symbols are what are actually being exported. Notice we don't use export_tags( ), as it exports the variables automatically without the user asking for them (this is considered bad style). If a module automatically exports variables with export_tags( ), you can avoid unnecessary imports in your script by using this syntax:

use Book::Config0 ( );

You can also go even further and group tags into other named groups. For example, the :all tag from CGI.pm is a group tag of all other groups. It requires a little more effort to implement, but you can always save time by looking at the solution in CGI.pm's code. It's just a matter of an extra code to expand all the groups recursively.

As the number of variables grows, however, your configuration will become unwieldy. Consider keeping all the variables in a single hash built from references to other scalars, anonymous arrays, and hashes. See Example 6-22.

Example 6-22. Book/Config1.pm

package Book::Config1;
use strict;

BEGIN {
  use Exporter ( );

  @Book::Config1::ISA       = qw(Exporter);
  @Book::Config1::EXPORT    = qw( );
  @Book::Config1::EXPORT_OK = qw(%c);
}

use vars qw(%c);

%c = (
   dir => {
       cgi  => '/home/httpd/perl',
       docs => '/home/httpd/docs',
       img  => '/home/httpd/docs/images',
      },
   url => {
       cgi  => '/perl',
       docs => '/',
       img  => '/images',
      },
   color => {
         hint   => '#777777',
         warn   => '#990066',
         normal => '#000000',
        },
  );

Good Perl style suggests keeping a comma at the end of each list. This makes it easy to add new items at the end of a list.

Our script now looks like this:

use strict;
use Book::Config1 qw(%c);
use vars          qw(%c);
print "Content-type: text/plain\n\n";
print "My url docs root: $c{url}{docs}\n";

The whole mess is gone. Now there is only one variable to worry about.

The one small downside to this approach is auto-vivification. For example, if we write $c{url}{doc} by mistake, Perl will silently create this element for us with the value undef. When we use strict;, Perl will tell us about any misspelling of this kind for a simple scalar, but this check is not performed for hash elements. This puts the onus of responsibility back on us, since we must take greater care.

The benefits of the hash approach are significant. Let's make it even better by getting rid of the Exporterstuff completely, removing all the exporting code from the configuration file. See Example 6-23.

Example 6-23. Book/Config2.pm

package Book::Config2;
use strict;
use vars qw(%c);

%c = (
   dir => {
       cgi  => '/home/httpd/perl',
       docs => '/home/httpd/docs',
       img  => '/home/httpd/docs/images',
      },
   url => {
       cgi  => '/perl',
       docs => '/',
       img  => '/images',
      },
   color => {
         hint   => '#777777',
         warn   => '#990066',
         normal => '#000000',
        },
  );

Our script is modified to use fully qualified names for the configuration variables it uses:

use strict;
use Book::Config2 ( );
print "Content-type: text/plain\n\n";
print "My url docs root: $Book::Config2::c{url}{docs}\n";

To save typing and spare the need to use fully qualified variable names, we'll use a magical Perl feature to alias the configuration variable to a script's variable:

use strict;
use Book::Config2 ( );
use vars qw(%c);
*c = \%Book::Config2::c;
print "Content-type: text/plain\n\n";
print "My url docs root: $c{url}{docs}\n";

We've aliased the *c glob with a reference to the configuration hash. From now on, %Book::Config2::c and %c refer to the same hash for all practical purposes.

One last point: often, redundancy is introduced in configuration variables. Consider:

$cgi_dir  = '/home/httpd/perl';
$docs_dir = '/home/httpd/docs';
$img_dir  = '/home/httpd/docs/images';

It's obvious that the base path /home/httpd should be moved to a separate variable, so only that variable needs to be changed if the application is moved to another location on the filesystem.

$base     = '/home/httpd';
$cgi_dir  = "$base/perl";
$docs_dir = "$base/docs";
$img_dir  = "$docs_dir/images";

This cannot be done with a hash, since we cannot refer to its values before the definition is completed. That is, this will not work:

%c = (
   base => '/home/httpd',
   dir => {
       cgi  => "$c{base}/perl",
       docs => "$c{base}/docs",
       img  => "$c{base}{docs}/images",
      },
  );

But nothing stops us from adding additional variables that are lexically scoped with my( ). The following code is correct:

my $base = '/home/httpd';
%c = (
   dir => {
       cgi  => "$base/perl",
       docs => "$base/docs",
       img  => "$base/docs/images",
      },
  );

We've learned how to write configuration files that are easy to maintain, and how to save memory by avoiding importing variables in each script's namespace. Now let's look at reloading those files.