Let's talk about passing variables to a subroutine. There are two ways to do this: you can pass a copy of the variable to the subroutine (this is called passing by value) or you can instead pass a reference to it (a reference is just a pointer, so the variable itself is not copied). Other things being equal, if the copy of the variable is larger than a pointer to it, it will be more efficient to pass a reference.
Let's use the example from the previous section, assuming we have no choice but to read the whole file before any data processing takes place and its size is 5 MB. Suppose you have some subroutine called process( ) that processes the data and returns it. Now say you pass $content by value and process( ) makes a copy of it in the familiar way:
    my $content = qq{foobarfoobar};
    $content = process($content);

    sub process {
        my $content = shift;
        $content =~ s/foo/bar/gs;
        return $content;
    }
You have just copied another 5 MB, and the child has grown in size by another 5 MB. Assuming 20 Apache children, you can multiply this growth again by a factor of 20: each child now holds 10 MB for this one string (the original plus the copy), so 200 MB of RAM is tied up, half of it in needless copies. This memory will eventually be reused, but it's still a waste. Whenever you think the variable may grow bigger than a few kilobytes, definitely pass it by reference.
There are several forms of syntax you can use to pass and use variables passed by reference. For example:
    my $content = qq{foobarfoobar};
    process(\$content);

    sub process {
        my $r_content = shift;
        $$r_content =~ s/foo/bar/gs;
    }
Here $content is populated with some data and then passed by reference to the subroutine process( ), which replaces all occurrences of the string foo with the string bar. process( ) doesn't have to return anything—the variable $content was modified directly, since process( ) took a reference to it.
If hashes or arrays are passed by reference, their individual elements are still directly accessible. You don't need to dereference the whole structure first; the arrow operator reaches a single element through the reference:
    $var_lr->[$index]    # get the $index'th element of an array via a ref
    $var_hr->{$key}      # get the value stored under $key in a hash via a ref
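As a minimal sketch of the arrow syntax in use, the following passes an array and a hash by reference to a subroutine. The helper name summarize( ) and the sample data are hypothetical, chosen only for illustration:

```perl
use strict;
use warnings;

# Hypothetical helper: sums an array and counts hash keys through references.
sub summarize {
    my ($var_lr, $var_hr) = @_;    # two references; the array and hash are not copied

    my $total = 0;
    $total += $var_lr->[$_] for 0 .. $#{$var_lr};    # element access via ->
    my $keys = scalar keys %$var_hr;                 # key count before we add one

    $var_hr->{seen} = 1;           # modifies the caller's hash, not a copy
    return ($total, $keys);
}

my @numbers = (1, 2, 3);
my %config  = (host => 'localhost', port => 80);

my ($total, $keys) = summarize(\@numbers, \%config);
print "total=$total keys=$keys seen=$config{seen}\n";
```

Because only the two references are copied into the subroutine, the cost of the call does not depend on how large the array or the hash is.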
Note that if you pass the variable by reference but then dereference it to copy it to a new string, you don't gain anything, since a new chunk of memory will be acquired to make a copy of the original variable. The perlref manpage provides extensive information about working with references.
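The pitfall described above can be sketched as follows. The subroutine name process_with_copy( ) is hypothetical; the point is that the assignment from $$r_content allocates a second full-size string even though the argument arrived as a reference:

```perl
use strict;
use warnings;

my $content = 'foobarfoobar';

# Hypothetical variant of process( ) that defeats the purpose of the reference.
sub process_with_copy {
    my $r_content = shift;
    my $copy = $$r_content;    # full copy: memory for the whole string is acquired again
    $copy =~ s/foo/bar/g;
    return $copy;
}

my $result = process_with_copy(\$content);
print "original: $content\n";    # unchanged, because only the copy was modified
print "result:   $result\n";
```

If the goal is to save memory, operate on $$r_content directly, as in the earlier example.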
Another approach is to use the @_ array directly. Internally, Perl always passes arguments by reference: the elements of @_ are aliases for the caller's variables, and a copy is made only when you copy them out of @_ (for example, with shift or a list assignment). This is an efficiency mechanism that lets you write a subroutine that appears to take its argument by value, without copying it:
    process($content);

    sub process {
        $_[0] =~ s/foo/bar/gs;
    }
From perldoc perlsub:
The array @_ is a local array, but its elements are aliases for the actual scalar parameters. In particular, if an element $_[0] is updated, the corresponding argument is updated (or an error occurs if it is not possible to update)...
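One way to see the aliasing for yourself is to compare addresses. In this small sketch (the subroutine names are hypothetical), a reference taken to $_[0] points at the caller's own scalar, while a reference taken after shifting the argument into a lexical points at a fresh copy:

```perl
use strict;
use warnings;

sub ref_to_alias { return \$_[0] }                      # reference to the caller's scalar
sub ref_to_copy  { my $arg = shift; return \$arg }      # reference to a new, copied scalar

my $content = 'foobarfoobar';

# References compare numerically by address.
my $alias_is_original = (ref_to_alias($content) == \$content) ? 1 : 0;
my $copy_is_original  = (ref_to_copy($content)  == \$content) ? 1 : 0;

print "alias is original: $alias_is_original\n";
print "copy is original:  $copy_is_original\n";
```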
Be careful when you write this kind of subroutine for use by someone else; it can be confusing. It's not obvious that a call like process($content); modifies the passed variable. Programmers (the users of your library, in this case) are used to subroutines that either modify variables passed by reference or expressly return a result, like this:
$content = process($content);
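You should also be aware that if the user passes a read-only value, such as a string literal, a subroutine that modifies $_[0] won't work and you will get a runtime error. Perl refuses to modify a read-only value and dies with the message "Modification of a read-only value attempted":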
$content = process("string foo");
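If you want to see the failure without killing the program, you can trap it with eval. This is only a sketch: it reuses the @_-based process( ) from above, and the variable names $ok and $error are introduced here for illustration:

```perl
use strict;
use warnings;

sub process {
    $_[0] =~ s/foo/bar/g;    # modifies the caller's argument through the @_ alias
}

my $content = 'string foo';
process($content);           # fine: $content is an ordinary, modifiable variable

# A string literal is read-only, so the substitution dies; eval traps the error.
my $ok    = eval { process('string foo'); 1 };
my $error = $ok ? '' : $@;

print "content: $content\n";
print "error:   $error";
```

A subroutine that copies its argument with shift and returns the result does not have this problem, which is one more reason to prefer that form for code used by others.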
 