I've been working on a project that requires a lot of INSERTs, one for every pageview plus a few UPDATEs of summary tables. Since we want this thing to scale up to millions of pageviews a day, I was concerned about all those INSERTs slowing down the client experience. But, since the client doesn't need up-to-the-second data, I knew the problem could be solved by some kind of queueing.

First I thought of writing all the SQL queries to a file, then running a cron job against it every so often. But this approach has a couple of downsides. The data in the DB would always be somewhat stale, even when server load was light, and it would require rewriting some of my existing, proven code.

A better approach is to fork a process and let the OS handle scheduling. Instead of waiting for the DB, the executing web script forks a process to perform the slow INSERT and returns early to the client: fast, smooth, and little reprogramming.

In PHP, the documented way to fork is pcntl_fork. But few people can use it because the PCNTL extension is not compiled into PHP by default. Thankfully, someone found a trick that accomplishes much the same thing using the command shell and the PHP command-line interpreter. I like to think of it as poor man's forking. Here it is.


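function poor_mans_fork($path, $args = array()) {
    // Build a " --key=value" string, shell-escaping every key and value.
    $arg_string = '';
    foreach ($args as $key => $value) {
        $key = escapeshellarg($key);
        $value = escapeshellarg($value);
        $arg_string .= " --$key=$value";
    }

    $cmd = "/usr/bin/php " . escapeshellarg(realpath($path)) . $arg_string;
    // Discard all output and background the command so exec() returns at once.
    exec($cmd . " > /dev/null 2>&1 &");
}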
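
To see what the shell actually receives, here is a minimal sketch that builds the intended command (each argument separated by a space) and returns it instead of running it. The helper name build_fork_command and the path /var/www/record.php are made up for illustration. realpath() is left out because the example path need not exist, and the redirect and trailing & are omitted so only the command itself is shown.

```php
<?php
// Hypothetical helper, not part of the original script: returns the
// command that would be handed to the shell, without executing it.
function build_fork_command($path, $args = array()) {
    $arg_string = '';
    foreach ($args as $key => $value) {
        // Escape key and value separately; the shell joins the quoted
        // pieces back into a single --key=value word.
        $arg_string .= ' --' . escapeshellarg($key) . '=' . escapeshellarg($value);
    }
    return '/usr/bin/php ' . escapeshellarg($path) . $arg_string;
}

echo build_fork_command('/var/www/record.php', array('page' => '/home')), "\n";
```

On a Unix-like system this should print /usr/bin/php '/var/www/record.php' --'page'='/home', which the shell passes to PHP as the single argument --page=/home.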

Previously my recording script took all its input in the $_REQUEST variable. To enable it to operate as a forked process, all I needed to do was assign the command-line input contained in $_SERVER['argv'] to $_REQUEST, and provide a please-fork boolean.

I put this code at the top of my PHP script, before the INSERT takes place.

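if (!empty($_REQUEST['fork'])) {
    // Web request asked for a fork: hand the work to a background copy and return.
    unset($_REQUEST['fork']);
    poor_mans_fork(__FILE__, $_REQUEST);
    die();
}
elseif (PHP_SAPI === 'cli') {
    // Forked copy running from the command line: rebuild $_REQUEST from argv.
    $_REQUEST = extract_args($_SERVER['argv']);
}
// continue normal execution...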
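
The extract_args function is not shown above, so here is one possible sketch of it, assuming the forked copy receives its arguments in the --key=value form that poor_mans_fork produces. This is an illustration, not the original implementation.

```php
<?php
// Hypothetical implementation of extract_args(): turns
// array('record.php', '--page=/home', '--id=7') into
// array('page' => '/home', 'id' => '7').
function extract_args($argv) {
    $args = array();
    // Skip $argv[0], which is the script name.
    foreach (array_slice($argv, 1) as $arg) {
        // Accept only --key=value; anything else is ignored.
        if (strncmp($arg, '--', 2) === 0 && strpos($arg, '=') !== false) {
            // Split on the first '=' only, so values may contain '='.
            list($key, $value) = explode('=', substr($arg, 2), 2);
            $args[$key] = $value;
        }
    }
    return $args;
}
```

Every value comes back as a string, which matches how $_REQUEST behaves for ordinary web input, so the rest of the script should not need to care which way it was started.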

Now when the web script is called, it does what is needed by the client, forks an instance of itself to take care of the rest, and returns promptly to the client. Meanwhile the forked instance happily completes on its own time.

 
First post 05/13/2008
 

In this blog, I'll try to share the small nuggets of wisdom I've managed to wrangle from my daily toils. I'll polish and shine them so you don't ever need to think about the mountains of dirt they were buried under.