How to use lock files

The problem

Sometimes, you need to coordinate the execution of several programs. For example, you might have two programs that access a shared resource, and you don't want them stepping on each other's toes. Or you might have a program that runs periodically, e.g. from a crontab entry such as
    * * * * * my-program
and you don't want the next instance to start before the current instance has finished.

Lock files

One way to solve this problem is with lock files.

A lock file is an ordinary file. The program opens the file and then calls flock on it.

flock is an operating system facility. It marks that file as being locked by that program. If another program tries to flock the same file, the flock call will (at the caller's option) either block until the lock becomes available or fail immediately. So only one program at a time can hold the lock. This gives programs a way to stay out of each other's way.

flock is a simple facility, but it is sufficient for many applications.

Implementation

Here is Perl code that acquires a lock:
use Fcntl qw(:DEFAULT :flock);

sub Lock
{
    my $lock = "/var/lock/my-application";
    sysopen LOCK, $lock, O_CREAT or die "Can't sysopen $lock: $!";
    flock LOCK, LOCK_EX or die "Can't flock $lock: $!";
}
When you use flock, you have to make some implementation decisions.

The lock file

The lock file can be anywhere. The standard (and preferred) place is under /var/lock/. This directory exists on most Linux systems for the express purpose of holding lock files. If you anticipate creating multiple lock files, consider naming the lock files after the resources that they protect, and then organizing them into subdirectories named after applications and/or organizations:
    my $dir  = "/var/lock/my-application";
    # or, grouped by organization (mkdir needs the parent directory to exist already):
    # my $dir  = "/var/lock/my-organization/my-application";
    -d $dir or mkdir $dir or die "Can't mkdir $dir: $!\n";

    my $lock = "$dir/my-resource";
    ...
The contents of the lock file are irrelevant; lock files are typically empty.
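
When the lock directory is nested, a plain mkdir fails if the parent directory is missing. One way around this, sketched here, is make_path from the core File::Path module, which creates any missing parents. The helper name lock_path and its parameters are assumptions made for illustration; they are not part of the code above.

```perl
use strict;
use warnings;
use File::Path qw(make_path);

# Hypothetical helper: build "<base>/<organization>/<application>/<resource>"
# and make sure the directory part exists.
sub lock_path
{
    my ($base, $org, $app, $resource) = @_;

    my $dir = "$base/$org/$app";

    # make_path creates missing parent directories as needed.
    # Without an error handler it croaks if a directory can't be created.
    -d $dir or make_path($dir);
    -d $dir or die "Can't create $dir\n";

    return "$dir/$resource";
}
```

A caller might write lock_path("/var/lock", "my-organization", "my-application", "my-resource") and pass the result to sysopen.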

sysopen

Open the lock file with sysopen. The O_CREAT flag causes the file to be created if it doesn't already exist. sysopen is preferred to open, because it can create a missing file without asking for write access: with O_CREAT alone, the filehandle is opened read-only, since O_RDONLY is the default access mode. This reduces the chance that your program will fail because it lacks permissions on the lock file. If the sysopen call does fail, you have a fatal error.
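
As a sketch of this step in isolation, the helper below opens the lock file with a lexical filehandle and an explicit creation mode. The name open_lock_file and the 0644 mode are assumptions for illustration, not something the code above prescribes.

```perl
use strict;
use warnings;
use Fcntl qw(:DEFAULT);

# Hypothetical helper: open (creating if necessary) a lock file and
# return the filehandle. The 0644 creation mode is an assumption, and
# it is still subject to the process umask.
sub open_lock_file
{
    my ($path) = @_;

    # O_RDONLY | O_CREAT: create the file if it is missing, but ask
    # only for read access on the resulting handle.
    sysopen(my $fh, $path, O_RDONLY | O_CREAT, 0644)
        or die "Can't sysopen $path: $!";

    return $fh;
}
```

The caller has to keep the returned handle for as long as it wants to hold a lock on it. Closing the handle releases the lock, and a lexical handle is closed implicitly when the last reference to it goes away.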

flock

The flock call acquires the lock. As shown above, it blocks until the lock is available. If you don't want to block, write:
    flock LOCK, LOCK_EX | LOCK_NB
Then flock will return true if it acquired the lock, and false if it did not. It is up to the caller to handle both cases appropriately.
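
A minimal sketch of handling both cases follows. The helper name try_lock is hypothetical, and the check of $!{EWOULDBLOCK} is one possible way to tell "someone else holds the lock" apart from a genuine error.

```perl
use strict;
use warnings;
use Fcntl qw(:DEFAULT :flock);

# Hypothetical helper: try to take an exclusive lock without blocking.
# Returns the locked filehandle on success, or undef if some other
# holder currently has the lock. Dies on any other error.
sub try_lock
{
    my ($path) = @_;

    sysopen(my $fh, $path, O_RDONLY | O_CREAT)
        or die "Can't sysopen $path: $!";

    return $fh if flock($fh, LOCK_EX | LOCK_NB);

    # EWOULDBLOCK means the lock is held elsewhere; anything else is
    # a real failure.
    return if $!{EWOULDBLOCK};
    die "Can't flock $path: $!";
}
```

A program started from cron could then begin with something like my $lock = try_lock("/var/lock/my-application") or exit 0; and keep $lock in scope until it exits.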

unlocking

It is considered good general programming practice for applications to release whatever resources they acquire. This leads some programmers to release locks at the end of program execution with code like:
    flock LOCK, LOCK_UN or die "Can't release flock on $lock: $!";
    close LOCK;
However, this is often unnecessary. A program only needs to release a lock if it wants to grant other programs access to the protected resource while the program that released the lock continues to run.

If a program needs exclusive access to the shared resource for the duration of its own execution, then it can simply acquire the lock and forget about it. When the program exits, the OS automatically releases all locks acquired by the program and closes all files opened by the program. Code that releases locks immediately before program exit is superfluous if it works correctly, and (obviously) bad if it doesn't.

Recommended best practice is for programs to release locks only if there is some performance benefit to be gained, for example, by allowing other programs access to shared resources sooner than they would otherwise have it. In particular, a program that wants to prevent more than one instance of itself from running at a time need not (and probably should not) release its own locks.
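
For the case where releasing early does pay off, here is a hedged sketch: hold the lock only around the part of the work that touches the shared resource, then let go and carry on. The helper name with_lock and its callback interface are assumptions for illustration.

```perl
use strict;
use warnings;
use Fcntl qw(:DEFAULT :flock);

# Hypothetical helper: run $critical while holding an exclusive lock on
# $path, then release the lock so that other programs can proceed while
# this one keeps running.
sub with_lock
{
    my ($path, $critical) = @_;

    sysopen(my $fh, $path, O_RDONLY | O_CREAT)
        or die "Can't sysopen $path: $!";
    flock($fh, LOCK_EX)
        or die "Can't flock $path: $!";

    # If $critical dies, $fh goes out of scope and is closed, which
    # releases the lock as well.
    my $result = $critical->();

    # Closing the handle releases the lock. The lock file itself is
    # deliberately left in place.
    close($fh) or die "Can't close $path: $!";

    return $result;
}
```

The callback's return value is passed back in scalar context, which keeps the sketch short; a fuller version might preserve the caller's context.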

DON'T unlink

This same impulse to release resources leads some programmers to unlink the lock file.

NEVER DO THIS

unlinking the lock file creates the very race conditions that flock is designed to prevent. Observe:

program  operation                                   result                                notes
A        creates the lock file
A        opens the lock file
A        calls flock                                 acquires the lock
B        opens the lock file
B        calls flock                                 blocks
A        releases the lock
B                                                    acquires the lock
A        unlinks the lock file                                                             BAD
C        creates a new lock file with the same name
C        opens the new lock file
C        calls flock                                 acquires a lock on the new lock file

Programs B and C have now acquired locks on two different lock files. They are running at the same time against the shared resource, and are liable to corrupt it.

Note that the time window for this race to occur is substantial. All that is necessary is that program B call flock while program A holds the lock, and program C call flock while program B holds the lock. In practice, those two time intervals could be many seconds long.
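
The end state of this race can be reproduced in a single process, as in the sketch below. It assumes a system where Perl's flock is backed by the native flock call, so that separately opened handles are locked independently. The sub names and the temporary directory are hypothetical stand-ins: two handles play the parts of programs B and C, and the unlink plays the part of program A.

```perl
use strict;
use warnings;
use Fcntl qw(:DEFAULT :flock);
use File::Temp qw(tempdir);

# Hypothetical helper: open the lock file, creating it if necessary.
sub open_lock
{
    my ($path) = @_;
    sysopen(my $fh, $path, O_RDONLY | O_CREAT)
        or die "Can't sysopen $path: $!";
    return $fh;
}

# Demonstration only. Returns two flags: whether "program C" got its
# lock, and whether B and C ended up on different files.
sub unlink_race_demo
{
    my $dir  = tempdir(CLEANUP => 1);
    my $lock = "$dir/my-resource";

    # "Program B" opens the lock file and acquires the lock.
    my $fh_b = open_lock($lock);
    flock($fh_b, LOCK_EX | LOCK_NB) or die "B can't flock: $!";

    # "Program A" unlinks the lock file on its way out.
    unlink($lock) or die "Can't unlink $lock: $!";

    # "Program C" finds no file, creates a new one with the same name,
    # and locks it. This succeeds even though B still holds its lock.
    my $fh_c = open_lock($lock);
    my $c_locked = flock($fh_c, LOCK_EX | LOCK_NB) ? 1 : 0;

    # Compare inode numbers: the two handles refer to different files.
    my $ino_b = (stat $fh_b)[1];
    my $ino_c = (stat $fh_c)[1];

    return ($c_locked, $ino_b != $ino_c ? 1 : 0);
}
```

Both flags come back true in this scenario: two exclusive locks are held at once, each on a different file that happened to have the same name.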


Steven W. McDougall / resume / swmcd@theworld.com / 2010 Dec 26