[Linux-cluster] flock behavior different between GFS and EXT3

Matthew B. Brookover mbrookov at mines.edu
Sun Aug 12 16:55:32 UTC 2007


I am attempting to move a program from an EXT3 file system to a GFS
file system.  The program uses flock to serialize access between
processes.  On an EXT3 file system I can get an exclusive lock on a
file, make some change to the file, then get a shared lock without
losing the lock.  On GFS, when the program tries to demote the
exclusive lock to a shared lock, the lock is freed, allowing another
process to step in and take the lock.

Is there a way to get flock on GFS to behave the way it does on the EXT3
file system?
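
For readers without the attachments, the sequence described above can
be sketched roughly as follows.  This is a minimal sketch and not the
attached flock_EX_SH.c: the helper name update_then_demote is
hypothetical, and the use of LOCK_NB on the demotion is an assumption
inferred from the "Resource temporarily unavailable" error shown in
the GFS output.

```c
#include <fcntl.h>      /* open(), for callers that create the descriptor */
#include <sys/file.h>   /* flock(), LOCK_EX, LOCK_SH, LOCK_NB, LOCK_UN */
#include <sys/types.h>
#include <unistd.h>     /* write() */

/*
 * Hypothetical helper (not from the attached sources).
 * Takes an exclusive lock on fd, writes len bytes from data, then asks
 * flock to convert the lock to shared without blocking.
 * Returns 0 if the shared lock is held on return, -1 otherwise with
 * errno set by the failing call.
 */
int update_then_demote(int fd, const char *data, size_t len)
{
    /* Block until this process holds the exclusive lock. */
    if (flock(fd, LOCK_EX) == -1)
        return -1;

    /* "Make some change to the file" while holding the exclusive lock. */
    if (write(fd, data, len) != (ssize_t)len) {
        flock(fd, LOCK_UN);
        return -1;
    }

    /* Assumption: the demotion is requested with LOCK_NB, so a lost
     * lock shows up as EWOULDBLOCK instead of a blocked process. */
    if (flock(fd, LOCK_SH | LOCK_NB) == -1)
        return -1;

    return 0;
}
```

A caller would pass a descriptor obtained from open() and release the
shared lock later with flock(fd, LOCK_UN).  The Linux flock(2) manual
page states that converting a lock is not guaranteed to be atomic: the
existing lock is removed first and the new one established afterwards,
so a pending request from another process may be granted in between.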

I have attached sample C source code and here are instructions to
demonstrate this issue.

My cluster is running GFS 6.1, RHEL 4 update 5 with all of the patches.

Compile both programs: 

[mbrookov at imagine locktest]$ cc -o flock_EX_SH flock_EX_SH.c 
[mbrookov at imagine locktest]$ cc -o flockwritelock flockwritelock.c
[mbrookov at imagine locktest]$ 

EXT3 test:

Start up xterm twice and cd to the directory where you compiled the 2
programs.  On my system, /tmp is an EXT3 file system.

In the first xterm, run 'flock_EX_SH /tmp/bar'  and hit return.  In the
second xterm, run 'flockwritelock /tmp/bar' and hit return.  The
flockwritelock process will block waiting for an exclusive lock on the
file /tmp/bar.

On the first xterm, hit return.  The flock_EX_SH process will attempt
to demote the exclusive lock to a shared lock and display a prompt.  The
flockwritelock process on the second xterm will stay blocked.

In the first xterm, hit return again.  The flock_EX_SH process will free
the lock, close the file and exit.  The flockwritelock process will then
receive the exclusive lock on /tmp/bar and display a prompt.  Hit return
in the second xterm to get flockwritelock to close the file and exit.

Output on first xterm: 

[mbrookov at imagine locktest]$ ./flock_EX_SH /tmp/bar
Have exclusive lock, hit return to free write lock on /tmp/bar and exit

Attempt to demote lock on /tmp/bar to shared lock
Have shared lock, hit return to free lock on /tmp/bar and exit

[mbrookov at imagine locktest]$ 

Output on second xterm: 

[mbrookov at imagine locktest]$ ./flockwritelock /tmp/bar
Have write lock, hit return to free write lock on /tmp/bar and exit

[mbrookov at imagine locktest]$




GFS test:

Start up xterm twice and cd to the directory where you compiled the 2
programs.  On my system, the locktest directory is on a GFS file system.

In the first xterm, run 'flock_EX_SH bar'  and hit return.  In the
second xterm, run 'flockwritelock bar' and hit return.  The
flockwritelock process will block waiting for an exclusive lock on the
file bar.

On the first xterm, hit return.  The flock_EX_SH process will attempt to
demote the exclusive lock on bar to a shared lock but will fail, because
the system call to flock frees the lock, allowing the flockwritelock
process to get an exclusive lock.  The flock_EX_SH process will exit.

Hit return on the second xterm.  flockwritelock will close bar and exit.

Output on first xterm: 

[mbrookov at imagine locktest]$ ./flock_EX_SH bar
Have exclusive lock, hit return to free write lock on bar and exit

Attempt to demote lock on bar to shared lock
Could not demote to shared lock on file bar, Resource temporarily unavailable
[mbrookov at imagine locktest]$ 

Output on second xterm: 

[mbrookov at imagine locktest]$ ./flockwritelock bar
Have write lock, hit return to free write lock on bar and exit

[mbrookov at imagine locktest]$ 

The results for flock on GFS are the same whether you run the two
programs on the same node or on 2 different nodes.  The individual lock
operations (shared, exclusive, blocking, non-blocking) work correctly
on both file systems.  The problem is limited to the case where GFS
frees the exclusive lock and returns an error instead of demoting the
exclusive lock to a shared lock.
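
The two-xterm procedure can also be approximated in a single program,
which may be handy for repeating the test on different file systems.
The following is only a sketch under stated assumptions: the function
name demotion_survives_contention is hypothetical, a forked child
stands in for flockwritelock, and the one-second sleep is a crude way
to let the child block on its lock request before the parent demotes.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/file.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical reproducer (not from the attached sources).
 * Returns 1 if the non-blocking demotion succeeded while another
 * process was waiting for an exclusive lock, 0 if it failed with
 * EWOULDBLOCK, and -1 on any other error.
 */
int demotion_survives_contention(const char *path)
{
    int fd = open(path, O_RDWR | O_CREAT, 0644);
    if (fd == -1)
        return -1;
    if (flock(fd, LOCK_EX) == -1) {
        close(fd);
        return -1;
    }

    pid_t pid = fork();
    if (pid == -1) {
        flock(fd, LOCK_UN);
        close(fd);
        return -1;
    }
    if (pid == 0) {
        /* Child: drop the inherited descriptor and open the file again.
         * The new open file description competes with the parent's
         * lock, as a separate flockwritelock process would. */
        close(fd);
        int fd2 = open(path, O_RDWR);
        if (fd2 == -1 || flock(fd2, LOCK_EX) == -1)
            _exit(2);
        flock(fd2, LOCK_UN);
        _exit(0);
    }

    /* Assumption: one second is enough for the child to block. */
    sleep(1);

    int result;
    if (flock(fd, LOCK_SH | LOCK_NB) == 0)
        result = 1;                 /* lock was kept across the demotion */
    else if (errno == EWOULDBLOCK)
        result = 0;                 /* lock was lost to the waiter */
    else
        result = -1;

    /* Release whatever is still held so the child can finish. */
    flock(fd, LOCK_UN);
    close(fd);
    waitpid(pid, NULL, 0);
    return result;
}
```

Calling it with a path on the file system under test and printing the
return value gives a quick indication of which behavior that file
system shows.  A single run is not proof either way, since the outcome
depends on timing as well as on the file system.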

The program depends on the EXT3 flock behavior: the exclusive lock can
be demoted to a shared lock without the possibility that another
process, blocked waiting for an exclusive lock, receives the lock.

Thank you

Matt
mbrookov at mines.edu


-------------- next part --------------
A non-text attachment was scrubbed...
Name: flock_EX_SH.c
Type: text/x-csrc
Size: 1291 bytes
Desc: not available
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20070812/f7b17921/attachment.bin>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: flockwritelock.c
Type: text/x-csrc
Size: 1073 bytes
Desc: not available
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20070812/f7b17921/attachment-0001.bin>

