[Linux-cluster] umount failed - device is busy
Herta Van den Eynde
herta.vandeneynde at cc.kuleuven.be
Mon Oct 10 15:59:34 UTC 2005
Further investigation suggests that locking may have something to do
with this.
On the system that currently runs the services, I find these lock files
in /var/lock/clumanager/:
-rwx------ 1 root root 0 Oct 8 03:30 lock.0
-rwx------ 1 root root 0 Oct 8 03:30 lock.1
-rwx------ 1 root root 0 Oct 8 03:30 lock.116
-rwx------ 1 root root 0 Oct 8 03:30 lock.2
-rw-r--r-- 1 root root 0 Oct 8 03:31 service.0
-rw-r--r-- 1 root root 0 Oct 10 16:08 service.1
-rw-r--r-- 1 root root 0 Oct 8 03:30 service.2
On the now idle cluster member, I have these lock files:
-rwx------ 1 root root 0 Oct 8 03:30 lock.0
-rwx------ 1 root root 0 Oct 8 03:30 lock.1
-rwx------ 1 root root 0 Oct 8 03:30 lock.116
-rwx------ 1 root root 0 Oct 8 03:30 lock.2
The four lock.n files strike me as odd since I only have three services.
Also, should the lock files even be there on the idle cluster member?
Could anyone running a similar cluster please post the contents of
/var/lock/clumanager/ on each member, along with the number of services
currently running on that member?
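
To make such listings easy to compare, here is a minimal sketch that
counts the lock.N and service.N files in a lock directory. It assumes
the directory is /var/lock/clumanager unless another path is given.
The function name summarize_locks is hypothetical and not part of
clumanager.

```shell
#!/bin/sh
# Hypothetical helper, not part of clumanager: count the lock.* and
# service.* files in a lock directory (default: /var/lock/clumanager).
summarize_locks() {
    dir="${1:-/var/lock/clumanager}"
    locks=0
    services=0
    for f in "$dir"/lock.* "$dir"/service.*; do
        [ -e "$f" ] || continue      # skip glob patterns that matched nothing
        case "${f##*/}" in
            lock.*)    locks=$((locks + 1)) ;;
            service.*) services=$((services + 1)) ;;
        esac
    done
    echo "locks=$locks services=$services"
}
```

Running `echo "$(hostname): $(summarize_locks)"` on each member gives one
line per member that can be pasted next to the `ls -l` output.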
Kind regards,
Herta
Herta Van den Eynde wrote:
> environment:
> - Red Hat AS 3 (kernel-smp-2.4.21-37.EL - custom built to probe all LUNs
> on each SCSI device)
> - clumanager 1.2.28
>
> The cluster consists of 2 members running three services which simply
> nfs export a number of directories to five other systems.
> The cluster has been operational since February.
>
> Following the latest upgrade (from kernel-smp-2.4.21-32.0.1.EL custom
> built and clumanager-1.2.26.1-1), all services are running on one
> member. When I try to relocate the services, the operation fails, and the
> following message pops up:
>
> A Problem has occurred while changing ownership
> of this service. Please check logs for details.
>
> The cluster log reports the following:
>
> ==== begin log extract
> Member arnebd trying to relocate lepustl to nihald...Oct 10 16:08:06
> arnebd clusvcmgrd: [13627]: <notice> service notice: Stopping service
> lepustl ...
> Oct 10 16:08:06 arnebd clurmtabd[26429]: <debug> Signal 15 received;
> exiting
> Oct 10 16:08:12 arnebd clusvcmgrd: [13627]: <err> service error: 'umount
> /dev/sdb2' failed (/usr/local/lepus-tl), error=1
> Oct 10 16:08:12 arnebd clusvcmgrd: [13627]: <err> service error: umount:
> /usr/local/lepus-tl: device is busy
> Oct 10 16:08:12 arnebd clusvcmgrd: [13627]: <err> service error: umount:
> /usr/local/lepus-tl: device is busy
> Oct 10 16:08:12 arnebd clusvcmgrd: [13627]: <err> service error: Cannot
> stop filesystems for lepustl
> Oct 10 16:08:12 arnebd clusvcmgrd[13626]: <notice> Starting stopped
> service lepustl
> Oct 10 16:08:12 arnebd clusvcmgrd: [14083]: <notice> service notice:
> Starting service lepustl ...
> Oct 10 16:08:12 arnebd clurmtabd[14194]: <debug> Log level is now 7
> Oct 10 16:08:12 arnebd clurmtabd[14194]: <debug> Polling interval is now
> 4 seconds
> failed
> Oct 10 16:08:12 arnebd clusvcmgrd: [14083]: <notice> service notice:
> Started service lepustl ...
> Oct 10 16:08:14 arnebd clurmtabd[6533]: <debug> Detected modified
> /var/lib/nfs/rmtab
> Oct 10 16:08:14 arnebd clurmtabd[9655]: <debug> Detected modified
> /var/lib/nfs/rmtab
> ==== end log extract
>
> FWIW, no one was logged in but me, and my current directory was not on
> this filesystem.
> Neither fuser nor lsof returned any process using the filesystem.
> I figured the clurmtabd process might be locking it, so I verified that
> there is only one clurmtabd process for that filesystem.
>
> Any ideas/suggestions?
>
> Kind regards,
>
> Herta
>
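
Since the services in the quoted report are NFS exports and the log shows
clurmtabd watching /var/lib/nfs/rmtab, a sketch like the following could
list which clients that file still records for the filesystem. It assumes
rmtab lines have the form host:path:refcount, which should be checked
against the actual file. The function name rmtab_clients is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print the unique client names that an rmtab-style
# file records for an export path or any directory below it.
# Assumes each line looks like host:path:refcount.
rmtab_clients() {
    path="$1"
    file="${2:-/var/lib/nfs/rmtab}"
    awk -F: -v p="$path" \
        '$2 == p || index($2, p "/") == 1 { print $1 }' "$file" | sort -u
}
```

For example, `rmtab_clients /usr/local/lepus-tl` would print one client
name per line. An empty result does not prove the filesystem is unused,
only that rmtab has no entry for it.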
--
Herta Van den Eynde -=- Toledo system management
K.U. Leuven - Ludit -=- phone: +32 (0)16 322 166
-=- 50°51'27" N 004°40'39" E
"I wish I were two little cats. Then I could play together."
Disclaimer: http://www.kuleuven.be/cwis/email_disclaimer.htm