[Linux-cluster] weird happenings on my cluster and another panic.
jason at monsterjam.org
Sat Oct 28 03:14:02 UTC 2006
I'm 99% sure that these disks are in shared/clustered mode.
I'll update my RPMs from http://mirror.centos.org/centos/4/csgfs/i386/RPMS/
and see what I get.
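
In case it helps anyone comparing logs between the two nodes, here is a rough Python sketch for pulling the interesting lines (the reservation conflict and the panic markers) out of a syslog file. The function name, the regex, and the marker list are my own assumptions based on the excerpt quoted below, not anything that ships with the cluster suite.

```python
import re

# Assumed syslog layout, as in the excerpt quoted in this thread:
#   "Oct 25 20:31:14 tf1 kernel: <message>"
LINE_RE = re.compile(r"^(\w{3}\s+\d+\s+[\d:]{8})\s+(\S+)\s+([^:]+):\s*(.*)$")

# Substrings taken from the quoted log; extend as needed.
MARKERS = ("reservation conflict", "Fatal exception", "cut here")


def find_suspect_lines(lines):
    """Return (timestamp, host, message) for each line containing a marker."""
    hits = []
    for line in lines:
        m = LINE_RE.match(line.strip())
        if not m:
            continue  # skip anything that is not a plain syslog line
        stamp, host, _tag, msg = m.groups()
        if any(marker in msg for marker in MARKERS):
            hits.append((stamp, host, msg))
    return hits
```

Something like `find_suspect_lines(open("/var/log/messages"))` would be the typical use, assuming the usual syslog path on the node; running it on both tf1 and tf2 makes it easier to line up the timestamps.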
Jason
On Fri, Oct 27, 2006 at 12:06:01PM -0400, Lon Hohberger wrote:
> On Thu, 2006-10-26 at 21:03 -0400, jason at monsterjam.org wrote:
>
> > Oct 25 20:31:14 tf1 rpcidmapd: rpc.idmapd startup succeeded
> > Oct 25 20:31:14 tf1 kernel: Vendor: DELL Model: PERC 4/DC Rev: 351X
> > Oct 25 20:31:14 tf1 kernel: Type: Processor ANSI SCSI revision: 02
> > Oct 25 20:31:14 tf1 kernel: scsi[1]: scanning scsi channel 1 [Phy 1] for non-raid devices
> > Oct 25 20:31:14 tf1 kernel: Vendor: DELL Model: PERC 4/DC Rev: 351X
> > Oct 25 20:31:14 tf1 kernel: Type: Processor ANSI SCSI revision: 02
> > Oct 25 20:31:14 tf1 kernel: Vendor: DELL Model: PV22XS Rev: E.17
> > Oct 25 20:31:14 tf1 kernel: Type: Processor ANSI SCSI revision: 03
> > Oct 25 20:31:14 tf1 kernel: scsi[1]: scanning scsi channel 2 [virtual] for logical drives
> > Oct 25 20:31:14 tf1 kernel: Vendor: MegaRAID Model: LD 0 RAID5 139G Rev: 351X
> > Oct 25 20:31:14 tf1 kernel: Type: Direct-Access ANSI SCSI revision: 02
> > Oct 25 20:31:14 tf1 kernel: scsi1 (2,0,0) : reservation conflict
>
> Those things are in "cluster mode", right?
>
>
> > Oct 25 20:31:14 tf1 kernel: sdb: asking for cache data failed
> > Oct 25 20:31:14 tf1 kernel: sdb: assuming drive cache: write through
> > Oct 25 20:31:14 tf1 kernel: sdb: sdb1
> > Oct 25 20:31:14 tf1 kernel: Attached scsi disk sdb at scsi1, channel 2, id 0, lun 0
> > Oct 25 20:31:14 tf1 kernel: Adaptec aacraid driver (1.1-5[2412])
> > Oct 25 20:31:14 tf1 kernel: device-mapper: 4.5.0-ioctl (2005-10-04) initialised: dm-devel at redhat.com
> > Oct 25 20:31:14 tf1 kernel: EXT3-fs: INFO: recovery required on readonly filesystem.
> > Oct 25 20:31:14 tf1 kernel: EXT3-fs: write access will be enabled during recovery.
> >
> > So my guess is that sdb is the GFS volume and is already locked by the other server at this point.
>
> GFS doesn't do SCSI reservations. Both nodes need concurrent write
> access to the disks. More to the point, see below...
>
> > Oct 25 20:36:13 tf1 kernel: ------------[ cut here ]------------
> > ...
> > Oct 25 20:36:13 tf1 kernel: <0>Fatal exception: panic in 5 seconds
>
> ^^^ Argh.
>
> > So it appears that I have something misconfigured, and my question now is: tf1 should come up as secondary while tf2 is running as
> > primary, right? Or should tf1 come up and take over as primary, and tf2 let it?
>
> Irrespective of anything you did (or didn't do), the panic above is a
> bug in cman (or maybe the kernel, but not likely).
>
> ... The node panicked trying to start up the cluster software, before
> GFS (or rgmanager, or dlm) was even in the picture. You'll note that in
> the modules list, 'gfs' and 'dlm' are not even listed.
>
> I hope the newer cman-kernel / dlm-kernel fixes it ;)
>
> -- Lon
>
>
> --
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> https://www.redhat.com/mailman/listinfo/linux-cluster
--
================================================
| Jason Welsh jason at monsterjam.org |
| http://monsterjam.org DSS PGP: 0x5E30CC98 |
| gpg key: http://monsterjam.org/gpg/ |
================================================