[Linux-cluster] Re: how to get my cluster working if my /dev/sda becomes /dev/sdb (CLVM, iscsi, ERROR: Module iscsi_sfnet in use)

Matt Harrington mharrington at eons.com
Fri Sep 5 13:30:15 UTC 2008


If the problem is device naming, use multipath to create a static 
/dev/mapper/<name> device that will always map to a 
particular disk, independent of load order.
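For example (a sketch only: the alias "iscsi_gfs" and the WWID are 
placeholders; get the real WWID with scsi_id, e.g. `scsi_id -g -u -s 
/block/sda` -- check your scsi_id(8) man page for the exact flags on 
your release):

```
# /etc/multipath.conf (fragment) -- alias the iSCSI disk by WWID so
# its name no longer depends on discovery order
multipaths {
    multipath {
        wwid  <WWID-of-the-iscsi-disk>   # placeholder: substitute the real WWID
        alias iscsi_gfs                  # hypothetical alias name
    }
}
```

The disk then shows up as /dev/mapper/iscsi_gfs, and LVM can find its 
PV there whether the kernel names the underlying disk sda or sdb (you 
may also need to adjust the lvm.conf filter so the PV is only scanned 
via the multipath device, not the raw sdX path).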

Anuj Singh (अनुज) wrote:
> Thanks,
> I changed the script a bit and things are working now by resetting
> the iscsi service. But something independent of device name order
> would still be better.
>
> Thanks and regards
> Anuj Singh
>
>
>
> On Fri, Sep 5, 2008 at 5:02 PM, Anuj Singh (अनुज) <anujhere at gmail.com> wrote:
>
>     Hi,
>     I configured a cluster using gfs1 on rhel-4 kernel version 
>     2.6.9-55.16.EL.
>     Using iscsi-target and initiator.
>     gfs1 mount is exported via nfs service.
>
>     I can manually stop all services in the following sequence:
>     nfs, portmap, rgmanager, gfs, clvmd, fenced, cman, ccsd.
>     To stop my iscsi service I first run 'vgchange -aln' and then
>     stop the iscsi service; otherwise I get a module-in-use error,
>     as I have a clustered LVM over the iscsi device (/dev/sda1).
>
>     Everything works fine, but when I try to simulate a possible
>     problem, e.g. the iscsi service being stopped, I get the
>     following error.
>
>     Test1:
>     When cluster is working I stop iscsi service with
>      /etc/init.d/iscsi stop
>     Searching for iscsi-based multipath maps
>     Found 0 maps
>     Stopping iscsid:                                           [  OK  ]
>     Removing iscsi driver: ERROR: Module iscsi_sfnet is in use
>                                                                [FAILED]
>     To stop my iscsi service without this failure, I stop all
>     cluster services as follows.
>     /etc/init.d/nfs stop
>     /etc/init.d/portmap stop
>     /etc/init.d/rgmanager stop
>     /etc/init.d/gfs stop
>     /etc/init.d/clvmd stop
>     /etc/init.d/fenced stop
>     /etc/init.d/cman stop
>     /etc/init.d/ccsd stop
>     Every service stops with an OK message. Now when I again stop
>     my iscsi service, I get the same error:
>      /etc/init.d/iscsi stop
>     Removing iscsi driver: ERROR: Module iscsi_sfnet is in use
>                                                                [FAILED]
>
>     On my iscsi device (which is /dev/sda1), I have an LVM with a
>     gfs1 file-system. As all the cluster services are stopped, I try
>     to deactivate the LVM with:
>
>      vgchange -aln
>       /dev/dm-0: read failed after 0 of 4096 at 0: Input/output error
>       No volume groups found
>
>     Now if I start my iscsi service, my /dev/sda becomes /dev/sdb,
>     and the iscsi service gives me the following error:
>
>     [root at pr0031 new]# /sbin/service iscsi start
>     Checking iscsi config:                                     [  OK  ]
>     Loading iscsi driver:                                      [  OK  ]
>     mknod: `/dev/iscsictl': File exists
>     Starting iscsid:                                           [  OK  ]
>
>     Sep  5 16:42:37 pr0031 iscsi: iscsi config check succeeded
>     Sep  5 16:42:37 pr0031 iscsi: Loading iscsi driver:  succeeded
>     Sep  5 16:42:42 pr0031 iscsid[20732]: version 4:0.1.11-7 variant
>     (14-Apr-2008)
>     Sep  5 16:42:42 pr0031 iscsi: iscsid startup succeeded
>     Sep  5 16:42:42 pr0031 iscsid[20736]: Connected to Discovery
>     Address 192.168.10.199
>     Sep  5 16:42:42 pr0031 kernel: iscsi-sfnet:host16: Session established
>     Sep  5 16:42:42 pr0031 kernel: scsi16 : SFNet iSCSI driver
>     Sep  5 16:42:42 pr0031 kernel:   Vendor: IET       Model: VIRTUAL-DISK      Rev: 0
>     Sep  5 16:42:42 pr0031 kernel:   Type:   Direct-Access                      ANSI SCSI revision: 04
>     Sep  5 16:42:42 pr0031 kernel: SCSI device sdb: 1975932 512-byte
>     hdwr sectors (1012 MB)
>     Sep  5 16:42:42 pr0031 kernel: SCSI device sdb: drive cache: write
>     through
>     Sep  5 16:42:42 pr0031 kernel: SCSI device sdb: 1975932 512-byte
>     hdwr sectors (1012 MB)
>     Sep  5 16:42:42 pr0031 kernel: SCSI device sdb: drive cache: write
>     through
>     Sep  5 16:42:42 pr0031 kernel:  sdb: sdb1
>     Sep  5 16:42:42 pr0031 kernel: Attached scsi disk sdb at scsi16,
>     channel 0, id 0, lun 0
>     Sep  5 16:42:43 pr0031 scsi.agent[20764]: disk at
>     /devices/platform/host16/target16:0:0/16:0:0:0
>
>     As my /dev/sda1 became /dev/sdb1, when I start the cluster
>     services I get no gfs mount.
>
>     clurgmgrd[21062]: <notice> Starting stopped service flx
>     Sep  5 16:47:16 pr0031 kernel: scsi15 (0:0): rejecting I/O to dead
>     device
>     Sep  5 16:47:16 pr0031 clurgmgrd: [21062]: <err> 'mount -t gfs 
>     /dev/mapper/VG01-LV01 /u01' failed, error=32
>     Sep  5 16:47:16 pr0031 clurgmgrd[21062]: <notice> start on
>     clusterfs:gfsmount_u01 returned 2 (invalid argument(s))
>     Sep  5 16:47:16 pr0031 clurgmgrd[21062]: <warning> #68: Failed to
>     start flx; return value: 1
>     Sep  5 16:47:16 pr0031 clurgmgrd[21062]: <notice> Stopping service
>     flx
>
>
>     After the above situation I need to restart the nodes, which I
>     don't want to do. I created a script to handle all this: if I
>     restart all the services once, the disk still comes up as
>     /dev/sdb (it should be /dev/sda so that my cluster can get its
>     gfs mount). When I restart all the services a second time, I get
>     no error (this time the iscsi disk is attached as /dev/sda and I
>     don't see the `/dev/iscsictl': File exists error at iscsi
>     startup) and the cluster starts working.
>     my script: http://www.grex.org/~anuj/cluster.txt
>
>     So, how to get my cluster working if my /dev/sda becomes /dev/sdb?
>
>     Thanks and Regards
>     Anuj Singh
>
>
>
>
>
>
> ------------------------------------------------------------------------
>
> --
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> https://www.redhat.com/mailman/listinfo/linux-cluster
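For reference, the manual shutdown ordering from the quoted mail can be
sketched as a small script. This is a sketch only: the init-script names
are the ones from the thread, and the DRY_RUN guard (which defaults to
just printing the commands) is added here so the sketch is safe to run
anywhere.

```shell
#!/bin/sh
# Teardown ordering from the thread: stop top-down so each layer
# releases the one beneath it, then deactivate the clustered VG so
# the iscsi module is no longer pinned by an open LV.
# DRY_RUN=1 (the default here) only prints the commands; set
# DRY_RUN=0 on a real node to execute them.
DRY_RUN=${DRY_RUN:-1}

STOP_ORDER="nfs portmap rgmanager gfs clvmd fenced cman ccsd"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$@"
    else
        "$@" || echo "WARNING: '$*' failed" >&2
    fi
}

for svc in $STOP_ORDER; do
    run /etc/init.d/$svc stop
done

# Release the LV before unloading the driver; otherwise the init
# script fails with "Module iscsi_sfnet is in use" as in the thread.
run vgchange -aln
run /etc/init.d/iscsi stop
```

With the services down and the VG deactivated, the iscsi init script
should be able to unload iscsi_sfnet cleanly.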


