[dm-devel] multipathd segfault and error calling out
John A. Sullivan III
jsullivan at opensourcedevel.com
Thu Feb 26 05:30:47 UTC 2009
On Thu, 2009-02-26 at 00:14 -0500, John A. Sullivan III wrote:
> On Wed, 2009-02-25 at 22:23 -0500, John A. Sullivan III wrote:
> > On Wed, 2009-02-25 at 22:04 -0500, Konrad Rzeszutek wrote:
> > > On Wed, Feb 25, 2009 at 09:07:44PM -0500, John A. Sullivan III wrote:
> > > > Hello, all. I am running on kernel 2.6.27 on CentOS 5.2 with VServer
> > > > and device-mapper-multipath-0.4.7-17.el5. I have a custom
> > > > mpath_prio_ssi script which takes the device name (e.g., sdaa), pulls
> > > > out the path from /dev/disk/by-path and then echoes a priority based upon
> > > > a lookup table. It works perfectly fine from the command line.
> > > > multipath -ll shows the priorities assigned perfectly and exactly the
> > > > right paths are active.
> > > >
> > > > However, when I start multipathd, it all goes down the tubes. The paths
> > > > disappear and /var/log/messages is filled with:
> > > > Feb 25 20:50:17 vd01 multipathd: error calling out /usr/local/sbin/mpath_prio_ssi sdh
> > >
> > > Keep in mind that the environment you have when multipathd calls out is quite
> > > limited. I believe there is no PATH set, nor any other "normal" values.
> > >
> > > Make sure your code uses absolute paths: "/bin/grep", "/bin/cut", etc.
> > <snip>
> > Thank you. I was hopeful that might have been the problem, but
> > alas not. Even with absolute pathnames and setting the PATH variable, it
> > still gives the same error. In fact, I should have mentioned, I created
> > a bogus file with the same pathname which did nothing but "echo hello"
> > and it gave the same "error calling out" error. What next? - John
> This is increasingly bizarre. I did an strace on the multipath command
> and on the multipathd command.
>
> Here is a portion of the strace for multipath:
> close(1) = 0
> dup(6) = 1
> execve("/usr/local/sbin/mpath_prio_ssi", ["/usr/local/sbin/mpath_prio_ssi", "sda"], [/* 25 vars */]) = 0
> brk(0) = 0x8c3000
>
> Here is the same call from multipathd:
> close(1) = 0
> dup(7) = 1
> execve("/usr/local/sbin/mpath_prio_ssi", ["/usr/local/sbin/mpath_prio_ssi", "sda"], [/* 25 vars */]) = -1 ENOENT (No such file or directory)
> exit_group(-1) = ?
>
> Is it my imagination, or is it exactly the same call, but one is finding
> the file and the other is not? What could cause this? It is an explicit
> pathname and the file exists! Thanks - John
I should also mention that the trace shows multipathd has no problem
opening the file. Two threads before the failure, we see
this in the strace:
stat("/var/cache/multipathd", {st_mode=S_IFDIR|0700, st_size=4096, ...}) = 0
open("/usr/local/sbin/mpath_prio_ssi", O_RDONLY) = 4
fstat(4, {st_mode=S_IFREG|0755, st_size=368, ...}) = 0
close(4) = 0
So the problem appears to be specifically with the execve call. How does
one fix this? Thanks - John
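
One possibility worth testing (an assumption, not something the trace above confirms): execve(2) reports ENOENT not only when the target file is missing, but also when the interpreter named on a script's "#!" line cannot be found in the environment of the calling process. That would be consistent with open() succeeding and execve() failing on the same pathname. The minimal sketch below reproduces that behavior in isolation; the helper name exec_errno and the script name prio_demo are hypothetical stand-ins, and nothing here calls multipathd itself.

```python
import errno
import os
import subprocess
import tempfile


def exec_errno(shebang: str) -> int:
    """Write a throwaway script starting with `shebang`, try to execute it,
    and return 0 on success or the errno from the failed exec."""
    with tempfile.TemporaryDirectory() as tmp:
        # Hypothetical stand-in for a priority callout script.
        script = os.path.join(tmp, "prio_demo")
        with open(script, "w") as f:
            f.write(shebang + "\necho 1\n")
        os.chmod(script, 0o755)
        try:
            # The script file exists and is executable; only the
            # interpreter named on the first line may be missing.
            subprocess.run([script, "sda"], check=True,
                           stdout=subprocess.DEVNULL)
            return 0
        except OSError as e:
            return e.errno


if __name__ == "__main__":
    for line in ("#!/bin/sh", "#!/nonexistent/interpreter"):
        code = exec_errno(line)
        name = errno.errorcode.get(code, "OK") if code else "OK"
        print(f"{line!r}: {name}")
```

If the second case reports ENOENT while the script file plainly exists, the same mechanism could apply here whenever the daemon's view of the filesystem lacks the script's interpreter. Running the failing execve under strace with a statically linked binary in place of the script would be one way to tell the two cases apart.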
--
John A. Sullivan III
Open Source Development Corporation
+1 207-985-7880
jsullivan at opensourcedevel.com
http://www.spiritualoutreach.com
Making Christianity intelligible to secular society