<html>
<head>
<meta content="text/html; charset=windows-1252"
http-equiv="Content-Type">
</head>
<body bgcolor="#FFFFFF" text="#000000">
<p>Hi,<br>
</p>
<div class="moz-cite-prefix">On 06/26/2017 11:27 PM, Eric Ren wrote:<br>
</div>
<blockquote cite="mid:a2dcb2e0-cba2-6776-f041-605a66486ec5@suse.com"
type="cite">Hi Mike,
<br>
<br>
<br>
On 06/26/2017 10:37 PM, Mike Snitzer wrote:
<br>
<blockquote type="cite">On Mon, Jun 26 2017 at 9:47am -0400,
<br>
Eric Ren <a class="moz-txt-link-rfc2396E" href="mailto:zren@suse.com"><zren@suse.com></a> wrote:
<br>
[...snip...]
<br>
<blockquote type="cite">
<blockquote type="cite">"""
<br>
Revert "dm mirror: use all available legs on multiple
failures"
<br>
dm io: fix duplicate bio completion due to missing ref count
<br>
</blockquote>
I have a confusion about this "dm io..." fix. The fix itself
is good.
<br>
<br>
Without it, a mkfs.ext4 on a mirrored dev whose primary mirror
dev
<br>
has failed, will crash the kernel with the discard operation
from mkfs.ext4.
<br>
However, mkfs.ext4 can succeed on a healthy mirrored device.
This
<br>
is the thing I don't understand, because no matter the
mirrored device is
<br>
good or not, there's always a duplicate bio completion before
having this
<br>
this fix, thus write_callback() will be called twice, crashing
will
<br>
occur on the
<br>
second write_callback():
<br>
</blockquote>
No, there is only a duplicate bio completion if the error path
is taken
<br>
(e.g. underlying device doesn't support discard).
<br>
</blockquote>
Hmm, when "op == REQ_OP_DISCARD", please see comments in
do_region():
<br>
<br>
"""
<br>
static void do_region(int op, int op_flags, unsigned region,
<br>
struct dm_io_region *where, struct dpages
*dp,
<br>
struct io *io)
<br>
{
<br>
...
<br>
if (op == REQ_OP_DISCARD)
<br>
special_cmd_max_sectors =
q->limits.max_discard_sectors;
<br>
...
<br>
if ((op == REQ_OP_DISCARD || op == REQ_OP_WRITE_ZEROES ||
<br>
op == REQ_OP_WRITE_SAME) &&
special_cmd_max_sectors == 0) {
<br>
atomic_inc(&io->count); ===>
[1]
<br>
dec_count(io, region, -EOPNOTSUPP); ===>
[2]
<br>
return;
<br>
}
<br>
"""
<br>
<br>
[1] ref count fixed by patch "dm io: ...";
<br>
[2] we won't come here if "special_cmd_max_sectors != 0", which is
true when both sides
<br>
of the mirror is good.
<br>
<br>
So only when a mirror device fails, "max_discard_sectors" on its
queue is 0, thus this error path
<br>
will be taken, right?
<br>
</blockquote>
<br>
Yes, I checked this as below:<br>
<br>
- print info<br>
"""<br>
<div id="magicdomid251" class="ace-line"><span
class="author-a-z67zz71zqnz84z9mz78zkoz72znz65zuyq">@@ -310,8
+310,10 @@ static void do_region(int op, int op_flags, unsigned
region,</span></div>
<div id="magicdomid252" class="ace-line"><span
class="author-a-z67zz71zqnz84z9mz78zkoz72znz65zuyq"> /*</span></div>
<div id="magicdomid253" class="ace-line"><span
class="author-a-z67zz71zqnz84z9mz78zkoz72znz65zuyq"> *
Reject unsupported discard and write same requests.</span></div>
<div id="magicdomid254" class="ace-line"><span
class="author-a-z67zz71zqnz84z9mz78zkoz72znz65zuyq"> */</span></div>
<div id="magicdomid255" class="ace-line"><span
class="author-a-z67zz71zqnz84z9mz78zkoz72znz65zuyq">- if
(op == REQ_OP_DISCARD)</span></div>
<div id="magicdomid256" class="ace-line"><span
class="author-a-z67zz71zqnz84z9mz78zkoz72znz65zuyq">+ if
(op == REQ_OP_DISCARD) {</span></div>
<div id="magicdomid257" class="ace-line"><span
class="author-a-z67zz71zqnz84z9mz78zkoz72znz65zuyq">
special_cmd_max_sectors = q->limits.max_discard_sectors;</span></div>
<div id="magicdomid258" class="ace-line"><span
class="author-a-z67zz71zqnz84z9mz78zkoz72znz65zuyq">+
DMERR("do_region: op=DISCARD,
q->limits.max_discard_sectors=%d\n",
special_cmd_max_sectors);</span></div>
<div id="magicdomid259" class="ace-line"><span
class="author-a-z67zz71zqnz84z9mz78zkoz72znz65zuyq">+ }</span></div>
"""<br>
- log message when mkfs on mirror device that primary leg fails:<br>
<span class="author-a-z67zz71zqnz84z9mz78zkoz72znz65zuyq">[
169.672880] device-mapper: io: do_region: op=DISCARD,
q->limits.max_discard_sectors=0 ===> for the failed leg</span><br>
<span class="author-a-z67zz71zqnz84z9mz78zkoz72znz65zuyq">[
169.672885] device-mapper: io: do_region: op=DISCARD,
q->limits.max_discard_sectors=8388607 ===> for the good
leg<br>
<br>
Thanks,<br>
Eric<br>
</span>
<blockquote cite="mid:a2dcb2e0-cba2-6776-f041-605a66486ec5@suse.com"
type="cite">
<br>
<blockquote type="cite">
<br>
<blockquote type="cite">"""
<br>
static void write_callback(unsigned long error, void *context)
<br>
{
<br>
unsigned i;
<br>
struct bio *bio = (struct bio *) context;
<br>
struct mirror_set *ms;
<br>
int should_wake = 0;
<br>
unsigned long flags;
<br>
<br>
ms = bio_get_m(bio)->ms; ====> NULL
pointer at the
<br>
duplicate completion
<br>
bio_set_m(bio, NULL);
<br>
"""
<br>
<br>
If no this fix, I expected the DISCARD IO would always crash
the
<br>
kernel, but it's not true when
<br>
the mirrored device is good. Hope someone happen to know the
reason
<br>
can give some hints ;-P
<br>
</blockquote>
If the mirror is healthy then only one completion is returned to
<br>
dm-mirror (via write_callback). The problem was the error patch
wasn't
<br>
managing the reference count as needed. Whereas dm-io's normal
discard
<br>
IO path does.
<br>
</blockquote>
<br>
Yes, I can understand this.
<br>
<br>
Thanks a lot,
<br>
Eric
<br>
<br>
<blockquote type="cite">
<br>
Mike
<br>
<br>
--
<br>
dm-devel mailing list
<br>
<a class="moz-txt-link-abbreviated" href="mailto:dm-devel@redhat.com">dm-devel@redhat.com</a>
<br>
<a class="moz-txt-link-freetext" href="https://www.redhat.com/mailman/listinfo/dm-devel">https://www.redhat.com/mailman/listinfo/dm-devel</a>
<br>
<br>
</blockquote>
<br>
--
<br>
dm-devel mailing list
<br>
<a class="moz-txt-link-abbreviated" href="mailto:dm-devel@redhat.com">dm-devel@redhat.com</a>
<br>
<a class="moz-txt-link-freetext" href="https://www.redhat.com/mailman/listinfo/dm-devel">https://www.redhat.com/mailman/listinfo/dm-devel</a>
<br>
</blockquote>
<br>
</body>
</html>