[dm-devel] dm-crypt: Fix error with too large bios
Mikulas Patocka
mpatocka at redhat.com
Thu Aug 25 18:34:57 UTC 2016
On Thu, 18 Aug 2016, Eric Wheeler wrote:
> > On Wed, Jun 01 2016 at 9:44am -0400, Christoph Hellwig <hch at infradead.org> wrote:
> >
> > > > > be dm-crypt.c. Maybe you've identified some indirect use of
> > > > > BIO_MAX_SIZE?
> > > >
> > > > I mean the recently introduced BIO_MAX_SIZE in -next tree:
> > > >
> > > https://git.kernel.org/cgit/linux/kernel/git/next/linux-next.git/commit/drivers/md/dm-crypt.c?id=4ed89c97b0706477b822ea2182827640c0cec486
> > >
> > > The crazy bcache bios striking back once again. I really think it's
> > > harmful having a _MAX value and then having a minor driver
> > > reinterpreting it and sending larger ones. Until we can lift the
> > > maximum limit in general nad have common code exercise it we really need
> > > to stop bcache from sending these instead of littering the tree with
> > > workarounds.
> >
> > The bio_kmalloc function allocates bios with up to 1024 vector entries (as
> > opposed to bio_alloc and bio_alloc_bioset that has a limit of 256 vector
> > entries).
> >
> > Device mapper is using bio_alloc_bioset with a bio set, so it is limited
> > to 256 vector entries, but other kernel users may use bio_kmalloc and
> > create larger bios.
> >
> > So, if you don't want bios with more than 256 vector entries to exist, you
> > should impose this limit in bio_kmalloc (and fix all the callers that use
> > it).
>
> FYI, Kent Overstreet notes this about bcache from the other thread here:
> https://lkml.org/lkml/2016/8/15/620
>
> [paste]
> >> bcache originally had workaround code to split too-large bios when it
> >> first went upstream - that was dropped only after the patches to make
> >> generic_make_request() handle arbitrary size bios went in. So to do what
> >> you're suggesting would mean reverting that bcache patch and bringing that
> >> code back, which from my perspective would be a step in the wrong
> >> direction. I just want to get this over and done with.
> >>
> >> re: interactions with other drivers - bio_clone() has already been changed
> >> to only clone biovecs that are live for current bi_iter, so there
> >> shouldn't be any safety issues. A driver would have to be intentionally
> >> doing its own open coded bio cloning that clones all of bi_io_vec, not
> >> just the active ones - but if they're doing that, they're already broken
> >> because a driver isn't allowed to look at bi_vcnt if it isn't a bio that
> >> it owns - bi_vcnt is 0 on bios that don't own their biovec (i.e. that were
> >> created by bio_clone_fast).
> >>
> >> And the cloning and bi_vcnt usage stuff I audited very thoroughly back
> >> when I was working on immutable biovecs and such back in the day, and I
> >> had to do a fair amount of cleanup/refactoring before that stuff could go
> >> in.
> [/paste]
>
> They are making progress in the patch-v3 thread, so perhaps this can be
> fixed for now in generic_make_request().
>
> --
> Eric Wheeler
Device mapper can't split the bio in generic_make_request - it frees the
md->queue->bio_split bioset, to save one kernel thread per device. Device
mapper uses its own splitting mechanism.
So what is the final decision? - should device mapper split the big bio or
should bcache not submit big bios?
I think splitting big bios in the device mapper is better - simply because
it is much less code than reworking bcache to split bios internally.
BTW. In the device mapper, we have a layer dm-io, that was created to work
around bio size limitations - it accepts unlimited I/O request and splits
it to several bios. When bio size limitations are gone, we could simplify
dm-io too.
Mikulas
More information about the dm-devel
mailing list