[dm-devel] [PATCH 2/6] dm crypt: Handle DM_CRYPT_NO_*_WORKQUEUE more explicit.

Thu Mar 11 21:28:04 UTC 2021

On Thu, Mar 11, 2021 at 6:25 PM Mike Snitzer <snitzer at redhat.com> wrote:
>
> On Sat, Feb 13 2021 at  9:31am -0500,
> Ignat Korchagin <ignat at cloudflare.com> wrote:
>
> > On Sat, Feb 13, 2021 at 11:11 AM Sebastian Andrzej Siewior
> > <bigeasy at linutronix.de> wrote:
> > >
> > > By looking at the handling of DM_CRYPT_NO_*_WORKQUEUE in
> > > kcryptd_queue_crypt() it appears that READ and WRITE requests might be
> > > handled in the tasklet context as long as interrupts are disabled or it
> > > is handled in hardirq context.
> > >
> > > The WRITE requests should always be fed in preemptible context. There
> > > are other requirements in the write path which sleep or acquire a mutex.
> > >
> > > The READ requests should come from the storage driver, likely not in a
> > > preemptible context. The source of the requests depends on the driver
> > > and other factors like multiple queues in the block layer.
> >
> > My personal opinion: I really don't like the guesswork and assumptions.
> > If we want to remove the usage of in_*irq() and the like, we should
> > propagate the execution context from the source. Storage drivers have
> > this information and can pass it on to the device-mapper framework,
> > which in turn can pass it on to dm modules.
>
> I'm missing where DM core has the opportunity to convey this context in
> a clean manner.

Does DM core currently even have this context from the drivers?

> Any quick patch that shows the type of transform you'd like to see would
> be appreciated.. doesn't need to be comprehensive, just enough for me or
> others to carry through to completion.

I haven't thought it through well, but off the top of my head, maybe the
execution context info can be passed between different storage layers in
the bio structure? For example, if a driver completes a read in interrupt
context, it sets a flag in the bio structure and passes it up the stack.
Later, if an intermediate layer changes the execution context (for
example, dm-crypt offloading the bio processing to a workqueue), that
layer updates the flag, and so on. The same applies to the write path:
writes are generally started in a preemptible context, but if we have
some obscure DM module which schedules a tasklet for a write, that module
must update the flag in the bio structure.

Basically, the idea is that bio processing code gets the current
execution context from an upper/lower layer, and if the code itself
changes the execution context, it updates the execution context info in
the bio before passing it on.
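
A minimal userspace sketch of the flag-propagation idea, written as a toy
model rather than kernel code. Everything in it is an assumption made for
illustration: the real struct bio has no such field, and the names
model_bio, bio_exec_ctx and the helper functions are hypothetical.

```c
/* Hypothetical execution-context tag; not an existing kernel type. */
enum bio_exec_ctx {
	BIO_CTX_PREEMPTIBLE,	/* process context, may sleep */
	BIO_CTX_SOFTIRQ,	/* tasklet / softirq context */
	BIO_CTX_HARDIRQ,	/* hard interrupt handler */
};

/* Stand-in for struct bio; the real structure carries no such field. */
struct model_bio {
	int is_write;
	enum bio_exec_ctx ctx;
};

/* A driver completing a read from its interrupt handler tags the bio. */
static void driver_complete_read_in_irq(struct model_bio *bio)
{
	bio->is_write = 0;
	bio->ctx = BIO_CTX_HARDIRQ;
}

/* A layer that defers processing to a workqueue updates the tag. */
static void layer_offload_to_workqueue(struct model_bio *bio)
{
	bio->ctx = BIO_CTX_PREEMPTIBLE;
}

/* A layer that schedules a tasklet for the bio must update the tag too. */
static void layer_schedule_tasklet(struct model_bio *bio)
{
	bio->ctx = BIO_CTX_SOFTIRQ;
}

/*
 * A consumer reads the tag instead of guessing with in_*irq():
 * inline processing is allowed only where sleeping is allowed.
 */
static int can_process_inline(const struct model_bio *bio)
{
	return bio->ctx == BIO_CTX_PREEMPTIBLE;
}
```

In this model a read tagged by driver_complete_read_in_irq() is not
eligible for inline processing until some layer calls
layer_offload_to_workqueue() on it. Whether a single field is enough
depends on how clearly bio ownership is defined between layers.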

This thinking may be flawed, of course, as I don't know enough details
about the Linux storage layers and how well the ownership of bios is
defined.

> > Assuming WRITE requests are always in preemptible context might break with the
> > addition of some new type of obscure storage hardware.
> >
> > In our testing we saw a lot of cases with SATA disks where READ
> > requests come from preemptible contexts, so we probably don't want to
> > pay the tasklet setup overhead (no matter how small), not to mention
> > executing it in softirq, which is hard to attribute later to a
> > specific process in metrics.
> >
> > In other words, I think we should be providing support for this in the
> > device-mapper framework itself, not starting from individual modules.
>
> I think your concerns are valid... it does seem like this patch is
> assuming too much.
>
> Mike
>

Ignat