[dm-devel] [PATCH v3 2/2] crypto: lrw - Do not use auxiliary buffer

Ondrej Mosnacek omosnace at redhat.com
Wed Sep 12 07:57:57 UTC 2018


On Wed, Sep 12, 2018 at 8:51 AM Eric Biggers <ebiggers at kernel.org> wrote:
> On Tue, Sep 11, 2018 at 09:42:39AM +0200, Ondrej Mosnacek wrote:
> > This patch simplifies the LRW template to recompute the LRW tweaks from
> > scratch in the second pass and thus also removes the need to allocate a
> > dynamic buffer using kmalloc().
> >
> > As discussed at [1], the use of kmalloc causes deadlocks with dm-crypt.
> >
> > PERFORMANCE MEASUREMENTS (x86_64)
> > Performed using: https://gitlab.com/omos/linux-crypto-bench
> > Crypto driver used: lrw(ecb-aes-aesni)
> >
> > The results show that the new code has about the same performance as the
> > old code. For 512-byte message it seems to be even slightly faster, but
> > that might be just noise.
> >
> > Before:
> >        ALGORITHM KEY (b)        DATA (B)   TIME ENC (ns)   TIME DEC (ns)
> >         lrw(aes)     256              64             200             203
> >         lrw(aes)     320              64             202             204
> >         lrw(aes)     384              64             204             205
> >         lrw(aes)     256             512             415             415
> >         lrw(aes)     320             512             432             440
> >         lrw(aes)     384             512             449             451
> >         lrw(aes)     256            4096            1838            1995
> >         lrw(aes)     320            4096            2123            1980
> >         lrw(aes)     384            4096            2100            2119
> >         lrw(aes)     256           16384            7183            6954
> >         lrw(aes)     320           16384            7844            7631
> >         lrw(aes)     384           16384            8256            8126
> >         lrw(aes)     256           32768           14772           14484
> >         lrw(aes)     320           32768           15281           15431
> >         lrw(aes)     384           32768           16469           16293
> >
> > After:
> >        ALGORITHM KEY (b)        DATA (B)   TIME ENC (ns)   TIME DEC (ns)
> >         lrw(aes)     256              64             197             196
> >         lrw(aes)     320              64             200             197
> >         lrw(aes)     384              64             203             199
> >         lrw(aes)     256             512             385             380
> >         lrw(aes)     320             512             401             395
> >         lrw(aes)     384             512             415             415
> >         lrw(aes)     256            4096            1869            1846
> >         lrw(aes)     320            4096            2080            1981
> >         lrw(aes)     384            4096            2160            2109
> >         lrw(aes)     256           16384            7077            7127
> >         lrw(aes)     320           16384            7807            7766
> >         lrw(aes)     384           16384            8108            8357
> >         lrw(aes)     256           32768           14111           14454
> >         lrw(aes)     320           32768           15268           15082
> >         lrw(aes)     384           32768           16581           16250
> >
> > [1] https://lkml.org/lkml/2018/8/23/1315
> >
> > Signed-off-by: Ondrej Mosnacek <omosnace at redhat.com>
> > ---
> >  crypto/lrw.c | 280 ++++++++++-----------------------------------------
> >  1 file changed, 51 insertions(+), 229 deletions(-)
> >
> > diff --git a/crypto/lrw.c b/crypto/lrw.c
> > index b4f30b6f16d6..d5d2fba9af59 100644
> > --- a/crypto/lrw.c
> > +++ b/crypto/lrw.c
> > @@ -29,8 +29,6 @@
> >  #include <crypto/b128ops.h>
> >  #include <crypto/gf128mul.h>
> >
> > -#define LRW_BUFFER_SIZE 128u
> > -
> >  #define LRW_BLOCK_SIZE 16
> >
> >  struct priv {
> > @@ -56,19 +54,7 @@ struct priv {
> >  };
> >
> >  struct rctx {
> > -     be128 buf[LRW_BUFFER_SIZE / sizeof(be128)];
> > -
> > -     be128 t;
> > -
> > -     be128 *ext;
> > -
> > -     struct scatterlist srcbuf[2];
> > -     struct scatterlist dstbuf[2];
> > -     struct scatterlist *src;
> > -     struct scatterlist *dst;
> > -
> > -     unsigned int left;
> > -
> > +     be128 t, orig_iv;
> >       struct skcipher_request subreq;
> >  };
> >
> > @@ -135,86 +121,31 @@ static int next_index(u32 *counter)
> >       return res;
> >  }
> >
> > -static int post_crypt(struct skcipher_request *req)
> > +/*
> > + * We compute the tweak masks twice (both before and after the ECB encryption or
> > + * decryption) to avoid having to allocate a temporary buffer and/or make
> > + * mutliple calls to the 'ecb(..)' instance, which usually would be slower than
> > + * just doing the gf128mul_x_ble() calls again.
> > + */
>
> next_index(), not gf128mul_x_ble().

Fixed for next revision, thanks.

>
> > +static void init_crypt(struct skcipher_request *req)
> >  {
> > +     struct priv *ctx = crypto_skcipher_ctx(crypto_skcipher_reqtfm(req));
> >       struct rctx *rctx = skcipher_request_ctx(req);
> > -     struct skcipher_request *subreq;
> > +     struct skcipher_request *subreq = &rctx->subreq;
> >
> > -     subreq = &rctx->subreq;
> > -
> > -     while (!err && rctx->left) {
> > -             err = pre_crypt(req) ?:
> > -                   crypto_skcipher_decrypt(subreq) ?:
> > -                   post_crypt(req);
> > +     skcipher_request_set_tfm(subreq, ctx->child);
> > +     skcipher_request_set_callback(subreq, req->base.flags, crypt_done, req);
> > +     skcipher_request_set_crypt(subreq, req->dst, req->dst,
> > +                                req->cryptlen, &rctx->orig_iv);
>
> Can you leave a comment that the 'iv' of 'subreq' is set to 'orig_iv' is set so
> that it's available in xor_tweak_post()?  My first thought was that the 'iv'
> should be NULL as this same request is also used for the ECB step.
>
> Or alternatively, you could get the IV directly from 'rctx->orig_iv' in
> xor_tweak(), and only save the incremented IV back to 'walk.iv' on the first
> pass.  Then subreq->iv would be left NULL.

Good point, now that we have the explicit first/second pass
distinction this should be quite easy to do and will be also cleaner.

Thanks,
-- 
Ondrej Mosnacek <omosnace at redhat dot com>
Associate Software Engineer, Security Technologies
Red Hat, Inc.




More information about the dm-devel mailing list