[dm-devel] [PATCH] slab: introduce the flag SLAB_MINIMIZE_WASTE
Mikulas Patocka
mpatocka at redhat.com
Wed Mar 21 20:37:08 UTC 2018
On Wed, 21 Mar 2018, Christopher Lameter wrote:
> On Wed, 21 Mar 2018, Mikulas Patocka wrote:
>
> > For example, if someone creates a slab cache with the flag SLAB_CACHE_DMA,
> > and he allocates an object from this cache and this allocation races with
> > the user writing to /sys/kernel/slab/cache/order - then the allocator can
> > for a small period of time see "s->allocflags == 0" and allocate a non-DMA
> > page. That is a bug.
>
> True we need to fix that:
>
> Subject: Avoid potentially visible allocflags without all flags set
>
> During slab size recalculation s->allocflags may be temporarily set
> to 0 and thus the flags may not be set which may result in the wrong
> flags being passed. Slab size calculation happens in two cases:
>
> 1. When a slab is created (which is safe since we cannot have
> concurrent allocations)
>
> 2. When the slab order is changed via /sysfs.
>
> Signed-off-by: Christoph Lameter <cl at linux.com>
>
>
> Index: linux/mm/slub.c
> ===================================================================
> --- linux.orig/mm/slub.c
> +++ linux/mm/slub.c
> @@ -3457,6 +3457,7 @@ static void set_cpu_partial(struct kmem_
> static int calculate_sizes(struct kmem_cache *s, int forced_order)
> {
> slab_flags_t flags = s->flags;
> + gfp_t allocflags;
> size_t size = s->object_size;
> int order;
>
> @@ -3551,16 +3552,17 @@ static int calculate_sizes(struct kmem_c
> if (order < 0)
> return 0;
>
> - s->allocflags = 0;
> + allocflags = 0;
> if (order)
> - s->allocflags |= __GFP_COMP;
> + allocflags |= __GFP_COMP;
>
> if (s->flags & SLAB_CACHE_DMA)
> - s->allocflags |= GFP_DMA;
> + allocflags |= GFP_DMA;
>
> if (s->flags & SLAB_RECLAIM_ACCOUNT)
> - s->allocflags |= __GFP_RECLAIMABLE;
> + allocflags |= __GFP_RECLAIMABLE;
>
> + s->allocflags = allocflags;
I'd also use "WRITE_ONCE(s->allocflags, allocflags)" here and when writing
s->oo and s->min to avoid some possible compiler misoptimizations.
WRITE_ONCE should be used when writing a value that can be read
simultaneously (but a lot of kernel code misses it).
Another problem is that it updates s->oo and later it updates s->max:
s->oo = oo_make(order, size, s->reserved);
s->min = oo_make(get_order(size), size, s->reserved);
if (oo_objects(s->oo) > oo_objects(s->max))
s->max = s->oo;
--- so, the concurrently running code could see s->oo > s->max, which
could trigger some memory corruption.
s->max is only used in memory allocations -
kmalloc(BITS_TO_LONGS(oo_objects(s->max)) * sizeof(unsigned long)), so
perhaps we could fix the bug by removing s->max at all and always
allocating enough memory for the maximum possible number of objects?
- kmalloc(BITS_TO_LONGS(oo_objects(s->max)) * sizeof(unsigned long), GFP_KERNEL);
+ kmalloc(BITS_TO_LONGS(MAX_OBJS_PER_PAGE) * sizeof(unsigned long), GFP_KERNEL);
Mikulas
> /*
> * Determine the number of objects per slab
> */
>
More information about the dm-devel
mailing list