[Cluster-devel] Re: [DLM] Fix up memory alloc/kmap

Steven Whitehouse swhiteho at redhat.com
Thu Nov 13 09:56:18 UTC 2008


Hi,

On Wed, 2008-11-12 at 17:14 -0600, David Teigland wrote:
> On Tue, Nov 11, 2008 at 11:02:57AM +0000, Steven Whitehouse wrote:
> > In addition kmap was being used incorrectly. The reason that this hasn't
> > been seen before is that the pages are allocated using ls->ls_allocation
> > which doesn't allow highmem pages to be returned. I haven't changed that
> > since although kmap implies a GFP_KERNEL allocation, without highmem
> > pages, they will be no-ops for now and thus safe. Also sendpage does its
> > own kmap (if required, which is only if the protocol is emulating
> > sendpage via sock_no_sendpage) so that the k(un)map pair around that
> > call can be removed.
> 
> Sorry, I don't understand this, it's late and I'm probably being dense...
> 
> - We can remove kmap/kunmap from send_to_sock() because...
> 
> - We can remove kmap/kunmap from sctp_init_assoc() because...
> 
> - We cannot remove kmap/kunmap from dlm_lowcomms_get_buffer/
>   dlm_lowcomms_commit_buffer because...
> 

kmap is one of the odder kernel interfaces, and it also has the extra
confusion of having a different interface between the atomic and
"normal" versions of itself. Worse still, due to the void pointer, you
don't get a warning that you got it wrong until you try to unmap the
page in question and it fails.

So here is a quick run down of the rules, and perhaps it will make more
sense after this.... firstly kmap of pages which are not highmem pages
will always be a no-op, so unless the pages which are being used have
been allocated with __GFP_HIGHMEM, then kmap will always be a no-op.
Both GFP_KERNEL and GFP_NOFS do not imply __GFP_HIGHMEM (see gfp.h) so
that you'll never need to kmap pages allocated in this way.

Now most kernel calls require that high mem pages are mapped before they
are called, but this is not the case with sendpage. In general the data
will not be even looked at by the kernel, but the page gets added to the
page vector of the skb for the outgoing packet, and its up to the device
driver to do any mapping required. If there is an exception to that, its
probably due to netfilter inspecting the packets.

kmap itself works by (when required, due to a page being in highmem)
allocating (GFP_KERNEL) a low memory page and copying the data into it.
It thus means that every time you see a kmap() there is a potential
GFP_KERNEL allocation, and the point that I was making above is that
(assuming I've read the code correctly) that the GFP_KERNEL allocation
will never happen, because the page can never be a highmem page anyway.

I did consider the idea of using kmap_atomic() instead. In that case
instead of allocating a page, it uses one of two existing pre-allocated
pages (per CPU) to copy the data into temporarily. The gotcha in this
case is that because the pages are per-CPU, you must not schedule
between the kmap/kunmap pair and you also have to be careful not to have
a code path where the same kmap atomic page can be used twice. Since in
this case there was a potential scheduling point in the code, that meant
that I couldn't change kmap to kmap_atomic.

So the points I was making above were that kmap isn't required around
sendpage, because sendpage will happily take unmapped pages. You can see
that by looking at sock_no_sendpage() which emulates sendpage for those
protocols which don't define their own method. Also, that in the other
cases, the reason that we've never run into any problems with GFP_KERNEL
allocations occurring is that the kmaps in question have never been
supplied with pages in highmem.

It is left as an exercise for the reader to consider whether its a bug
that DLM isn't using highmem pages for its internal buffers (in which
case we'd have to solve the allocation problem at kmap time), or whether
its a bug that the kmap/kunmap pairs are there at all (and can thus be
removed, which is the simpler solution) :-)

Steve.





More information about the Cluster-devel mailing list