[Cluster-devel] [PATCHv2 dlm/next 0/8] fs: dlm: introduce dlm re-transmission layer

Tue Mar 16 18:37:02 UTC 2021

Hello,

On Fri, 2021-03-12 at 09:52 -0500, Alexander Ahring Oder Aring wrote:
> On Thu, Mar 11, 2021 at 4:09 AM Paolo Abeni <pabeni at redhat.com> wrote:
> > - DEFAULT_BUFFER_SIZE == LOWCOMMS_MAX_TX_BUFFER_LEN in current net-
> > next, so looks like a change below is actually a no op ?!?
> 
> It's LOWCOMMS_MAX_TX_BUFFER_LEN updated in patch 7/8:
> 
> #define LOWCOMMS_MAX_TX_BUFFER_LEN     (DEFAULT_BUFFER_SIZE -
> DLM_MIDCOMMS_OPT_LEN)
> 
> whereas DEFAULT_BUFFER_SIZE is the maximum possible dlm message size
> at socket layer level. Mainly this is limited because we can get a
> maximum page buffer allocation for the dlm application layer only and
> the application layer cannot deal with any additional logic to handle
> more. However that is also an architecture thing, because it needs to
> be the minimal page buffer size of all architecture due compatibility
> with other architectures. I introduced a check which should report
> problems if architectures doesn't support it:
> 
> BUILD_BUG_ON(PAGE_SIZE < DEFAULT_BUFFER_SIZE);
> 
> LOWCOMMS_MAX_TX_BUFFER_LEN is the length after the encapsulation
> header, the size "what the dlm application layer can maximum put in a
> message" on application layer level.
> 
> The names are misnamed, it's on my list to update them. Any
> recommendations (MAX_DLM_SOCKET_BUFFER_SIZE/MAX_DLM_APP_BUFFER_SIZE)?

I'm almost literaly the last person to ask this kind of questions, as I
always pick up bad names... I would opt for:

DLM_MAX_SOCKET_BUFSIZE
DLM_MAX_APP_BUFSIZE

> > - Could you please add more info WRT the reference to unaligned memory
> > access in the code comments? Which field[s] is[are] subject to that?
> > 
> 
> none of these fields. Some of the DLM messages have a variable sized
> payload at the tail. It's a string value. This string length needs to
> be 8 byte aligned. The kernel violate this, however there exists an
> userspace API and the user needs to define this string value. One user
> is lvm, lvm uses an unique identifier the UUID of the volume for this
> string value, the UUID string value is 36 chars. It will end in a
> message boundary issue which will/can end in unaligned memory access
> at the sending side (because of some internal buffer handling) and the
> receiving side. I tried to fix it by filling zero bytes to this string
> but the problem is more complicated as David Teigland pointed out to
> still have backwards compatibility. I would like to schedule to fix
> the issue in a next major version bump of dlm.

Uhm... It looks like the unaligned access was partially handled with
memmove/put_unaligned, but that code was removed with
commit df9e06b800ed025411ce9ab348299b3ef258cf8b - because the removed
code did not cover all the use-case.

I guess a complete solution based on get_unaligned/put_unaligned is too
invasive?

I'm not sure I follow correctly when you says the issue should be
solved on the send side... What if a sender is buggy/uses an old
implementation? would that cause unaligned access?

If a message is unaligned, will it stay unaligned on re-transmissions?
Otherwise, if the unaligned messages are really a corner case, given
that this implements app-layer retransmission, what about simply
detecting the unaligned message and dropping it? It sounds a bit
scaring to me that a remote peer would be able to trigger an oops (on
some arches) with a badly crafted packet.

It looks like the amount of memory the 'receive_queue' and 'send_queue'
can use is still unbounded ?!? I suggest to drop at least a commend
about why that should not be an issue (if that is really not a problem
;)

I had also an additional minor comment on patch 7/8, see there for the
details...

Cheers,

Paolo