[Libguestfs] Thoughts on nbdkit automatic reconnection

Thu Sep 19 11:42:25 UTC 2019

I have an update on the networking issue:
- After a deep dive into the firewall logs by the customer's security
team, it turns out that even though there were some disconnections, the
time-stamps do not match.
 This means that we were disconnected by something else (ESXi or the
conversion host, perhaps).
- As we mentioned briefly in the chat, there could be general keep-alive
issues on both the RHEL (conversion host) and ESXi sides.
 We changed the keep-alive settings in RHEL, but have not yet found the
equivalent in VMware.
- I found in a few places that there are some VDDK (vixDiskLib.nfc*)
settings which can configure NFC keep-alives and timeouts, but I do not
understand them well enough to tell whether any of them would help.
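
One caveat on the RHEL keep-alive settings: the kernel-wide values only
apply to sockets that have SO_KEEPALIVE switched on, and I do not know
whether VDDK enables it on its connections. As a rough illustration
only, here is what the per-socket equivalent looks like in Python. The
function name and the default values are made up for the example, and
the three tuning options are Linux-specific:

```python
import socket


def enable_keepalive(sock, idle=60, interval=10, count=5):
    """Turn on TCP keep-alive probes for one socket (illustrative helper)."""
    # Without SO_KEEPALIVE the kernel never sends probes on this socket,
    # whatever the system-wide keep-alive settings say.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)

    # Per-socket tuning: seconds idle before the first probe, seconds
    # between probes, and failed probes before the connection is dropped.
    # Skip any option this platform does not provide.
    for name, value in (
        ("TCP_KEEPIDLE", idle),
        ("TCP_KEEPINTVL", interval),
        ("TCP_KEEPCNT", count),
    ):
        opt = getattr(socket, name, None)
        if opt is not None:
            sock.setsockopt(socket.IPPROTO_TCP, opt, value)
    return sock
```

For keep-alives to stop a firewall from expiring an idle connection, the
idle time would have to be shorter than the firewall's idle timeout.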

Whatever may be the cause, a retry filter would most likely solve the
problem.
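
To make it concrete what I mean by retrying, here is a minimal sketch in
Python. This is not how nbdkit-retry-filter is or will be implemented;
with_retries, op and reopen are hypothetical names, and the retry count
and delays are arbitrary. The idea is just: when a request fails, wait,
reopen the connection, and try the same request again.

```python
import time


def with_retries(op, reopen, retries=5, delay=2.0, sleep=time.sleep):
    """Run op(); on an I/O error, back off, reopen the connection and retry.

    op and reopen are caller-supplied callables (hypothetical here).
    """
    attempt = 0
    while True:
        try:
            return op()
        except OSError:
            if attempt >= retries:
                raise  # out of retries, report the failure to the caller
            attempt += 1
            sleep(delay)
            delay *= 2  # exponential back-off between attempts
            reopen()    # re-establish the connection before retrying
```

Reopening before each retry is the important part, since a dropped
connection will not recover by simply reissuing the request on it.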

Since we are fairly certain that we would hit another failure with VDDK
as things stand now, we are trying the SSH transport to see how that
goes.

Cheers,

*Nenad Perić*

PRINCIPAL SOFTWARE ENGINEER

Red Hat - Migration Engineering

nenad at redhat.com

<http://redhat.com>

On Thu, Sep 19, 2019 at 11:50 AM Richard W.M. Jones <rjones at redhat.com>
wrote:

> On Wed, Sep 18, 2019 at 01:59:01PM +0100, Richard W.M. Jones wrote:
> > We have a running problem with the nbdkit VDDK plugin where the VDDK
> > side apparently disconnects or the network connection is interrupted.
> > During a virt-v2v conversion this causes the entire operation to fail,
> > and since v2v conversions take many hours that's not a happy outcome.
> >
> > (Aside: I should say that we see many cases where it's claimed that
> > the connection was dropped, but often when we examine them in detail
> > the cause is something else.  But it seems like this disconnection
> > thing does happen sometimes.)
>
> It turns out in the customer case that led us to talk about this, a
> Checkpoint firewall was forcing the VDDK control connection to be
> closed after an idle period.  (The VDDK connection as a whole was not
> actually idle because data was being copied over the separate data
> port, but the firewall did not associate the two ports).  I believe
> nbdkit-retry-filter would have helped in this case because reopening
> the VDDK connection will reestablish the control/metadata connection,
> and therefore I am looking at an implementation now.
>
> Rich.
>
> --
> Richard Jones, Virtualization Group, Red Hat
> http://people.redhat.com/~rjones
> Read my programming and virtualization blog: http://rwmj.wordpress.com
> virt-top is 'top' for virtual machines.  Tiny program with many
> powerful monitoring features, net stats, disk stats, logging, etc.
> http://people.redhat.com/~rjones/virt-top
>