[Libguestfs] [PATCH libnbd] ublk: Add new nbdublk program

Ming Lei ming.lei at redhat.com
Wed Aug 31 09:29:13 UTC 2022


On Wed, Aug 31, 2022 at 10:13:08AM +0100, Richard W.M. Jones wrote:
> On Wed, Aug 31, 2022 at 09:45:45AM +0800, Ming Lei wrote:
> > On Tue, Aug 30, 2022 at 05:13:46PM +0100, Richard W.M. Jones wrote:
> > > On Tue, Aug 30, 2022 at 11:29:26PM +0800, Ming Lei wrote:
> > > > On Tue, Aug 30, 2022 at 03:38:50PM +0100, Richard W.M. Jones wrote:
> > > > > On Tue, Aug 30, 2022 at 03:12:23PM +0800, Ming Lei wrote:
> > > > > > > The patch sent in the last email may cause an I/O hang on MQ; the
> > > > > > > fixed version follows:
> > > > > 
> > > > > I split this into two commits and cleaned them up and posted them here:
> > > > > 
> > > > > https://gitlab.com/rwmjones/libnbd/-/commits/nbdublk/
> > > > > 
> > > > > Unfortunately this doesn't work for me.  When I do various filesystem
> > > > > operations like git clone and a compile I see some subtle disk errors
> > > > > and eventually it deadlocks, so I guess there is some problem.
> > > > 
> > > > > OK, care to provide more details about the reproducer? For example, how
> > > > > the backend is set up, whether MQ or SQ is used, disk size, ...
> > > 
> > > My test script is attached.  $1 == "ublk".
> > > 
> > > It basically just clones a Linux repo and compiles it.  It hangs
> > > either during the clone or early in the build, and there are various
> > > "scary messages" from git which might indicate disk corruption.
> > > 
> > > The NBD server is:
> > > 
> > >   nbdkit -f memory 24G
> > > 
> > > running on the hypervisor ("nbd://pick").
> > > 
> > > > > I have cloned the Linux kernel source tree on an nbdublk disk and built it
> > > > > with the Fedora 36 config for ~20min, so far so good. In my setup, the
> > > > > backend is 'nbdkit file /dev/sda(virtio-scsi)', and nbdublk is single queue.
> > > 
> > > Can you see if you can reproduce a hang with the source from:
> > > 
> > >   https://gitlab.com/rwmjones/libnbd/-/commits/nbdublk/
> > > 
> > > I may have made a mistake when rebasing your patch or fixing it up to
> > > remove compiler warnings.
> > 
> > > My test used your tree directly, and I compared it with my native
> > > tree; they are basically the same.
> > 
> > > Today I will set up & run the test using your approach.
> 
> I tried it again now and it definitely deadlocks under load.

I can reproduce it. Please try the top patch in the aio branch, which fixed
the hang in my reproducer with your test setup.

https://github.com/ming1/ubdsrv/commits/aio


Thanks,
Ming

