[vfio-users] qemu stuck when hot-add memory to a virtual machine with a device passthrough
Wuzongyong (Euler Dept)
cordius.wu at huawei.com
Fri Apr 20 03:11:12 UTC 2018
> > > > Hi,
> > > >
> > > > The qemu process gets stuck when hot-adding a large amount of memory
> > > > to a virtual machine with a passthrough device.
> > > > We found that pinning and mapping pages in vfio_dma_do_map is too slow.
> > > > Is there any way to speed this up?
> > >
> > > At what size do you start to see problems? The time to map a
> > > section of memory should be directly proportional to the size. As
> > > the size is increased, it will take longer, but I don't know why
> > > you'd reach a point of not making forward progress. Is it actually
> > > stuck or is it just taking longer than you want? Using hugepages
> > > can certainly help, we still need to pin each PAGE_SIZE page within
> > > the hugepage, but we'll have larger contiguous regions and therefore
> > > call iommu_map() less frequently. Please share more data. Thanks,
> > >
> > > Alex
> > It just takes longer; it is not actually stuck.
> > We see the problem when hot-adding 16GB of memory, and it takes tens
> > of minutes when hot-adding 1TB.
>
> Is the stall adding 1TB roughly 64 times the stall adding 16GB or do we
> have some inflection in the size vs time curve? There is a cost to
> pinning and mapping through the IOMMU, perhaps we can improve that, but I
> don't see how we can eliminate it or how it wouldn't be at least linear
> compared to the size of memory added without moving to a page request
> model, which hardly any hardware currently supports. A workaround might
> be to incrementally add memory in smaller chunks which generate a less
> noticeable stall. Thanks,
>
> Alex
It took about 1 minute to add 16GB and about 40 minutes to add 1TB.
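
As a rough check of these two data points against the "64 times" question above, here is a minimal sketch. It assumes that "GB" and "TB" mean GiB and TiB, that PAGE_SIZE is 4 KiB (typical for x86, not stated in this thread), and that both timings are only approximate. The helper names are made up for illustration.

```python
GIB = 1 << 30
TIB = 1 << 40
PAGE_SIZE = 4096  # assumption: 4 KiB base pages, not stated in the thread

# Approximate measurements reported in this thread.
small_bytes, small_minutes = 16 * GIB, 1.0
large_bytes, large_minutes = 1 * TIB, 40.0


def pages_to_pin(nbytes):
    """Number of PAGE_SIZE pages that have to be pinned for nbytes."""
    return nbytes // PAGE_SIZE


def throughput_gib_per_s(nbytes, minutes):
    """Average pin-and-map rate in GiB per second."""
    return (nbytes / GIB) / (minutes * 60)


size_ratio = large_bytes / small_bytes
time_ratio = large_minutes / small_minutes

print("size ratio:", size_ratio)
print("time ratio:", time_ratio)
print("pages (16G):", pages_to_pin(small_bytes))
print("pages (1T):", pages_to_pin(large_bytes))
print("GiB/s (16G):", throughput_gib_per_s(small_bytes, small_minutes))
print("GiB/s (1T):", throughput_gib_per_s(large_bytes, large_minutes))
```

Under those assumptions the size grows by 64x while the time grows by about 40x, so these two points alone do not show a worse-than-linear inflection.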