[Virtio-fs] [PATCH 0/9] virtio-fs fixes

Liu Bo bo.liu at linux.alibaba.com
Thu Apr 25 18:10:08 UTC 2019


On Thu, Apr 25, 2019 at 10:59:50AM -0400, Vivek Goyal wrote:
> On Wed, Apr 24, 2019 at 04:12:59PM -0700, Liu Bo wrote:
> > Hi Vivek,
> > 
> > On Wed, Apr 24, 2019 at 02:41:30PM -0400, Vivek Goyal wrote:
> > > Hi Liubo,
> > > 
> > > I have made some fixes and took some of yours and pushed latest snapshot
> > > of my internal tree here.
> > > 
> > > https://github.com/rhvgoyal/linux/commits/virtio-fs-dev-5.1
> > > 
> > > Patches have been rebased to 5.1-rc5 kernel. I am thinking of updating
> > > this branch frequently with latest code.
> > 
> > With this branch, generic/476 still hangs, and yes, it's related to
> > "async page fault related events", just as you mentioned on IRC.
> > 
> > I confirmed this with kvm and kvmmmu tracepoints.
> > 
> > The tracepoints[1] showed that
> > [1]: https://paste.ubuntu.com/p/N9ngrthKCf/
> > 
> > ---
> > handle_ept_violation
> >   kvm_mmu_page_fault(error_code=182)
> >     tdp_page_fault
> >       fast_page_fault # spte not present
> >       try_async_pf # queue an async_pf work and return RETRY
> > 
> > vcpu_run
> >  kvm_check_async_pf_completion
> >    kvm_arch_async_page_ready
> >      tdp_page_fault(vcpu, work->gva, 0, true);
> >        fast_page_fault(error_code == 0);
> >        try_async_pf # found hpa
> >        __direct_map()
> >          set_spte(error_code == 0) # won't set the write bit
> > 
> > handle_ept_violation
> >   kvm_mmu_page_fault(error_code=1aa)
> >     tdp_page_fault
> >       fast_page_fault # spte present but no write bit
> >       try_async_pf # no hpa again, queue an async_pf work and return RETRY
> 
> So why is there no "hpa"?
>

TBH, I have no idea. __gfn_to_pfn_memslot() did return a pfn
successfully after the async pf, but on the following EPT_VIOLATION
it returned KVM_PFN_ERR_FAULT, telling its callers to do another
async pf, and this repeated over and over again.
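
As a reading aid for the trace above: the two error_code values look
like raw EPT-violation exit qualifications. That is an assumption, since
the trace does not say how the value is encoded. Under that assumption,
and using the bit layout from the Intel SDM, the small decoder below
reads 0x182 as a write to a translation with no permissions at all, and
0x1aa as a write to a translation that is readable and executable but
not writable. Both match the comments in the trace. The struct and
function names are made up for this sketch.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper type: one field per exit-qualification bit that
 * matters here (Intel SDM, exit qualification for EPT violations). */
struct ept_qual {
    bool read_access;    /* bit 0: the access was a data read */
    bool write_access;   /* bit 1: the access was a data write */
    bool fetch_access;   /* bit 2: the access was an instruction fetch */
    bool ept_readable;   /* bit 3: the EPT translation allowed reads */
    bool ept_writable;   /* bit 4: the EPT translation allowed writes */
    bool ept_executable; /* bit 5: the EPT translation allowed execution */
    bool gla_valid;      /* bit 7: guest linear address field is valid */
    bool final_access;   /* bit 8: fault on the translated address, not
                          * on a guest paging-structure access */
};

/* Split a raw exit qualification into the fields above. */
static struct ept_qual decode_ept_qual(uint64_t q)
{
    struct ept_qual d = {
        .read_access    = (q >> 0) & 1,
        .write_access   = (q >> 1) & 1,
        .fetch_access   = (q >> 2) & 1,
        .ept_readable   = (q >> 3) & 1,
        .ept_writable   = (q >> 4) & 1,
        .ept_executable = (q >> 5) & 1,
        .gla_valid      = (q >> 7) & 1,
        .final_access   = (q >> 8) & 1,
    };
    return d;
}

/* Print one line per value: access type, then EPT permissions. */
static void print_ept_qual(uint64_t q)
{
    struct ept_qual d = decode_ept_qual(q);

    printf("0x%llx: access=%c%c%c ept_perm=%c%c%c gla_valid=%d final=%d\n",
           (unsigned long long)q,
           d.read_access ? 'r' : '-',
           d.write_access ? 'w' : '-',
           d.fetch_access ? 'x' : '-',
           d.ept_readable ? 'r' : '-',
           d.ept_writable ? 'w' : '-',
           d.ept_executable ? 'x' : '-',
           d.gla_valid, d.final_access);
}
```

Calling print_ept_qual(0x182) and print_ept_qual(0x1aa) from a small
main() is enough to compare the two faults side by side.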

> I was running a different test. I mmap()ed a file in the guest, then
> truncated the file to size 0 on the host, and then the guest tried to
> read/write the mmap()ed region.
> 
> This will trigger an async page fault on the host. But given that the
> file size is zero, that page fault will not succeed.
>

I see. I checked the file I used on the host: I could use FIEMAP to
read all of its extents, and it wasn't truncated.
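
For anyone who wants to repeat that check, here is a minimal sketch of
listing a file's extents with the FS_IOC_FIEMAP ioctl. It is not the
exact tool used here; the function name dump_extents and the fixed
32-extent buffer are choices made for this example, and filesystems
without FIEMAP support will simply make the ioctl fail.

```c
#include <fcntl.h>
#include <linux/fiemap.h>
#include <linux/fs.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/ioctl.h>
#include <unistd.h>

#define MAX_EXTENTS 32 /* arbitrary buffer size for this sketch */

/* Print up to MAX_EXTENTS extents of 'path'.
 * Returns the number of extents reported, or -1 on any error
 * (including a filesystem that does not support FIEMAP). */
static int dump_extents(const char *path)
{
    size_t sz = sizeof(struct fiemap) +
                MAX_EXTENTS * sizeof(struct fiemap_extent);
    struct fiemap *fm;
    int fd, n = -1;

    fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    fm = calloc(1, sz);
    if (!fm) {
        close(fd);
        return -1;
    }

    fm->fm_start = 0;
    fm->fm_length = FIEMAP_MAX_OFFSET; /* map the whole file */
    fm->fm_flags = FIEMAP_FLAG_SYNC;   /* flush dirty data first */
    fm->fm_extent_count = MAX_EXTENTS;

    if (ioctl(fd, FS_IOC_FIEMAP, fm) == 0) {
        n = (int)fm->fm_mapped_extents;
        for (int i = 0; i < n; i++) {
            struct fiemap_extent *fe = &fm->fm_extents[i];

            printf("extent %d: logical=%llu physical=%llu length=%llu flags=0x%x\n",
                   i,
                   (unsigned long long)fe->fe_logical,
                   (unsigned long long)fe->fe_physical,
                   (unsigned long long)fe->fe_length,
                   (unsigned)fe->fe_flags);
        }
    }

    free(fm);
    close(fd);
    return n;
}
```

A file that had been truncated to size 0 would report zero extents,
while an intact file reports extents covering its data.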

> Current async pf logic has no notion of failure. It assumes it will always
> succeed. It does not even check the return code of
> get_user_pages_remote(), which can return error.
> 
> So there are a few things to be done.
> 
> - Modify the async pf logic so that it can capture and report errors.
> - If guest user space mmap()ed the file in question, then send SIGBUS
>   to the process.
> - If the guest kernel is trying to access memory which async pf can't
>   resolve, then create an escape path and return an error to user
>   space. (something like memcpy_mcsafe() I think).
>

I need to think more about this. In my case, the guest is just doing a
plain write(2) or writev(2); it shouldn't get into a hang like that in
any case.
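
For contrast, the mmap-then-truncate case quoted above has well-defined
behavior on a local Linux filesystem: touching a shared mapping beyond
the new end of file raises SIGBUS instead of hanging. Below is a minimal
single-machine sketch of that behavior, with no guest or virtio-fs
involved. The function name signal_after_truncate and the temporary
file location are made up for illustration.

```c
#include <fcntl.h>
#include <signal.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Map one page of a temp file in a child process, truncate the file to
 * size 0, then write to the mapping again.
 * Returns the signal that killed the child, 0 if the child exited
 * normally, or -1 if the setup itself failed. */
static int signal_after_truncate(void)
{
    char path[] = "/tmp/trunc-demo-XXXXXX";
    long page = sysconf(_SC_PAGESIZE);
    int fd, status;
    pid_t pid;

    fd = mkstemp(path);
    if (fd < 0)
        return -1;
    unlink(path); /* the open fd keeps the file alive */

    if (ftruncate(fd, page) != 0) { /* give the file one page */
        close(fd);
        return -1;
    }

    pid = fork();
    if (pid < 0) {
        close(fd);
        return -1;
    }

    if (pid == 0) {
        void *m = mmap(NULL, (size_t)page, PROT_READ | PROT_WRITE,
                       MAP_SHARED, fd, 0);
        volatile char *p = m;

        if (m == MAP_FAILED)
            _exit(2);
        p[0] = 'x';              /* works: the page is backed by the file */
        if (ftruncate(fd, 0) != 0)
            _exit(3);
        p[0] = 'y';              /* beyond EOF now: SIGBUS is expected */
        _exit(0);
    }

    close(fd);
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFSIGNALED(status) ? WTERMSIG(status) : 0;
}
```

On a regular local filesystem the return value should equal SIGBUS,
which is the outcome the second item in the quoted list asks for in the
guest as well.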

Thanks for sharing the code, will take a look.

thanks,
-liubo

> I was playing with this and made some progress. But that work is not
> complete. I thought of dealing with this problem later. If you are
> curious, I have pushed my unfinished code here.
> 
> Kernel:
> https://github.com/rhvgoyal/linux/commits/virtio-fs-dev-async-pf
> 
> Qemu:
> https://github.com/rhvgoyal/qemu/commits/virtio-fs-async-pf
> 
> Thanks
> Vivek
