[vfio-users] PLX switch report a UR when EP tries to DMA to VM's memory
Alex Williamson
alex.williamson at redhat.com
Fri Sep 14 02:55:39 UTC 2018
On Fri, 14 Sep 2018 01:25:20 +0000
"Wuzongyong (Euler Dept)" <cordius.wu at huawei.com> wrote:
> > > Hi,
> > >
> > > I noticed a problem with a PCIe endpoint that sits behind a PLX switch
> > > and is assigned to a VM by VFIO.
> > > The problem is that the switch reports a UR (Unsupported Request) error
> > > when the EP tries to DMA to a memory region inside the VM's address space.
> > > Assume that the DMA destination address lies within the VM's RAM
> > > address space and, unfortunately, that same address value from the host's
> > > point of view falls within the PLX switch upstream port's BAR0
> > > memory-mapped IO range.
> > > As a result, the DMA fails because the switch treats the memory request
> > > as invalid when the destination address hits its upstream port's BAR.
> > > Is this a hardware bug, or does qemu/seabios not maintain a proper address
> > > space for the VM?
> >
> > Upstream switch ports are generally single function devices and therefore
> > governed by 6.12.1.3 (PCIe base spec rev 4.0, v1) which indicates an ACS
> > capability must not be implemented. We can therefore read into section
> > 6.12.2 on interoperability which indicates the interaction between ACS and
> > non-ACS components, including:
> >
> > * When ACS P2P Request Redirect, ACS P2P Completion Redirect, or both
> > are being used, certain components in the PCI Express hierarchy must
> > support ACS Upstream Forwarding (of Upstream redirected Requests).
> > Specifically:
> > ...
> > Between each ACS component where P2P TLP redirection is enabled and
> > its associated Root Port, any intermediate Switches must support ACS
> > Upstream Forwarding. Otherwise, how such Switches handle Upstream
> > redirected TLPs is undefined.
> >
> > It's my interpretation therefore that in a configuration where the switch
> > downstream ports support ACS, the switch upstream port must implicitly
> > support upstream forwarding, thus I would consider this a hardware issue.
> > The alternative is that we need to poke holes in the VM address space to
> > account for any possible conflict and assigned device hot-add becomes
> > nearly a non-starter. Thanks,
> >
> > Alex
>
> Thanks for your explanation.
> Do you know whether there are other vendors' switches that wouldn't have this problem?
I've generally thought PLX switches were good. I know the NVIDIA GRID
cards made use of them, and perhaps the multi-GPU Tesla cards still do.
PLX makes a large variety of switches though, so yours could be from a
different family. The nature of this issue can also mask the
problem if the address spaces don't overlap in a way that triggers it.
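
To illustrate the overlap condition, here is a minimal sketch in Python that
checks whether any guest RAM range collides with a host MMIO range such as
the switch upstream port's BAR0. All addresses and range names below are
made-up examples, not values from the reported system; on a real host the
BAR ranges could be read from /sys/bus/pci/devices/<BDF>/resource, and the
guest RAM layout depends on the VM configuration.

```python
from typing import List, NamedTuple, Tuple


class Range(NamedTuple):
    name: str
    start: int  # first address in the range (inclusive)
    end: int    # last address in the range (inclusive)


def overlaps(a: Range, b: Range) -> bool:
    """Two inclusive ranges overlap if each starts before the other ends."""
    return a.start <= b.end and b.start <= a.end


def find_conflicts(guest_ram: List[Range],
                   host_mmio: List[Range]) -> List[Tuple[Range, Range]]:
    """Return every (guest range, host range) pair that overlaps."""
    return [(g, h) for g in guest_ram for h in host_mmio if overlaps(g, h)]


# Hypothetical guest RAM layout (guest physical addresses, used as IOVAs).
guest_ram = [
    Range("guest RAM below 4G", 0x0000_0000, 0xBFFF_FFFF),
    Range("guest RAM above 4G", 0x1_0000_0000, 0x2_3FFF_FFFF),
]

# Hypothetical host MMIO range of the switch upstream port's BAR0.
host_mmio = [
    Range("switch upstream port BAR0", 0x9000_0000, 0x9003_FFFF),
]

conflicts = find_conflicts(guest_ram, host_mmio)
for g, h in conflicts:
    print(f"{g.name} [{g.start:#x}-{g.end:#x}] overlaps "
          f"{h.name} [{h.start:#x}-{h.end:#x}]")
```

With these example values the BAR0 range falls inside the guest's low RAM,
so a DMA to a guest address in that window is the case that would trigger
the problem. If the BAR sat in a range the guest does not use as RAM, the
same hardware would appear to work.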
Thanks,
Alex