
Re: [vfio-users] vfio, xeon e3s, acs, & gpus -- oh my!



On Fri, 31 Mar 2017 12:43:56 -0700
Joshua Hoblitt <josh hoblitt com> wrote:

> Hi Folks,
> 
> Long time listener (~2days), first time...
> 
> Based on years of pain-free experience with VF NICs, I naively thought
> that installing a second GPU in a desktop for VM pass through would be a
> straightforward task.  This has turned out to be an incorrect
> assumption.  I have two desktop systems with [unfortunately] the same
> basic configuration:
> 
> * supermicro x10sae
> * "Intel(R) Xeon(R) CPU E3-1276 v3 @ 3.60GHz"
> * Fedora 25 / linux 4.10
> 
> In one of these systems I have installed 2x nvidia gtx 1070 GPUs:
> 
> $ uname -r
> 4.10.5-200.fc25.x86_64
> $ lspci -nn | grep -i vga
> 01:00.0 VGA compatible controller [0300]: NVIDIA Corporation GP104 [GeForce GTX 1070] [10de:1b81] (rev a1)
> 02:00.0 VGA compatible controller [0300]: NVIDIA Corporation GP104 [GeForce GTX 1070] [10de:1b81] (rev a1)
> $ find /sys/kernel/iommu_groups/ -type l | egrep '0[12]:00'
> /sys/kernel/iommu_groups/1/devices/0000:02:00.1
> /sys/kernel/iommu_groups/1/devices/0000:01:00.1
> /sys/kernel/iommu_groups/1/devices/0000:02:00.0
> /sys/kernel/iommu_groups/1/devices/0000:01:00.0
> 
> and both GPUs are landing in the same vfio group. :(  Digging into the
> motherboard manual shows that there is some sort of PCIe switch between
> the physical PCIe slots and the CPU's root port:
> 
>     pg 16 - http://www.supermicro.com/manuals/motherboard/C226/MNL-1479.pdf
> 
> I have been completely unable to find a datasheet for the ASM1480
> "switch".  The homepage isn't helpful:
> 
>     http://www.asmedia.com.tw/eng/e_show_products.php?item=109
> 
> I briefly explored the ACS override patch but the second quirks.c hunk
> does not apply to vanilla linux 4.9/4.10 sources. I am rather hesitant
> to try to forward port it as I know nothing about the PCI subsystem and the
> file has changed a lot since the patch epoch.  I took it as a bad sign
> that google can't find a forward port of the patch for current kernels. 
> However, Google did eventually turn up a BZ post from Alex Williamson
> stating that Xeon E3s, at least as of "v3", do not support ACS:
> 
>     https://bugzilla.redhat.com/show_bug.cgi?id=1113399#c9
>     https://www-ssl.intel.com/content/dam/www/public/us/en/documents/specification-updates/xeon-e3-1200v3-spec-update.pdf
> 
> So it sounds like I'm looking at new hardware to get this working with
> stock Fedora kernels?
> 
> I prefer E3s for a desktop CPU as they have ECC memory support. It took
> forever to dig through Intel's website to find basic CPU datasheets. 
> The E3-1200 v5 datasheets don't seem to mention ACS and the E3-1200 v6
> datasheets (current product generation) haven't been published yet (!).
> 
>     http://www.intel.com/content/www/us/en/processors/xeon/xeon-technical-resources.html
> 
> Does anyone know definitively if the E3-1200 v6 generation supports ACS in
> the on-die PCIe controller?
> 
> My guess is this is not the case or it would be called out as a "new
> security feature" in the marketing material.  However, I am still hoping
> to stay with an E3-1200 CPU in order to keep costs low.  It appears that
> the supermicro x11sat-f motherboard has a "PLX8747" switch between the
> CPU root port and the physical slots:
> 
>     pg 18 - http://www.supermicro.com/manuals/motherboard/H4/MNL-1823.pdf
> 
> Which I think may be a broadcom "pex 8747":
> 
>     https://www.broadcom.com/products/pcie-switches-bridges/pcie-switches/pex8747
> 
> which does claim ACS support in the overview sheet. Does anyone know if
> ACS is indeed working with the "pex 8747" and/or the sm x1?sat series of
> motherboards?  Alternatively, does anyone know of an E3-1200 motherboard
> that does have working ACS with 2x [8x/16x] PCIe slots?

The end-to-end topology needs to support ACS, so it really does not
matter if a switch between the endpoint and the root port supports ACS
if the upstream root port does not.  Packets could be rerouted at the
root port regardless of how much isolation exists downstream of it.
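
As a rough illustration of what "end-to-end" means here, the sketch below
recovers the chain of bridges above a device from its resolved sysfs path.
It assumes the usual Linux layout, in which /sys/bus/pci/devices/<address>
is a symlink whose target nests each device under its parent bridge.  The
names upstream_chain and chain_for_device, and the example path, are made
up for illustration.  Every bridge in the resulting chain, including the
root port, would need ACS for the endpoint to be isolated.

```python
import os
import re

# Matches a PCI address such as 0000:01:00.0 (domain:bus:device.function).
PCI_ADDR = re.compile(r"^[0-9a-f]{4}:[0-9a-f]{2}:[0-9a-f]{2}\.[0-7]$")


def upstream_chain(resolved_path):
    """Return the PCI addresses found in a resolved sysfs device path,
    ordered from the root port down to the endpoint."""
    return [part for part in resolved_path.split("/") if PCI_ADDR.match(part)]


def chain_for_device(address, sysfs_root="/sys"):
    """Resolve the sysfs symlink for `address` and return its upstream chain.

    Assumption: sysfs is mounted at sysfs_root and exposes bus/pci/devices.
    """
    link = os.path.join(sysfs_root, "bus", "pci", "devices", address)
    return upstream_chain(os.path.realpath(link))


if __name__ == "__main__":
    # Hypothetical path for a GPU at 01:00.0 behind root port 00:01.0.
    example = "/sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0"
    print(upstream_chain(example))
```

Each address in the chain can then be inspected with "lspci -vvv -s
<address>" as root to see whether an Access Control Services capability is
listed for it.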

> If anyone has followed my rambling this far, thank you for your extreme
> patience.  I would like to thank Alex Williamson for his vfio blog and
> the kvmforum talk up on youtube -- I would be completely lost at this
> point without that information.

Thanks!  I certainly discourage use of the ACS override patch, which
at least for my part is why you don't see forward ports of it.  AFAIK,
it's not possible to get separate IOMMU groups for the processor root
ports on i3/5/7/E3 otherwise though.  If you need to stick with this
product family, you'll need to figure out how to make a configuration
where the processor root ports are used exclusively by either the host
or the guest.  We do have ACS-equivalent isolation in the PCH root
ports, but of course that moves you further from the CPU and might
restrict full bandwidth.  Otherwise you need a Xeon E5 or High End
Desktop processor.  As I document in my setup, I use IGD for host
graphics and can therefore dedicate the processor root port to the
guest graphics.  I can use the PCH root ports for a 3rd guest if
necessary.
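
To check whether a given slot arrangement actually puts the guest device in
a group of its own, something like the following sketch may help.  It
mirrors the find command in the quoted message by reading
/sys/kernel/iommu_groups; the function names iommu_groups and shares_group
are illustrative, not part of any existing tool.

```python
import os


def iommu_groups(root="/sys/kernel/iommu_groups"):
    """Map each IOMMU group name to the sorted device addresses it contains.

    Returns an empty dict if `root` does not exist (for example, when the
    IOMMU is disabled).
    """
    groups = {}
    if not os.path.isdir(root):
        return groups
    for group in os.listdir(root):
        devices_dir = os.path.join(root, group, "devices")
        if os.path.isdir(devices_dir):
            groups[group] = sorted(os.listdir(devices_dir))
    return groups


def shares_group(groups, a, b):
    """Return True if devices a and b appear in the same IOMMU group."""
    return any(a in devs and b in devs for devs in groups.values())


if __name__ == "__main__":
    found = iommu_groups()
    # Group directories are named with decimal numbers, so sort numerically.
    for group in sorted(found, key=int):
        print(group, " ".join(found[group]))
```

With the layout from the quoted message, shares_group(found,
"0000:01:00.0", "0000:02:00.0") would be expected to return True, since
both GPUs were reported in group 1.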

If you're not stuck with Intel and you're not in a hurry, you may want
to see how the Ryzen ACS situation works out.  There's nothing I'm
aware of currently on the market that allows enabling ACS in Ryzen, but
keep watching.  Thanks,

Alex

