[vfio-users] cudaErrorDevicesUnavailable using CUDA in KVM using VFIO device passthrough

Andrew Zimmerman atz at rincon.com
Tue May 1 19:30:58 UTC 2018


Alex,

Thank you for your reply and all of your ideas.  You are right that the SXM uses NVLink - I had not thought of that as a potential culprit.  I do not have any PCIe GPUs in this cluster, but I may be able to set up a standalone test on an older box.

I have not seen a specific mention from NVIDIA regarding VFIO support for this form factor of the Tesla V100, but there were talks at GTC regarding using Tesla cards with VFIO.

Do you know of a better guide you could point me to for getting up and running with VFIO?  I was thinking that it felt like a permissions issue (as I can query the device, but not write to it), so it could be an issue with how the guide had me set up the ACLs...
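
As a next step I was planning to strip the sample down to a bare allocation check - just a sketch along the lines below, compiled with nvcc inside the guest - to confirm that the device query succeeds while the very first cudaMalloc is what fails:

#include <cstdio>
#include <cuda_runtime.h>

int main() {
    // Querying properties has been working for me, so this part should succeed.
    cudaDeviceProp prop;
    cudaError_t err = cudaGetDeviceProperties(&prop, 0);
    if (err != cudaSuccess) {
        printf("query failed: %s\n", cudaGetErrorString(err));
        return 1;
    }
    printf("GPU 0: %s (SM %d.%d)\n", prop.name, prop.major, prop.minor);

    // First real device access: this is where the sample reports
    // cudaErrorDevicesUnavailable, so a bare cudaMalloc should reproduce it.
    void *buf = NULL;
    err = cudaMalloc(&buf, 1 << 20);
    printf("cudaMalloc: %s\n", cudaGetErrorString(err));
    if (err == cudaSuccess) cudaFree(buf);
    return err == cudaSuccess ? 0 : 1;
}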

That is great to know that the CentOS 7.4 kernel does support virqfd - from what I had read, it seemed like it should be there, but I decided to try upgrading the kernel anyway just in case that was the issue.

Thank you,
Andy

From: Alex Williamson [mailto:alex.williamson at redhat.com]
Sent: Tuesday, May 1, 2018 10:38 AM
To: Andrew Zimmerman <atz at rincon.com>
Cc: 'vfio-users at redhat.com' <vfio-users at redhat.com>
Subject: Re: [vfio-users] cudaErrorDevicesUnavailable using CUDA in KVM using VFIO device passthrough

On Tue, 1 May 2018 00:37:36 +0000
Andrew Zimmerman <atz at rincon.com> wrote:

> I have a system with 4 Tesla V100-SXM-16GB GPUs in it, and I am

SXM is the NVLink variant, right? VFIO has no special knowledge or
handling of NVLink and I'd only consider it supported insofar as it
behaves similarly to PCIe. A particular concern with NVLink is whether
the mesh nature of the interconnect makes use of and enforces the
IOMMU-based translations necessary for device isolation and
assignment, but we can't know this because it's proprietary. Does
NVIDIA claim that VFIO device assignment is supported for these GPUs?

> attempting to pass these devices through to virtual machines run by
> KVM. I am managing the VMs with OpenNebula and I have followed the
> instructions at
> https://docs.opennebula.org/5.4/deployment/open_cloud_host_setup/pci_passthrough.html
> to pass the device through to my VM. I am able to see the device in
> nvidia-smi, watch its power/temperature levels, change the
> persistence mode and compute mode, etc.

Ugh, official documentation that recommends the vfio-bind script and
manually modifying libvirt's ACL list. I'd be suspicious of any device
assignment support making use of those sorts of instructions.
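
For comparison, plain libvirt-managed assignment needs none of that: a
hostdev entry along these lines (the PCI address below is just a
placeholder) lets libvirt bind the device to vfio-pci and set up the
device ACLs itself, with no vfio-bind script and no qemu.conf edits:

  <hostdev mode='subsystem' type='pci' managed='yes'>
    <source>
      <address domain='0x0000' bus='0x3b' slot='0x00' function='0x0'/>
    </source>
  </hostdev>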

> I can query the device to get properties and capabilities, but when I
> try to run a program on it that utilizes the device (beyond
> querying), I receive an error message about the device being
> unavailable. To test, I am using simpleAtomicIntrinsics out of the
> CUDA Samples. Here is the output I receive:
>
> SimpleAtomicIntrinsics starting...
>
> GPU Device 0: "Tesla V100-SXM2-16GB": with compute capability 7.0
>
> > GPU device has 80 Multi-Processors, SM 7.0 compute capabilities
>
> Cuda error at simpleAtomicIntrinsics.cu:108
> code=46(cudaErrorDevicesUnavailable) "cudaMalloc((void **) &dOData,
> memsize)"
>
> I have tried this with multiple devices (in case there was an issue
> with vfio on the first device) and had the same result on each of
> them.

Have you tried with a PCIe Tesla?

> The host OS is CentOS 7.4.1708. I upgraded the kernel to 4.15.15-1
> from the elrepo to ensure that I had support for vfio_virqfd. I am
> running the NVIDIA 390.15 driver and using cuda 9.1
> (cuda-9-1-9.1.85-1.x86_64 rpm).

vfio_virqfd is just an artifact of OpenNebula's apparent terrible
handling of device assignment. virqfd is there in a RHEL/CentOS 7.4
kernel, but it may not be a separate module, and it's not necessary to
load it via dracut as their guide indicates; you only need to
blacklist nouveau.

> Does anyone have ideas on what could be causing this or what I could
> try next?

I think you're in uncharted territory with NVLink-based GPUs and a
not-quite-standard device assignment setup in your chosen distro. I'd
start by testing whether the test program works with PCIe GPUs to rule
out the interconnect as the issue. Thanks,

Alex