[vfio-users] VM doesn't boot, hangs with R9 Fury passthrough
Matti Niemenmaa
matti.niemenmaa+vfio at iki.fi
Sat Sep 5 19:20:31 UTC 2015
On 2015-09-02 20:57, Alex Williamson wrote:
> On Wed, Sep 2, 2015 at 11:22 AM, Matti Niemenmaa <
> matti.niemenmaa+vfio at iki.fi> wrote:
>> DMAR: ERROR: DMA PTE for vPFN 0xfea00 already set (to c7d9c3003 not 383feea00083)
>>
>
> This means that we're trying to map an IOVA, expecting the existing page
> table entry to be zero, but it's not. There is tracing you can enable in
> QEMU, see docs/tracing.txt. I generally use the stderr backend and for
> this, tracing "trace_vfio_listener*" ought to show the mappings. There's
> also tracing on the kernel side that could make sure QEMU and kernel have
> the same mappings.
It's actually specified without the "trace_": "vfio_listener*" works.
I had missed the kernel tracing (the DMA dump following such DMAR errors
— I assume this is what you meant?) earlier, because coupled with the
hundreds of warning messages the log buffer ended up maxing out. I
bumped CONFIG_LOG_BUF_SHIFT way up to 24 now to make sure I don't miss
anything. I also enabled various other debugging options in the kernel
in the hopes of catching something, but to little avail.
I believe that CONFIG_DMA_API_DEBUG caught the following:
DMA-API: exceeded 7 overlapping mappings of cacheline 0x0000000000479000
The call trace points to libata as the culprit. Even though I don't
think it's related to the passthrough issues, I bumped
RADIX_TREE_MAX_TAGS from 3 to 11 (8 wasn't enough, i.e. even 255
overlapping mappings were exceeded) in the kernel source to get rid of
the issue. I hope this doesn't mean I'm papering over something more
fundamental, though.
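The numbers above fit an overlap limit of (1 << RADIX_TREE_MAX_TAGS) - 1: 3 tag bits give the 7 from the warning, and 8 give 255. A minimal sketch of that arithmetic, assuming this relation holds for the kernel in question (the function name is made up for illustration):

```python
def max_overlap(radix_tree_max_tags: int) -> int:
    """Largest overlap count representable with the given number of tag bits.

    Assumption: the limit is (1 << tags) - 1, which matches the 3 -> 7 and
    8 -> 255 figures mentioned in this mail.
    """
    return (1 << radix_tree_max_tags) - 1


if __name__ == "__main__":
    # Original value, the value that was still too small, and the final one.
    for tags in (3, 8, 11):
        print(f"RADIX_TREE_MAX_TAGS={tags}: up to {max_overlap(tags)} overlapping mappings")
```

Under that assumption, the value 11 allows up to 2047 overlapping mappings of one cacheline.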
As for the tracing, it mostly added to my confusion. For example, here's
another DMAR error:
DMAR: ERROR: DMA PTE for vPFN 0xfea00 already set (to fe43e9003 not bea00083)
And here's an arbitrary line (the last one) from the kernel DMA mapping
dump that immediately follows the above:
radeon 0000:01:00.0: page idx 1023 P=968f0000 N=968f0 D=ff7fe000 L=1000 DMA_BIDIRECTIONAL dma map error checked
And here's an arbitrary line from the QEMU trace (minus timestamp):
vfio_listener_region_add_ram region_add [ram] 100000000 - 43fffffff [0x7f5560000000]
E.g. looking at address 0xfea00, I can find it in several of those
region_add ranges, and it's in a live range when the DMAR error is
triggered, which seems expected. But I can't find it in any of the
ranges from the kernel's dump. Am I looking at it wrong? AFAICT the L
value is the length in hex and the others are different kinds of
starting addresses. 0xfea00 does not fall in any range of length L
starting at any of the P, N, or D values. So where exactly is it
"already set"? 0xfe43e9003 doesn't fall in any of those ranges either
(and nor does 0xbea00083 but that seems expected), so I'm having trouble
understanding what I should look for and where.
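The comparison can be made mechanical, as in the sketch below. It rests on assumptions that are not confirmed in this thread: that the vPFN in the DMAR message is a page frame number with 4 KiB pages (so the byte address is the vPFN shifted left by 12), that N is the page frame number of P, that L is hexadecimal like the other fields, and that the end of the QEMU range is inclusive. The helper names are made up for illustration.

```python
PAGE_SHIFT = 12  # assumption: 4 KiB pages


def pfn_to_addr(pfn: int) -> int:
    """Convert a page frame number to the byte address of that page."""
    return pfn << PAGE_SHIFT


def in_range(addr: int, start: int, length: int) -> bool:
    """True if addr lies in the half-open range [start, start + length)."""
    return start <= addr < start + length


# Values copied from the messages quoted in this mail.
VPFN = 0xfea00
QEMU_FIRST, QEMU_LAST = 0x100000000, 0x43fffffff  # region_add range, end assumed inclusive
P, N, D, L = 0x968f0000, 0x968f0, 0xff7fe000, 0x1000  # L read as hex (assumption)

iova = pfn_to_addr(VPFN)

if __name__ == "__main__":
    print("address for the vPFN:", hex(iova))
    print("N is P >> 12:", N == P >> PAGE_SHIFT)
    print("inside the quoted QEMU range:", QEMU_FIRST <= iova <= QEMU_LAST)
    print("inside the quoted kernel entry (by D):", in_range(iova, D, L))
    print("inside the quoted kernel entry (by P):", in_range(iova, P, L))
```

If the page-frame reading is right, the address to search for in the traces is 0xfea00000, not 0xfea00, and each trace has to be scanned in full, since only one line from each is quoted here.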
Miscellaneous notes and findings, in no particular order:
* Whenever a DMAR error occurs, the values always seem to end in 3. To
me, odd numbers like that seem strange for page table entries.
* Contrary to what I previously thought, my messing around with the Windows boot
recovery-related settings doesn't seem to affect whether the VM gets as
far as the Windows 10 logo (and associated spinner). What matters is
that I boot once with "-vga std" all the way to the desktop and then do
a proper shutdown; if I don't do that first, the VM hangs before the
Windows logo shows up. Perhaps this means that the boot recovery
settings screen is somehow problematic with passthrough.
* I disabled "above 4G decoding" in the host motherboard's UEFI settings
to see if it changes anything. The only difference I've noticed is that
the DMAR errors now always have a 32-bit value after the "not".
* There are lots of "SKIPPING" messages in the QEMU trace. It doesn't
seem like they're intrinsically problematic, though.
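On the values ending in 3: one possible reading is that the low 12 bits of each value are flag bits rather than address bits, which would make odd values normal. The sketch below splits the four values quoted in this mail under that reading. The bit meanings are an assumption taken from the Linux intel-iommu definitions as I understand them (bit 0 read, bit 1 write, bit 7 large page), and the helper name is hypothetical.

```python
# Assumed flag bits (mirroring DMA_PTE_READ, DMA_PTE_WRITE and
# DMA_PTE_LARGE_PAGE from the Linux intel-iommu headers).
FLAGS = (
    ("read", 1 << 0),
    ("write", 1 << 1),
    ("large_page", 1 << 7),
)
LOW_BITS = 0xfff  # assumption: entries are at least 4 KiB aligned


def decode_pte(value: int):
    """Split a raw entry into (address part, list of assumed flag names)."""
    names = [name for name, bit in FLAGS if value & bit]
    return value & ~LOW_BITS, names


if __name__ == "__main__":
    # The "already set to" and "not" values from both DMAR errors in this thread.
    for raw in (0xc7d9c3003, 0x383feea00083, 0xfe43e9003, 0xbea00083):
        addr, names = decode_pte(raw)
        print(f"{raw:#x}: address {addr:#x}, flags {'+'.join(names)}")
```

Read this way, both existing entries carry read+write only, while both entries being written also have bit 7 set.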