[dm-devel] AMD-Vi IO_PAGE_FAULTs and ata3.00: failed command: READ FPDMA QUEUED errors since Linux 4.0

Andreas Hartmann andihartmann at freenet.de
Tue Jul 28 20:08:42 UTC 2015


On 07/28/2015 at 21:31 PM, Mike Snitzer wrote:
> On Tue, Jul 28 2015 at  3:23pm -0400,
> Andreas Hartmann <andihartmann at freenet.de> wrote:
>
>> On 07/28/2015 at 08:58 PM, Mike Snitzer wrote:
>>> On Tue, Jul 28 2015 at  2:20pm -0400,
>>> Andreas Hartmann <andihartmann at freenet.de> wrote:
>>>
>>>> On 07/28/2015 at 07:50 PM, Mike Snitzer wrote:
>>>> [..]
>>>>> Are your SATA devices using NCQ?
>>>>
>>>> Yes. It's enabled:
>>>>
>>>> dmesg| grep -i ncq
>>>> ahci 0000:00:11.0: flags: 64bit ncq sntf ilck pm led clo pmp pio slum part
>>>> ata2.00: 5860533168 sectors, multi 0: LBA48 NCQ (depth 31/32), AA
>>>> ata3.00: 5860533168 sectors, multi 0: LBA48 NCQ (depth 31/32), AA
>>>> ata1.00: 468862128 sectors, multi 16: LBA48 NCQ (depth 31/32), AA
>>>>
>>>> As the errors already come up on boot (during mount of partitions or
>>>> even before the password for the disk has been provided): How can I
>>>> disable NCQ during boot of the kernel? Is there a kernel option?
>>>
>>> See:
>>> https://ata.wiki.kernel.org/index.php/Libata_FAQ#Enabling.2C_disabling_and_checking_NCQ
>>>
>>> alternatively, and likely easier, set this on the kernel commandline:
>>>   libata.force=noncq
>>
>> ata2.00: FORCE: horkage modified (noncq)
>> ata2.00: 5860533168 sectors, multi 0: LBA48 NCQ (not used)
>> ata3.00: FORCE: horkage modified (noncq)
>> ata3.00: 5860533168 sectors, multi 0: LBA48 NCQ (not used)
>> ata5.00: FORCE: horkage modified (noncq)
>> ata1.00: FORCE: horkage modified (noncq)
>> ata1.00: 468862128 sectors, multi 16: LBA48 NCQ (not used)
>>
>>
>> Perfect. Seems to work w/ 3.19.8 and your mentioned patches. But now,
>> I'm getting another error, which I didn't see before w/ 3.x-kernels:
>>
>> [drm:btc_dpm_set_power_state [radeon]] *ERROR*
>> rv770_restrict_performance_levels_before_switch failed
>>
>> It seems that your patches do have some unwanted side effects :-).
>
> That is a completely different issue.  drm and radeon is a graphics
> issue.

Nothing changed in the radeon code. I just applied your patches, nothing
more. Why should radeon suddenly be broken if I apply your patches to
stable 3.19.8 code?

These patches trigger tons of AMD-Vi IO_PAGE_FAULTs w/ NCQ enabled, and
the IOMMU developers say that it is not a problem in the IOMMU code.

>> Could you please reexamine your patch "dm crypt: don't allocate
>> pages for a partial request" - after applying this patch all the
>> problems are coming up here.
>
> More likely than not your hardware isn't very good.

Maybe - maybe not. The only thing I know for sure is: with these
patches applied, the machine doesn't work reliably any more. W/ NCQ
disabled, the AMD-Vi IO_PAGE_FAULTs are gone, but a radeon error never
seen before showed up instead. Most probably that is chance; any other
error could just as well have come up.

I am willing to run tests if you have any ideas to test - I can
reproduce it quite easily.
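
To compare test runs, something like the following could summarize a
saved kernel log: which ATA devices have NCQ in use, and how many lines
mention IO_PAGE_FAULT. This is only a rough sketch. The function name
summarize() is made up for illustration, the regex is modelled on the
"LBA48 NCQ (...)" lines quoted above, and it assumes that every fault
line contains the literal string IO_PAGE_FAULT.

```python
import re

# Modelled on the libata probe lines quoted in this thread, e.g.
#   ata3.00: 5860533168 sectors, multi 0: LBA48 NCQ (depth 31/32), AA
#   ata3.00: 5860533168 sectors, multi 0: LBA48 NCQ (not used)
NCQ_LINE = re.compile(r"(ata\d+\.\d+): \d+ sectors, .*?NCQ \(([^)]*)\)")


def summarize(log_text):
    """Return (NCQ state per ATA device, number of IO_PAGE_FAULT lines)."""
    ncq = {}
    faults = 0
    for line in log_text.splitlines():
        match = NCQ_LINE.search(line)
        if match:
            device, state = match.groups()
            # "not used" is what libata prints once noncq is forced.
            ncq[device] = "off" if state == "not used" else state
        # Assumption: each fault is logged on a line containing this string.
        if "IO_PAGE_FAULT" in line:
            faults += 1
    return ncq, faults
```

Feeding it the text of a saved dmesg (for example via
summarize(open("dmesg.txt").read()), where dmesg.txt is a hypothetical
file name) from a boot with and without libata.force=noncq would make
the two cases easy to compare.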


Thanks,
regards,
Andreas



