[vfio-users] Soft lockup on archlinux 4.1.10-1-vfio-lts kernel

Okky Hendriansyah okky at nostratech.com
Sat Oct 24 01:28:14 UTC 2015


What kind of lockup do you mean? I'm on an ASRock Z87 Extreme6 board and I was using your linux-lts-vfio, never had any lockups. I did install intel microcode though.

But then I tried the ABS approach with linux-lts, applying the patches from your PKGBUILD. Currently I'm using linux-lts with the i915 and ACS patches, compiled using ABS, and the host system is quite stable.

I noticed there are some differing lines between your linux config and the one from linux-lts. Have you tried using the config from the official linux-lts?
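
In case it helps with comparing the two configs, here is a rough sketch of how the differing options could be listed. It is only an illustration: the function names are my own, and the two file paths passed on the command line are placeholders for wherever the vfio config and the official linux-lts config happen to live. It assumes the usual kernel config syntax (CONFIG_X=value, or "# CONFIG_X is not set").

```python
import sys


def parse_kernel_config(text):
    """Map option name -> value; '# CONFIG_X is not set' is recorded as 'n'."""
    options = {}
    suffix = " is not set"
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("# CONFIG_") and line.endswith(suffix):
            # Strip the leading "# " and the trailing " is not set".
            options[line[2:-len(suffix)]] = "n"
        elif line.startswith("CONFIG_") and "=" in line:
            name, value = line.split("=", 1)
            options[name] = value
    return options


def diff_kernel_configs(a_text, b_text):
    """Return {option: (value_in_a, value_in_b)} for options that differ."""
    a = parse_kernel_config(a_text)
    b = parse_kernel_config(b_text)
    changed = {}
    for name in sorted(set(a) | set(b)):
        left = a.get(name, "<absent>")
        right = b.get(name, "<absent>")
        if left != right:
            changed[name] = (left, right)
    return changed


if __name__ == "__main__":
    # Usage (paths are placeholders): python diffconfig.py config.vfio config.lts
    with open(sys.argv[1]) as f1, open(sys.argv[2]) as f2:
        differences = diff_kernel_configs(f1.read(), f2.read())
    for name, (left, right) in differences.items():
        print(f"{name}: {left} -> {right}")
```

Anything it prints is just a candidate to look at; it cannot tell which difference, if any, is related to the lockups.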

Best regards,
Okky Hendriansyah

> On Oct 24, 2015, at 07:50, Dan Ziemba <zman0900 at gmail.com> wrote:
> 
> I just released the 4.1.11 PKGBUILD.  So far so good for me, but it's
> only been running for a few hours - not really long enough to tell.  
> 
> I do have ASRock too, but it is on nearly the latest UEFI firmware.
> There is one newer version, but it says the only change is the servers
> used for online update.
> 
> I never got around to setting up the intel microcode updates, so that
> should probably be my next step.
> 
> Dan
> 
> -----Original Message-----
> From: Mark Weiman <mark.weiman at markzz.com>
> To: vfio-users at redhat.com
> Subject: Re: [vfio-users] Soft lockup on archlinux 4.1.10-1-vfio-lts
> kernel
> Date: Fri, 23 Oct 2015 18:56:39 -0400
> 
> To be honest, ASRock BIOS upgrades are fairly painless because they can
> be done outside of the operating system, so no need to get an image of
> FreeDOS ready.  If you do not want to get that though, I do still
> recommend the intel-ucode package if you don't already.  As of right
> now, I have no issues running my repository's 4.1.11-1 package.
> 
> Mark Weiman
> 
>> On Fri, 2015-10-23 at 16:51 -0200, Lucas Kückelhaus wrote:
>> One thing I noticed is that we all seem to have ASRock motherboards,
>> as Mark mentioned. I am hesitant to perform a BIOS upgrade, however.
>> VT-d is finicky enough as is. I can try 4.1.11 later tonight and see
>> if it helps.
>> 
>> Regards,
>> Lucas Kückelhaus
>> 
>>> On 2015-10-23 15:54, Dan Ziemba wrote:
>>> Well, old systemd and dbus didn't help. The system was locked up again
>>> this morning. I left the screen on tailing dmesg, but there was no
>>> interesting output. I've got a PKGBUILD for 4.1.11 coming later
>>> today, so maybe that will help.
>>> 
>>> Dan
>>>> On Oct 22, 2015 10:53 PM, "Dan Ziemba" <zman0900 at gmail.com> wrote:
>>>> 
>>>> Hey,
>>>> 
>>>> I maintain that PKGBUILD. I think I've been having the same problem,
>>>> but it seems to also happen if I reinstall the older linux-vfio 4.1.6.
>>>> Here's the latest stack trace I was able to capture:
>>>> https://i.imgur.com/FZkj4ib.jpg [1]
>>>> I had to disable the screen timeout so it would stay on all night
>>>> with dmesg tailing and I found it like this in the morning. Mouse
>>>> and caps lock still worked, but I couldn't actually do anything and
>>>> the clock was frozen.
>>>> 
>>>> I was also noticing that booting my system was unreliable. If I
>>>> rebooted several times in a row, once every two to three times it
>>>> would hang while starting various services and then never start gdm.
>>>> 
>>>> Today I tried downgrading systemd and dbus to just before the change
>>>> that switched to user buses (see here:
>>>> https://www.archlinux.org/news/d-bus-now-launches-user-buses/ ). I
>>>> rebooted a whole bunch of times using 4.1.10 linux-vfio-lts and it
>>>> seems reliable. I have been using the computer pretty much all day
>>>> for work and it hasn't had any soft lockups yet, but it may be too
>>>> soon to tell. Most of the time in the past the lockup would happen
>>>> while idle.
>>>> 
>>>> These are the downgrades I made; everything else is up to date as of
>>>> this morning.
>>>> 
>>>> [2015-10-22 12:22] [ALPM] transaction started
>>>> [2015-10-22 12:22] [ALPM] downgraded libsystemd (227-1 -> 225-1)
>>>> [2015-10-22 12:22] [ALPM] downgraded libdbus (1.10.0-4 -> 1.10.0-2)
>>>> [2015-10-22 12:22] [ALPM] downgraded dbus (1.10.0-4 -> 1.10.0-2)
>>>> [2015-10-22 12:22] [ALPM] downgraded systemd (227-1 -> 225-1)
>>>> [2015-10-22 12:22] [ALPM] downgraded lib32-systemd (227-1 -> 225-1)
>>>> [2015-10-22 12:22] [ALPM] downgraded systemd-sysvcompat (227-1 -> 225-1)
>>>> [2015-10-22 12:22] [ALPM] transaction completed
>>>> 
>>>> I will follow up tomorrow with whether or not it locks up tonight. If
>>>> we can isolate the problem to systemd or dbus, maybe that's at least
>>>> good enough for a bug report.
>>>> 
>>>> Dan
>>>> 
>>>> -----Original Message-----
>>>> From: Lucas Kückelhaus <lucas at kuckelhaus.com>
>>>> To: vfio-users at redhat.com
>>>> Subject: [vfio-users] Soft lockup on archlinux 4.1.10-1-vfio-lts
>>>> kernel
>>>> Date: Thu, 22 Oct 2015 23:00:37 -0200
>>>> Mailer: Roundcube Webmail/1.0.2
>>>> 
>>>> Hi,
>>>> 
>>>> I'm trying to run an Archlinux host on kernel 4.1.10-1-vfio-lts (Mark
>>>> Weiman's custom repo) because I'm unable to boot a GPU-assigned VM on
>>>> 4.2.3-1-vfio.
>>>> 
>>>> The VM boots fine and works for a while, but the computer sporadically
>>>> crashes with the following:
>>>> 
>>>> Oct 22 21:43:37 kvmhost kernel: NMI watchdog: BUG: soft lockup - CPU#4 stuck for 22s! [swapper/4:0]
>>>> Oct 22 21:43:39 kvmhost kernel: Modules linked in: veth vhost_net vhost macvtap macvlan tun bridge stp llc nls_iso8859_1 nls_cp437 vfat fat iTCO_wdt iTCO_vendor_support nouveau snd_hda_codec_hdmi intel_rapl iosf_mbi x86_pkg_temp_thermal intel_powerclamp coretemp mxm_wmi snd_hda_
>>>> Oct 22 21:43:39 kvmhost kernel: sch_fq_codel fuse nfsd nfs auth_rpcgss oid_registry nfs_acl lockd grace sunrpc fscache ip_tables x_tables ext4 crc16 mbcache jbd2 dm_mod hid_logitech_hidpp hid_logitech_dj hid_generic usbhid hid sd_mod uas usb_storage atkbd libps2 crc32c_intel ah
>>>> Oct 22 21:43:39 kvmhost kernel: CPU: 4 PID: 0 Comm: swapper/4 Tainted: G L 4.1.10-1-vfio-lts #1
>>>> Oct 22 21:43:39 kvmhost kernel: Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./Z77 Extreme4, BIOS P2.30 09/21/2012
>>>> Oct 22 21:43:39 kvmhost kernel: task: ffff88080b119460 ti: ffff88080b124000 task.ti: ffff88080b124000
>>>> Oct 22 21:43:39 kvmhost kernel: RIP: 0010:[<ffffffff810f6770>] [<ffffffff810f6770>] try_to_del_timer_sync+0x0/0xa0
>>>> Oct 22 21:43:39 kvmhost kernel: RSP: 0018:ffff88082f303db0 EFLAGS: 00000286
>>>> Oct 22 21:43:39 kvmhost kernel: RAX: 00000000ffffffff RBX: 0000000000000286 RCX: 0000000000000000
>>>> Oct 22 21:43:39 kvmhost kernel: RDX: 00000000000000bf RSI: 0000000000000286 RDI: ffff880270fa8428
>>>> Oct 22 21:43:39 kvmhost kernel: RBP: ffff88082f303dc8 R08: 0000000000002710 R09: ffff88082f30e780
>>>> Oct 22 21:43:39 kvmhost kernel: R10: 0000000000000000 R11: 0000000000000004 R12: ffff88082f303d28
>>>> Oct 22 21:43:39 kvmhost kernel: R13: ffffffff815f13de R14: ffff88082f303dc8 R15: ffff880270fa8428
>>>> Oct 22 21:43:39 kvmhost kernel: FS: 0000000000000000(0000) GS:ffff88082f300000(0000) knlGS:0000000000000000
>>>> Oct 22 21:43:39 kvmhost kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>>>> Oct 22 21:43:39 kvmhost kernel: CR2: 00007fc2d6f6da28 CR3: 000000029c65c000 CR4: 00000000001426e0
>>>> Oct 22 21:43:39 kvmhost kernel: Stack:
>>>> Oct 22 21:43:39 kvmhost kernel: ffffffff810f6872 ffff88082f303e38 ffff880270fa8390 ffff88082f303df8
>>>> Oct 22 21:43:39 kvmhost kernel: ffffffff8152a16f ffff880270fa8390 ffff8805b3bab800 ffff880270d20000
>>>> Oct 22 21:43:39 kvmhost kernel: 0000000000000001 ffff88082f303e38 ffffffff8152a3e7 ffff88082f3107e0
>>>> Oct 22 21:43:39 kvmhost kernel: Call Trace:
>>>> Oct 22 21:43:39 kvmhost kernel: <IRQ>
>>>> Oct 22 21:43:39 kvmhost kernel: [<ffffffff810f6872>] ? del_timer_sync+0x62/0x70
>>>> Oct 22 21:43:39 kvmhost kernel: [<ffffffff8152a16f>] inet_csk_reqsk_queue_drop+0xbf/0x240
>>>> Oct 22 21:43:39 kvmhost kernel: [<ffffffff8152a3e7>] reqsk_timer_handler+0xf7/0x2e0
>>>> Oct 22 21:43:39 kvmhost kernel: [<ffffffff8152a2f0>] ? inet_csk_reqsk_queue_drop+0x240/0x240
>>>> Oct 22 21:43:39 kvmhost kernel: [<ffffffff810f64c8>] call_timer_fn+0x48/0x160
>>>> Oct 22 21:43:39 kvmhost kernel: [<ffffffff8152a2f0>] ? inet_csk_reqsk_queue_drop+0x240/0x240
>>>> Oct 22 21:43:39 kvmhost kernel: [<ffffffff810f6bd4>] run_timer_softirq+0x284/0x330
>>>> Oct 22 21:43:39 kvmhost kernel: [<ffffffff81086711>] __do_softirq+0xf1/0x2e0
>>>> Oct 22 21:43:39 kvmhost kernel: [<ffffffff81086acd>] irq_exit+0xbd/0xc0
>>>> Oct 22 21:43:39 kvmhost kernel: [<ffffffff815f31d5>] smp_apic_timer_interrupt+0x55/0x70
>>>> Oct 22 21:43:39 kvmhost kernel: [<ffffffff815f13de>] apic_timer_interrupt+0x6e/0x80
>>>> Oct 22 21:43:39 kvmhost kernel: <EOI>
>>>> Oct 22 21:43:39 kvmhost kernel: [<ffffffff81021c1d>] ? native_sched_clock+0x2d/0xa0
>>>> Oct 22 21:43:39 kvmhost kernel: [<ffffffff81490c81>] ? cpuidle_enter_state+0xa1/0x250
>>>> Oct 22 21:43:39 kvmhost kernel: [<ffffffff81490c53>] ? cpuidle_enter_state+0x73/0x250
>>>> Oct 22 21:43:39 kvmhost kernel: [<ffffffff81490e8a>] cpuidle_enter+0x2a/0x30
>>>> Oct 22 21:43:39 kvmhost kernel: [<ffffffff810cb36c>] cpu_startup_entry+0x32c/0x460
>>>> Oct 22 21:43:39 kvmhost kernel: [<ffffffff81055f7e>] start_secondary+0x19e/0x1e0
>>>> Oct 22 21:43:39 kvmhost kernel: Code: 4d d8 65 48 33 0c 25 28 00 00 00 44 89 e0 75 0b 48 83 c4 18 5b 41 5c 41 5d 5d c3 e8 1b b8 f8 ff 90 66 2e 0f 1f 84 00 00 00 00 00 <0f> 1f 44 00 00 55 48 89 e5 41 54 53 48 81 ec 30 10 00 00 48 83
>>>> 
>>>> This happens for all cores and it locks up the entire system. I don't
>>>> know what to do. On 4.2.3-1-vfio I have no hangups and all my non-vfio
>>>> VMs work perfectly fine.
>>>> 
>>>> Thank you,
>>>> Lucas Kückelhaus
>>>> 
>>>> _______________________________________________
>>>> vfio-users mailing list
>>>> vfio-users at redhat.com
>>>> https://www.redhat.com/mailman/listinfo/vfio-users [2]
>>> 
>>> 
>>> Links:
>>> ------
>>> [1] https://i.imgur.co
>>> [2] https://www.redhat.com/mailman/listinfo/vfio-users
>> 
