[dm-devel] Serial console is causing system lock-up

John Ogness john.ogness at linutronix.de
Thu Mar 7 10:37:53 UTC 2019


On 2019-03-07, Sergey Senozhatsky <sergey.senozhatsky.work at gmail.com> wrote:
>>>> When the console is constantly printing messages, I wouldn't say
>>>> that looks like a lock-up scenario. It looks like the system is
>>>> busy printing critical information to the console (which it is).
>>>
>>> What if we have N tasks/CPUs calling printk() simultaneously?
>> 
>> Then they take turns printing their messages to the console, spinning
>> until they get their turn. This still is not and does not look like a
>> lock-up. But I think you already know this, so I don't understand the
>> reasoning behind asking the question. Maybe you could clarify what
>> you are getting at.
>
> Sorry John, the reasoning is that I'm trying to understand
> why this does not look like soft or hard lock-up or RCU stall
> scenario.

The reason is that you are seeing data being printed on the console. The
watchdogs (soft, hard, rcu, nmi) are all touched with each emergency
message, so none of them fires as long as messages keep coming out.
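
To illustrate the point about touching the watchdogs, here is a minimal
userspace sketch. It is not kernel code: struct watchdog, watchdog_touch(),
watchdog_expired() and emit_message() are hypothetical names made up for
this example, and time is modelled as a plain tick counter. It only shows
why a detector that is touched with every message never reports a stall
while messages keep being emitted.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical lockup detector: it reports a stall only when nothing
 * has touched it for more than `threshold` ticks.
 */
struct watchdog {
	uint64_t last_touch;	/* tick of the most recent touch */
	uint64_t threshold;	/* allowed silence, in ticks */
};

/* Record that the CPU made progress at tick `now`. */
static inline void watchdog_touch(struct watchdog *w, uint64_t now)
{
	w->last_touch = now;
}

/* True if the detector would report a stall at tick `now`. */
static inline bool watchdog_expired(const struct watchdog *w, uint64_t now)
{
	return now - w->last_touch > w->threshold;
}

/*
 * Stand-in for printing one emergency message: the output costs
 * `cost` ticks, after which the watchdog is touched. Returns the
 * new current tick.
 */
static inline uint64_t emit_message(struct watchdog *w, uint64_t now,
				    uint64_t cost)
{
	now += cost;
	watchdog_touch(w, now);
	return now;
}
```

As long as a single message costs less than the threshold, the detector
stays quiet no matter how many messages are printed back to back. It only
expires once the output stops and nothing else touches it.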

> The CPU which spins on prb_lock() can have preemption disabled and,
> additionally, can have local IRQs disabled, or be under RCU read
> side lock. If consoles are busy, then there are CPUs which printk()
> data and keep prb_lock contended; prb_lock() does not seem to be
> fair. What am I missing?

You are correct. Making prb_lock fair might be something we want to look
into. Perhaps the ordering could also be based on the loglevel of what
needs to be printed. (For example, KERN_ALERT always wins over KERN_CRIT.)
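
For readers wondering what "fair" could mean here, one common technique
is a ticket lock, sketched below in userspace C11. This is only an
illustration under my own assumptions. It is not how prb_lock is
implemented, and struct ticket_lock and its helper names are hypothetical.
It also ignores anything prb_lock needs for nesting or NMI context, and
it orders waiters strictly by arrival, not by loglevel.

```c
#include <stdatomic.h>

/* Hypothetical fair spinlock: waiters are served in arrival order. */
struct ticket_lock {
	atomic_uint next_ticket;	/* next ticket to hand out */
	atomic_uint now_serving;	/* ticket allowed to hold the lock */
};

#define TICKET_LOCK_INIT { 0, 0 }

/* Draw a ticket; the returned value fixes the caller's place in line. */
static inline unsigned int ticket_lock_take(struct ticket_lock *l)
{
	return atomic_fetch_add_explicit(&l->next_ticket, 1,
					 memory_order_relaxed);
}

/* Non-zero once the holder of `ticket` may enter the critical section. */
static inline int ticket_lock_is_my_turn(struct ticket_lock *l,
					 unsigned int ticket)
{
	return atomic_load_explicit(&l->now_serving,
				    memory_order_acquire) == ticket;
}

/* Draw a ticket and spin until it is being served. */
static inline void ticket_lock_acquire(struct ticket_lock *l)
{
	unsigned int ticket = ticket_lock_take(l);

	while (!ticket_lock_is_my_turn(l, ticket))
		;	/* spin */
}

/* Pass the lock on to the next ticket in line. */
static inline void ticket_lock_release(struct ticket_lock *l)
{
	atomic_fetch_add_explicit(&l->now_serving, 1, memory_order_release);
}
```

With a lock like this, a CPU spinning with IRQs disabled waits for at
most the printers that arrived before it, so it cannot be starved by
later arrivals. Letting a higher loglevel jump the queue would need a
different structure, for example one queue per loglevel.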

> You are probably talking about the case when all
> printing CPUs are in preemptible contexts (presumably this is what
> is happening in the dm-integrity case) so they can spin on prb_lock(),
> that's OK. The case I'm talking about is: what if we have the same
> situation, but then one of the CPUs printk()-s from !preemptible?
> Does this make sense?

Yes, you are referring to the worst case. We could have local IRQs
disabled on every CPU while every CPU is hit with an NMI and all those
NMIs want to dump a load of messages. The rest of the system will be
frozen until those NMI printers can finish. But that is still not a
lock-up. At some point those printers should finish and eventually the
system should be able to resume.

John Ogness



