[dm-devel] Serial console is causing system lock-up

John Ogness john.ogness at linutronix.de
Wed Mar 13 08:43:36 UTC 2019


On 2019-03-13, Sergey Senozhatsky <sergey.senozhatsky.work at gmail.com> wrote:
>>> The current printk implementation is handling all console printing
>>> as best effort. Trying hard enough to dramatically affect the
>>> system, but not trying hard enough to guarantee success.
>>
>> I agree that direct output is more reliable. It might be very useful
>> for debugging some types of problems. The question is
>> if it is worth the cost (code complexity, serializing CPUs
>> == slowing down the entire system).
>
> Agreed.
>
> I'm very skeptical about "serializing CPUs" part. It looks like one
> "print or die trying" is replaced with another "print or die trying".
> What happened to log_store() + flush_on_panic()?

We are literally having this discussion in a thread where the current
printk implementation failed to get messages out (lots of dropped
messages) _and_ printk console printing was responsible for _killing_
the machine.

What would my proposal do in _this_ situation:

1. If no emergency console was available or the messages were not
classified as emergency, messages would have been dropped during console
printing and the system would have run unaffected. The number of dropped
messages might not even have been higher than it was here, provided the
scheduler could run the printk kthread effectively.

OR

2. If an emergency console was available and the messages were
classified as emergency, _no_ messages would have been dropped, the system
would have become very slow on the CPUs generating the messages, and
then eventually it would have recovered.
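
As a rough illustration of the two outcomes above, here is a toy Python
model. It is not kernel code and not the proposed implementation:
ToyPrintk, kthread_run, flood, the ring size and the stall counter are
all hypothetical names and numbers chosen for this sketch. It also
assumes that emergency messages bypass the ring buffer entirely, which
is a simplification.

```python
from collections import deque


class ToyPrintk:
    """Toy model of the two cases; every name here is hypothetical."""

    def __init__(self, ring_size, has_emergency_console):
        self.ring = deque()              # stand-in for the printk ring buffer
        self.ring_size = ring_size
        self.has_emergency_console = has_emergency_console
        self.console = []                # messages that reached the console
        self.dropped = 0                 # messages overwritten before printing
        self.caller_stalls = 0           # synchronous writes paid by the caller

    def printk(self, msg, emergency=False):
        if emergency and self.has_emergency_console:
            # Case 2: print directly in the caller's context. Nothing is
            # lost, but the CPU generating the message pays for the write.
            self.console.append(msg)
            self.caller_stalls += 1
            return
        # Case 1: only store the message. If the printer thread lags
        # behind, the oldest unprinted message is overwritten (dropped).
        if len(self.ring) == self.ring_size:
            self.ring.popleft()
            self.dropped += 1
        self.ring.append(msg)

    def kthread_run(self, budget):
        """Printer thread: drain up to `budget` messages when scheduled."""
        for _ in range(min(budget, len(self.ring))):
            self.console.append(self.ring.popleft())


def flood(log, count, emergency):
    """Emit `count` messages while the printer thread never gets to run."""
    for i in range(count):
        log.printk(f"msg {i}", emergency=emergency)
```

In this model, flooding a ToyPrintk that has no emergency console only
costs dropped messages, while flooding one that has an emergency console
drops nothing and charges every write to the caller. That mirrors the
trade-off between cases 1 and 2, under the assumptions stated above.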


I don't understand how you can think "print or die trying" is replaced
with another "print or die trying". But it is probably not constructive
to debate this right now. Petr has laid out a good course that will
allow us to advance in smaller, more conservative, steps.

By the way, Sergey, I appreciate your skepticism.

John Ogness



