[rhelv6-list] A kernel bug that causes a system crash when the uptime is longer than 208.5 days

Tue Jan 24 17:54:13 UTC 2012

I know I can change the clock source by writing a value into /sys/devices/system/clocksource/clocksource0/current_clocksource, but I don't believe I can apply "notsc" online (I don't see it as an option). I guess grub is my only way to go.
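
For reference, a minimal sketch of that sysfs check in Python. The function names are made up for illustration, and it assumes the usual current_clocksource and available_clocksource files under clocksource0. "notsc" is a kernel boot parameter rather than a clock source name, so it would not be expected to show up in the available list.

```python
from pathlib import Path

# Usual sysfs location on Linux; kept as a parameter so the sketch can be
# pointed at another directory (an assumption made for illustration).
CLOCKSOURCE_DIR = Path("/sys/devices/system/clocksource/clocksource0")


def read_clocksources(base=CLOCKSOURCE_DIR):
    """Return (current, available) clock source names read from sysfs."""
    base = Path(base)
    current = (base / "current_clocksource").read_text().strip()
    available = (base / "available_clocksource").read_text().split()
    return current, available


def set_clocksource(name, base=CLOCKSOURCE_DIR):
    """Switch the clock source at runtime (needs root on a real system).

    Refuses any name the kernel does not list as available, which is
    the case for "notsc" because it is not a clock source.
    """
    _, available = read_clocksources(base)
    if name not in available:
        raise ValueError(f"{name!r} is not one of {available}")
    (Path(base) / "current_clocksource").write_text(name + "\n")


if __name__ == "__main__" and CLOCKSOURCE_DIR.exists():
    current, available = read_clocksources()
    print("current:", current)
    print("available:", " ".join(available))
```

Switching to another listed source (for example hpet, if present) only changes the timekeeping clock source; whether that sidesteps the sched_clock() overflow is not something this sketch establishes.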

-----Original Message-----
From: rhelv6-list-bounces at redhat.com [mailto:rhelv6-list-bounces at redhat.com] On Behalf Of Musayev, Ilya
Sent: Tuesday, January 24, 2012 12:41 PM
To: Red Hat Enterprise Linux 6 (Santiago) discussion mailing-list
Subject: Re: [rhelv6-list] A kernel bug that causes a system crash when the uptime is longer than 208.5 days
Importance: High

Akemi,

Which kernels are affected?

I'm about to roll out the latest 6.2 kernel at scale and am curious whether I need to wait until this bug is resolved. I also see that for non-VMware servers I can use "notsc"; can this be done online?

-----Original Message-----
From: rhelv6-list-bounces at redhat.com [mailto:rhelv6-list-bounces at redhat.com] On Behalf Of Akemi Yagi
Sent: Tuesday, January 24, 2012 4:03 AM
To: Red Hat Enterprise Linux 6 (Santiago) discussion mailing-list
Subject: Re: [rhelv6-list] A kernel bug that causes a system crash when the uptime is longer than 208.5 days

On Fri, Jan 6, 2012 at 8:55 AM, Akemi Yagi <amyagi at gmail.com> wrote:
> On Fri, Jan 6, 2012 at 8:55 AM, Robin Price II <rprice at redhat.com> wrote:
>> Bugzilla:  https://bugzilla.redhat.com/show_bug.cgi?id=765720
>>
>> This is private due to private information from customer use cases. 
>> If you need further details, I would highly encourage you to contact 
>> Red Hat support or your TAM.
>>
>> Here is the initial information opened in the BZ:
>>
>> "The following patch is in urgent fix for Linus branch, which avoid
>> the unnecessary overflow in sched_clock otherwise kernel will crash
>> after 209~250 days.
>>
>> http://git.kernel.org/?p=linux/kernel/git/tip/tip.git;a=patch;h=4cecf6d401a01d054afc1e5f605bcbfe553cb9b9
>>
>> In hundreds of days, the __cycles_2_ns calculation in sched_clock has 
>> an overflow.  cyc * per_cpu(cyc2ns, cpu) exceeds 64 bits, causing the 
>> final value to become zero.  We can solve this without losing any precision.
>> We can decompose TSC into quotient and remainder of division by the 
>> scale factor, and then use this to convert TSC into nanoseconds."
>>
>> ~rp
>
> Thank you for this post to let us know that Red Hat is now taking care 
> of this issue.
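
The arithmetic in the quoted Bugzilla text can be sketched in Python with explicit 64-bit masking. This is an illustration under stated assumptions, not the kernel code: the shift of 10 bits, the 2 GHz TSC rate, and all function names are assumptions chosen for the example.

```python
MASK64 = (1 << 64) - 1   # emulate unsigned 64-bit wraparound
SCALE_SHIFT = 10         # assumed fixed-point shift for the cyc2ns factor


def cyc2ns_scale(cpu_khz):
    """Fixed-point nanoseconds-per-cycle factor, scaled by 2**SCALE_SHIFT."""
    return (1_000_000 << SCALE_SHIFT) // cpu_khz


def cycles_to_ns_naive(cyc, scale):
    """Single multiply: cyc * scale wraps once it reaches 2**64."""
    return ((cyc * scale) & MASK64) >> SCALE_SHIFT


def cycles_to_ns_split(cyc, scale):
    """Split cyc into quotient and remainder by 2**SCALE_SHIFT first,
    so no intermediate product needs more than 64 bits here."""
    quot = cyc >> SCALE_SHIFT
    rem = cyc & ((1 << SCALE_SHIFT) - 1)
    return (quot * scale + ((rem * scale) >> SCALE_SHIFT)) & MASK64


def days_until_overflow(cpu_khz):
    """Uptime in days at which the naive product first reaches 2**64."""
    scale = cyc2ns_scale(cpu_khz)
    first_bad_cyc = -(-(1 << 64) // scale)      # ceiling division
    return first_bad_cyc / (cpu_khz * 1000) / 86400


if __name__ == "__main__":
    khz = 2_000_000                              # assumed 2 GHz TSC
    scale = cyc2ns_scale(khz)
    cyc = 1 << 55                                # cycle count at the wrap point
    print("days until overflow:", days_until_overflow(khz))
    print("naive ns:", cycles_to_ns_naive(cyc, scale))
    print("split ns:", cycles_to_ns_split(cyc, scale))
```

Under these assumptions the naive product wraps to zero at 2^54 ns of uptime, which is about 208.5 days, while the split form still returns 2^54 ns with no loss of precision.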

Just a note to add that there is a KB article for this issue:

https://access.redhat.com/kb/docs/DOC-69254
"sched_clock() overflow after 208.5 days in Linux Kernel"

Akemi

_______________________________________________
rhelv6-list mailing list
rhelv6-list at redhat.com
https://www.redhat.com/mailman/listinfo/rhelv6-list
