[Crash-utility] [PATCH] overflow calculating time in timer output after long uptime

Seymour, Shane M shane.seymour at hpe.com
Mon Feb 6 07:58:48 UTC 2017


Hi Dave,

I've been staring at an issue in crash all afternoon trying to work out what happened to the current clock value printed by timer -r. I've got output from the timer -r command that looks like (the vmcore has 315+ days of uptime) this:

CPU: 0  HRTIMER_CPU_BASE: ffff883f7fa0d7a0
  CLOCK: 0  HRTIMER_CLOCK_BASE: ffff883f7fa0d7e0  [ktime_get]
       CURRENT
   9017759896290448
     SOFTEXPIRES          EXPIRES           HRTIMER           FUNCTION
  27464498630000000  27464498630000000  ffff883f7fa0da40  ffffffff810d0440  <tick_sched_timer>
  27464498643441992  27464498643491992  ffff880243f5bd60  ffffffff8109ae70  <hrtimer_wakeup>
  27464498690255938  27464498690755937  ffff8842d7c1b908  ffffffff8109ae70  <hrtimer_wakeup>
...

The time under the CURRENT heading is way less than the softexpires/expires which really didn't look correct so I dug into the crash code and found that the calculation in dump_hrtimer_clock_base() can overflow for a largish value of current_time (I did the calculations involved in working out what current_time was by hand to make sure that they would overflow and they did, but I haven't included those workings). The value above works out to be the amount over the maximum value of an unsigned 64bit in that would have been calculated (i.e. it wrapped).

When I changed it from:

        now = current_time * 1000000000LL / machdep->hz + offset;

to 

        now = current_time * (1000000000LL / machdep->hz) + offset;

It works as I expected it to. I've tested with that change and the fixed output is:

CPU: 0  HRTIMER_CPU_BASE: ffff883f7fa0d7a0
  CLOCK: 0  HRTIMER_CLOCK_BASE: ffff883f7fa0d7e0  [ktime_get]
       CURRENT
  27464503970000000
     SOFTEXPIRES          EXPIRES           HRTIMER           FUNCTION
  27464498630000000  27464498630000000  ffff883f7fa0da40  ffffffff810d0440  <tick_sched_timer>
  27464498643441992  27464498643491992  ffff880243f5bd60  ffffffff8109ae70  <hrtimer_wakeup>
  27464498690255938  27464498690755937  ffff8842d7c1b908  ffffffff8109ae70  <hrtimer_wakeup>

There's a second one in dump_hrtimer_base():

        now = current_time * 1000000000LL / machdep->hz;

that may have similar issues but it wasn't impacting on the vmcore I was looking at but I changed it anyway.

Thanks
Shane

P.S. cut and pasted the diff output not sure if it's going to work correctly or not because of that.
---
--- a/kernel.c  2016-11-30 13:56:29.000000000 -0500
+++ b/kernel.c  2017-02-06 02:23:35.179164828 -0500
@@ -7263,7 +7263,7 @@ dump_hrtimer_clock_base(const void *hrti
        offset = 0;
        if (VALID_MEMBER(hrtimer_clock_base_offset))
                offset = ktime_to_ns(base + OFFSET(hrtimer_clock_base_offset));
-       now = current_time * 1000000000LL / machdep->hz + offset;
+       now = current_time * (1000000000LL / machdep->hz) + offset;

        dump_active_timers(base, now);
 }
@@ -7285,7 +7285,7 @@ dump_hrtimer_base(const void *hrtimer_ba

        /* get current time(uptime) */
        get_uptime(NULL, &current_time);
-       now = current_time * 1000000000LL / machdep->hz;
+       now = current_time * (1000000000LL / machdep->hz);

        dump_active_timers(base, now);
 }




More information about the Crash-utility mailing list