[Crash-utility] [PATCH] diskdump: Optimize the boot time
Huang Shijie
shijie at os.amperecomputing.com
Wed Mar 30 18:31:21 UTC 2022
Hi Kazu,
On Wed, Mar 30, 2022 at 09:27:18AM +0000, HAGIO KAZUHITO(萩尾 一仁) wrote:
> -----Original Message-----
> > Hi Kazu,
> > On Wed, Mar 30, 2022 at 08:28:19AM +0000, HAGIO KAZUHITO(萩尾 一仁) wrote:
> > > -----Original Message-----
> > > > 1.) The vmcore file may be very big.
> > > >
> > > > For example, I have a vmcore file which is over 23G,
> > > > and the panic kernel had 767.6G memory,
> > > > its max_sect_len is 4468736.
> > > >
> > > > Current code costs too much time to do the following loop:
> > > > ..............................................
> > > > for (i = 1; i < max_sect_len + 1; i++) {
> > > >         dd->valid_pages[i] = dd->valid_pages[i - 1];
> > > >         for (j = 0; j < BITMAP_SECT_LEN; j++, pfn++)
> > > >                 if (page_is_dumpable(pfn))
> > > >                         dd->valid_pages[i]++;
> > > > ..............................................
> > > >
> > > > For my case, it costs about 56 seconds to finish the
> > > > big loop.
> > > >
> > > > This patch moves the hweightXX macros to defs.h,
> > > > and uses hweight64 to optimize the loop.
> > > >
> > > > For my vmcore, the loop only costs about one second now.
> > > >
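For reference, a minimal sketch of the idea in point 1, not the patch itself. It assumes the dumpable bitmap stores one bit per page, least significant bit first within each byte, and that a section covers 4096 pages. The names popcount64, count_per_bit, count_per_word and SECT_BITS are made up for this illustration; popcount64 stands in for hweight64.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SECT_BITS 4096  /* pages per section (assumed to match BITMAP_SECT_LEN) */

/* Portable population count; a stand-in for hweight64. */
unsigned int popcount64(uint64_t w)
{
        w = w - ((w >> 1) & 0x5555555555555555ULL);
        w = (w & 0x3333333333333333ULL) + ((w >> 2) & 0x3333333333333333ULL);
        w = (w + (w >> 4)) & 0x0f0f0f0f0f0f0f0fULL;
        return (unsigned int)((w * 0x0101010101010101ULL) >> 56);
}

/* Reference version: test one bit per page, like the loop quoted above. */
void count_per_bit(const unsigned char *bitmap, size_t nsect,
                   unsigned long *valid)
{
        size_t i, j, pfn = 0;

        assert(bitmap != NULL && valid != NULL);
        valid[0] = 0;
        for (i = 1; i < nsect + 1; i++) {
                valid[i] = valid[i - 1];
                for (j = 0; j < SECT_BITS; j++, pfn++)
                        if (bitmap[pfn >> 3] & (1u << (pfn & 7)))
                                valid[i]++;
        }
}

/* Same running totals, but 64 pages are counted per step. */
void count_per_word(const unsigned char *bitmap, size_t nsect,
                    unsigned long *valid)
{
        const unsigned char *p = bitmap;
        size_t i, j;

        assert(bitmap != NULL && valid != NULL);
        valid[0] = 0;
        for (i = 1; i < nsect + 1; i++) {
                valid[i] = valid[i - 1];
                for (j = 0; j < SECT_BITS / 64; j++, p += 8) {
                        uint64_t w;

                        /* memcpy avoids an unaligned 64-bit load */
                        memcpy(&w, p, sizeof(w));
                        valid[i] += popcount64(w);
                }
        }
}
```

Both functions fill valid[0..nsect] with the same running totals. The second one does SECT_BITS / 64 steps per section instead of SECT_BITS, and the population count does not depend on byte order, so the memcpy load is safe on either endianness.
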
> > > > 2.) Test results:
> > > > # cat ./commands.txt
> > > > quit
> > > >
> > > > Before:
> > > >
> > > > #echo 3 > /proc/sys/vm/drop_caches;
> > > > #time ./crash -i ./commands.txt /root/t/vmlinux /root/t/vmcore > /dev/null 2>&1
> > > > ............................
> > > > real 1m54.259s
> > > > user 1m12.494s
> > > > sys 0m3.857s
> > > > ............................
> > > >
> > > > After this patch:
> > > >
> > > > #echo 3 > /proc/sys/vm/drop_caches;
> > > > #time ./crash -i ./commands.txt /root/t/vmlinux /root/t/vmcore > /dev/null 2>&1
> > > > ............................
> > > > real 0m55.217s
> > > > user 0m15.114s
> > > > sys 0m3.560s
> > > > ............................
> > >
> > > Thank you for the improvement!
> > >
> > > As far as I tested on x86_64, it did not give such a big gain, but judging
> > > from the user time, it should on arm64. Lianbo, can you reproduce this on arm64?
> > >
> > > With a 192GB x86_64 dumpfile, it improved slightly:
> > >
> > > $ time echo quit | ./crash vmlinux dump >/dev/null
> > >
> > > real 0m5.632s
> > Thanks for the testing.
> >
> > I am curious why it takes only 5.632s for a 192G dumpfile.
> > How much memory did the panicked kernel in the dumpfile have?
> >
> > My vmcore has 767.6G memory, and the max_sect_len is 4468736.
>
> I got it with makedumpfile -d 0 and tested it without dropping caches
> to measure the change in the loop cost. As for memory, which size
> do you mean? That machine has 192GB memory.
>
> $ ls -lhs dump
> 193G -rw-------. 1 root root 193G Mar 30 17:07 dump
> $ file dump
> dump: Kdump compressed dump v6, system Linux, ...
>
> $ ./crash vmlinux dump
>
> MEMORY: 191.7 GB
>
> crash> help -D
> ...
> block_size: 4096
> sub_hdr_size: 10
> bitmap_blocks: 3088
> max_mapnr: 50593791
> ...
> total_valid_pages: 50178690
> max_sect_len: 12352 // added
Ok, it seems your max_sect_len is too small.
>
> The max_sect_len looks too small compared with yours, but
> 12352 * 4096 = 50593792
My max_sect_len is 4468736, so
4468736 / 12352 = 361.78
Looping over (4468736 * 4096) iterations costs 56s on my machine.
Assuming our CPUs run at the same speed,
the loop on your machine will cost about (56 / 361.78 = 0.1548)s.
So you cannot get a big gain. :)
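
The estimate above can be written down as a small sketch. It assumes the loop cost scales linearly with max_sect_len and that both CPUs run at the same speed, which is only a rough approximation. The function names estimate_seconds and pages_covered are made up for this illustration, and 4096 is the pages-per-section factor used in the arithmetic above.

```c
#include <assert.h>

#define PAGES_PER_SECT 4096ULL  /* factor used in "12352 * 4096" above */

/* Number of inner-loop iterations for a given max_sect_len. */
unsigned long long pages_covered(unsigned long long sect_len)
{
        return sect_len * PAGES_PER_SECT;
}

/*
 * Scale a measured loop time linearly by the ratio of section counts.
 * ref_seconds was measured with ref_sect_len sections; the result is
 * the guessed time for sect_len sections on an equally fast CPU.
 */
double estimate_seconds(double ref_seconds,
                        unsigned long long ref_sect_len,
                        unsigned long long sect_len)
{
        assert(ref_sect_len > 0);
        return ref_seconds * (double)sect_len / (double)ref_sect_len;
}
```

With the numbers from this thread, estimate_seconds(56.0, 4468736, 12352) comes to roughly 0.155 seconds, and pages_covered(12352) gives 50593792, one more than the reported max_mapnr of 50593791.
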
Thanks
Huang Shijie