[Crash-utility] [PATCH] ppc64: do page traversal if vmemmap_list not populated

lijiang lijiang at redhat.com
Wed Sep 20 02:21:18 UTC 2023


On Tue, Sep 19, 2023 at 2:23 PM Aditya Gupta <adityag at linux.ibm.com> wrote:

> Hello lijiang,
>
> On Mon, Sep 18, 2023 at 07:34:04PM +0800, lijiang wrote:
> > Hi, Aditya
> > Thank you for the patch.
> >
> > On Mon, Sep 11, 2023 at 8:00 PM <crash-utility-request at redhat.com> wrote:
> >
> > > ...
> > >
> > > Currently 'crash-tool' fails on vmcore collected on upstream kernel on
> > > PowerPC64 with the error:
> > >
> > >     crash: invalid kernel virtual address: 0  type: "first list entry"
> > >
> > > Presently the address translation for vmemmap addresses is done using
> > > the vmemmap_list. But with the below commit in Linux, vmemmap_list can
> > > be empty in the case of Radix MMU on PowerPC64:
> > >
> > >     368a0590d954: (powerpc/book3s64/vmemmap: switch radix to use a
> > >     different vmemmap handling function)
> > >
> > > In case vmemmap_list is empty, its head is NULL, which crash tries
> > > to access, and it fails due to accessing NULL.
> > >
> > > Instead of depending on 'vmemmap_list' for address translation for
> > > vmemmap addresses, do a kernel pagetable walk to get the physical
> > > address associated with the given virtual address.
> > >
> > > Reviewed-by: Hari Bathini <hbathini at linux.ibm.com>
> > > Signed-off-by: Aditya Gupta <adityag at linux.ibm.com>
> > >
> > > ---
> > >
> > > Testing
> > > =======
> > >
> > > Git tree with patch applied:
> > > https://github.com/adi-g15-ibm/crash/tree/bugzilla-203296-list-v1
> > >
> > > This can be tested with '/proc/vmcore' as the vmcore, since makedumpfile
> > >
> >
> > Can you describe in detail how to reproduce this issue? Or does this
> > require any kernel configs to be enabled first?  I could not reproduce
> > the issue with '/proc/kcore' or a vmcore (via cp).
> >
> > Test kernel commit: ce9ecca0238b ("Linux 6.6-rc2")
> >
> > # ./crash /home/linux/vmlinux
>
> Thanks for testing it.
>
> This issue occurs only in case of Radix MMU.
>
> Overall, these are all the requirements:
> 1. Upstream linux (master branch) (your commit will also work, ce9ecca0238b)
> 2. 'CONFIG_PPC_BOOK3S_64' should be 'y' in kernel config (this should be
>    there in default configs)
>

 # grep "CONFIG_PPC_BOOK3S_64" /home/linux/.config
CONFIG_PPC_BOOK3S_64=y

> 3. Check in dmesg of the crashed kernel, if it prints 'hash-mmu' or
>    'radix-mmu'. It should be 'radix-mmu'.
>
>
# dmesg|grep mmu
[    0.000000] hash-mmu: Page sizes from device-tree:
[    0.000000] hash-mmu: base_shift=12: shift=12, sllp=0x0000, avpnm=0x00000000, tlbiel=1, penc=0
[    0.000000] hash-mmu: base_shift=12: shift=16, sllp=0x0000, avpnm=0x00000000, tlbiel=1, penc=7
[    0.000000] hash-mmu: base_shift=12: shift=24, sllp=0x0000, avpnm=0x00000000, tlbiel=1, penc=56
[    0.000000] hash-mmu: base_shift=16: shift=16, sllp=0x0110, avpnm=0x00000000, tlbiel=1, penc=1
[    0.000000] hash-mmu: base_shift=16: shift=24, sllp=0x0110, avpnm=0x00000000, tlbiel=1, penc=8
[    0.000000] hash-mmu: base_shift=24: shift=24, sllp=0x0100, avpnm=0x00000001, tlbiel=0, penc=0
[    0.000000] hash-mmu: base_shift=34: shift=34, sllp=0x0120, avpnm=0x000007ff, tlbiel=0, penc=3
[    0.000000] hash-mmu: Initializing hash mmu with SLB
[    0.000000] mmu_features      = 0xfc006e01
[    0.000000] hash-mmu: ppc64_pft_size    = 0x1b
[    0.000000] hash-mmu: htab_hash_mask    = 0xfffff
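
Step 3 amounts to looking for a 'hash-mmu' or 'radix-mmu' tag in the boot
log. A minimal sketch of that check in C, assuming the tags appear in the
log lines the same way as in the output above. The 'radix-mmu' tag is taken
from the quoted description, not from this log, and the enum and function
names are hypothetical, not part of crash:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical classification of a single dmesg line. */
enum mmu_type { MMU_UNKNOWN, MMU_HASH, MMU_RADIX };

/* Classify one dmesg line by the MMU tag it carries, if any. */
static enum mmu_type classify_mmu_line(const char *line)
{
    assert(line != NULL);

    if (strstr(line, "radix-mmu") != NULL)
        return MMU_RADIX;
    if (strstr(line, "hash-mmu") != NULL)
        return MMU_HASH;

    /* Lines such as "mmu_features = ..." carry neither tag. */
    return MMU_UNKNOWN;
}
```

Running every matching line through such a check and seeing which tag shows
up gives the same answer as reading the grep output by eye.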


> I guess the system that crashed might be using 'hash-mmu'.
>
> > also fails in absence of 'vmemmap_list' in upstream linux
>
> Yes, it will fail in the Hash MMU case, as we depend on 'vmemmap_list' in
> that case: the virtual to physical address mapping is not available in the
> page table in case of Hash MMU.
>
> Only in the Radix MMU case will it still work even if 'vmemmap_list' is
> removed, since we have the mappings in the kernel page table, which is used
> by this patch.
>
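
The behaviour described above can be sketched roughly as follows. This is
only an illustration under stated assumptions, not the code from the patch:
the struct, the walker callback and all function names are hypothetical, and
it assumes the page table walk is used only when the list has no entries.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one vmemmap_list entry. */
struct vmemmap_region {
    uint64_t virt;  /* start of the vmemmap virtual region */
    uint64_t phys;  /* physical address backing that region */
    uint64_t size;  /* region size in bytes */
};

/* Hypothetical page table walker: true and *paddr set on success. */
typedef bool (*pt_walk_fn)(uint64_t vaddr, uint64_t *paddr);

/* Toy walker for illustration only: pretends every address is mapped. */
static bool demo_walk(uint64_t vaddr, uint64_t *paddr)
{
    *paddr = vaddr & 0xffffffffULL;
    return true;
}

/*
 * Translate a vmemmap virtual address.
 * - With a populated list, search it (what the Hash MMU case relies on).
 * - With an empty list, never touch the NULL head; fall back to a page
 *   table walk, which can only succeed when the mappings are in the
 *   kernel page table (Radix MMU).
 */
static bool vmemmap_to_phys(const struct vmemmap_region *list, size_t n,
                            bool radix_mmu, pt_walk_fn walk,
                            uint64_t vaddr, uint64_t *paddr)
{
    assert(paddr != NULL);

    for (size_t i = 0; i < n; i++) {
        if (vaddr >= list[i].virt && vaddr - list[i].virt < list[i].size) {
            *paddr = list[i].phys + (vaddr - list[i].virt);
            return true;
        }
    }

    if (n == 0 && radix_mmu && walk != NULL)
        return walk(vaddr, paddr);

    return false;
}
```

With an empty list and Hash MMU the function reports failure, which matches
the point above that Hash MMU still depends on 'vmemmap_list'.
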
> Let me know if the issue still doesn't reproduce even after using a system
> with Radix MMU.
>
>
Yes, it still does not reproduce on my side. But it looks like we have the
same system with Radix MMU, which is strange.

Thanks.
Lianbo


> Thanks,
> - Aditya Gupta
>
>