[Linux-cachefs] [ linux-next ] 20211206 tree cifs panic

Murphy Zhou jencce.kernel at gmail.com
Mon Jan 10 05:23:13 UTC 2022


Hi all,

It's still reproducible on the latest next-20220107 tree with the
reproducer below.

Reverting this fscache update makes the panic go away.

  574146fe263a Merge branch 'fscache-next' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs.git
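
For anyone repeating the revert, a minimal sketch follows. It assumes the merge commit above is reachable from the checked-out tree and that its first parent is the mainline side; both are assumptions, so check the printed parents first. The helper name revert_merge and the example path are hypothetical.

```shell
# Sketch only: revert a merge commit in a kernel tree.
# Usage: revert_merge <tree-dir> <merge-commit>
revert_merge() {
    tree=$1
    merge=$2
    # Print the merge with its parents so the mainline side can be checked by eye.
    git -C "$tree" show --no-patch --format='%h parents: %p  %s' "$merge" || return 1
    # -m 1 treats the first parent as mainline and undoes what the merge pulled in.
    git -C "$tree" revert -m 1 --no-edit "$merge"
}

# Example call (the path is hypothetical):
# revert_merge ~/linux-next 574146fe263a
```

The kernel then has to be rebuilt and booted before rerunning the reproducer.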

Thanks,
Murphy

On Wed, Dec 22, 2021 at 10:16:44AM +0800, Murphy Zhou wrote:
> A bit late.
> 
> Try this:
> 
> rm -rf cthon04
> git clone git://git.linux-nfs.org/projects/steved/cthon04.git
> pushd cthon04
> make clean || exit
> make FSTYPE=cifs >cthon04-build.log 2>&1 || exit
> popd
> 
> rm -fr /tmp/connectathon
> mkdir -p /tmp/connectathon
> chmod 1777 /tmp/connectathon
> chcon -t samba_share_t /tmp/connectathon
> cat > /etc/samba/smb.conf <<EOF
> [global]
>     workgroup = EXAMPLE
>     unix extensions = no
>     ea support = Yes
>     min protocol = NT1
> 
> [testuser]
>     comment = testuser unix extensions off
>     path = /tmp/connectathon
>     read only = No
>     #acl check permissions = No
>     #acl map full control = No
> EOF
> echo -e 'redhat\nredhat' | smbpasswd -s -a root
> service smb restart
> sleep 5
> 
> pushd cthon04
> smbclient -L //$(hostname)/testuser -N
> echo "y\n" | ./server -o actimeo=0,user=root,password=redhat,domain=EXAMPLE,file_mode=0777,rw,noauto -C -a -f -p testuser -m /mnt/connectathon $(hostname) -b
> popd
> 
> 
> On Sat, Dec 11, 2021 at 11:33 AM Shyam Prasad N <nspmangalore at gmail.com> wrote:
> >
> >
> >
> > On Fri, Dec 10, 2021 at 11:37 AM Murphy Zhou <jencce.kernel at gmail.com> wrote:
> >>
> >> The patch can't be applied on the 1208 tree and does not fix the issue
> >> on the 1207 tree.
> >>
> > Hi Murphy,
> >
> > Which is the git repo and branch that you're using? Is it reproducible consistently?
> > And is it a ksmbd server? Or samba server? Can you share the conf file for that as well?
> > I'm unable to repro this issue.
> >
> > Regards,
> > Shyam
> >
> >> On Thu, Dec 9, 2021 at 7:05 PM Murphy Zhou <jencce.kernel at gmail.com> wrote:
> >> >
> >> > Test is running.
> >> >
> >> > And the kernel config is attached.
> >> >
> >> > Thanks for looking into this!
> >> >
> >> > On Thu, Dec 9, 2021 at 6:53 PM Shyam Prasad N <nspmangalore at gmail.com> wrote:
> >> > >
> >> > > On Thu, Dec 9, 2021 at 3:06 PM Shyam Prasad N <nspmangalore at gmail.com> wrote:
> >> > > >
> >> > > > On Thu, Dec 9, 2021 at 2:40 PM Shyam Prasad N <nspmangalore at gmail.com> wrote:
> >> > > > >
> >> > > > > Hi Murphy,
> >> > > > >
> >> > > > > Can you please share the kernel config file used for this test?
> >> > > > > Is cachefilesd configured on this test setup?
> >> > > > >
> >> > > > > Regards,
> >> > > > > Shyam
> >> > > > >
> >> > > > > On Wed, Dec 8, 2021 at 2:57 PM Murphy Zhou <jencce.kernel at gmail.com> wrote:
> >> > > > > >
> >> > > > > > Hi,
> >> > > > > >
> >> > > > > > A connectathon test triggers a panic like the one below. The server
> >> > > > > > is an SMB share on the same host as the test client.
> >> > > > > >
> >> > > > > >
> >> > > > > > [  594.061343] Key type cifs.spnego registered
> >> > > > > > [  594.082337] Key type cifs.idmap registered
> >> > > > > > [  594.104961] CIFS: No dialect specified on mount. Default has
> >> > > > > > changed to a more secure dialect, SMB2.1 or later (e.g. SMB3.1.1),
> >> > > > > > from CIFS (SMB1). To use the less secure SMB1 dialect to access old
> >> > > > > > servers which do not support SMB3.1.1 (or even SMB3 or SMB2.1) specify
> >> > > > > > vers=1.0 on mount.
> >> > > > > > [  594.223460] CIFS: Attempting to mount \\hp-dl380pg8\testuser
> >> > > > > > [  594.287771] BUG: kernel NULL pointer dereference, address: 0000000000000000
> >> > > > > > [  594.319712] #PF: supervisor write access in kernel mode
> >> > > > > > [  594.343627] #PF: error_code(0x0002) - not-present page
> >> > > > > > [  594.366791] PGD 0 P4D 0
> >> > > > > > [  594.378172] Oops: 0002 [#1] PREEMPT SMP PTI
> >> > > > > > [  594.397047] CPU: 0 PID: 52196 Comm: mount.cifs Kdump: loaded
> >> > > > > > Tainted: G          I       5.16.0-rc4-next-20211206 #1
> >> > > > > > [  594.445144] Hardware name: HP ProLiant DL380p Gen8, BIOS P70 08/02/2014
> >> > > > > > [  594.475201] RIP: 0010:cifs_fscache_get_inode_cookie+0x2f/0xb0 [cifs]
> >> > > > > > [  594.503934] Code: 53 48 89 fb 48 83 ec 20 65 48 8b 04 25 28 00 00
> >> > > > > > 00 48 89 44 24 18 48 8b 47 28 48 8b b8 88 03 00 00 e8 35 c6 fa ff 48
> >> > > > > > 8b 53 68 <48> 89 14 25 00 00 00 00 48 8b 53 70 89 14 25 10 00 00 00 48
> >> > > > > > 8b 53
> >> > > > > > [  594.590004] RSP: 0018:ffffb93c4998fc10 EFLAGS: 00010282
> >> > > > > > [  594.614861] RAX: ffff970743ab5000 RBX: ffff970411193168 RCX: 0000000000000000
> >> > > > > > [  594.650920] RDX: 0000000061b01059 RSI: 00000000000041ed RDI: ffff970453924780
> >> > > > > > [  594.686189] RBP: ffffb93c4998fce8 R08: ffff970411193168 R09: ffff970743ab1548
> >> > > > > > [  594.718776] R10: 000000009f8bdc24 R11: 000000009053e561 R12: 000000000e1c25d9
> >> > > > > > [  594.750925] R13: ffff970411193168 R14: ffff970743ab1000 R15: ffff970743ab5000
> >> > > > > > [  594.783532] FS:  00007f2037080780(0000) GS:ffff97072f600000(0000)
> >> > > > > > knlGS:0000000000000000
> >> > > > > > [  594.820129] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> >> > > > > > [  594.846183] CR2: 0000000000000000 CR3: 0000000141820006 CR4: 00000000001706f0
> >> > > > > > [  594.878376] Call Trace:
> >> > > > > > [  594.889469]  <TASK>
> >> > > > > > [  594.898870]  cifs_iget+0x14b/0x160 [cifs]
> >> > > > > > [  594.917781]  cifs_get_inode_info+0x430/0x750 [cifs]
> >> > > > > > [  594.941267]  ? __d_instantiate+0x34/0xf0
> >> > > > > > [  594.960012]  ? _raw_spin_unlock+0x16/0x30
> >> > > > > > [  594.978111]  ? d_instantiate+0x3e/0x60
> >> > > > > > [  594.994982]  cifs_root_iget+0x33b/0x4b0 [cifs]
> >> > > > > > [  595.015099]  cifs_read_super+0x125/0x200 [cifs]
> >> > > > > > [  595.035596]  cifs_smb3_do_mount+0x224/0x330 [cifs]
> >> > > > > > [  595.057009]  smb3_get_tree+0x2d/0x50 [cifs]
> >> > > > > > [  595.076065]  vfs_get_tree+0x25/0xb0
> >> > > > > > [  595.092562]  do_new_mount+0x176/0x310
> >> > > > > > [  595.110929]  __x64_sys_mount+0x103/0x140
> >> > > > > > [  595.130439]  do_syscall_64+0x3b/0x90
> >> > > > > > [  595.147929]  entry_SYSCALL_64_after_hwframe+0x44/0xae
> >> > > > > > [  595.172646] RIP: 0033:0x7f2037195c4e
> >> > > > > > [  595.188703] Code: 48 8b 0d dd 71 0e 00 f7 d8 64 89 01 48 83 c8 ff
> >> > > > > > c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 49 89 ca b8 a5 00 00
> >> > > > > > 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d aa 71 0e 00 f7 d8 64 89
> >> > > > > > 01 48
> >> > > > > > [  595.273644] RSP: 002b:00007fff27645a38 EFLAGS: 00000202 ORIG_RAX:
> >> > > > > > 00000000000000a5
> >> > > > > > [  595.307790] RAX: ffffffffffffffda RBX: 000055690a1bb910 RCX: 00007f2037195c4e
> >> > > > > > [  595.340187] RDX: 0000556908d5946b RSI: 0000556908d594b6 RDI: 00007fff27647fbe
> >> > > > > > [  595.372419] RBP: 000055690a1bb8f0 R08: 000055690a1bb910 R09: 0000000000000077
> >> > > > > > [  595.404633] R10: 0000000000000000 R11: 0000000000000202 R12: 00007fff27647fb3
> >> > > > > > [  595.436882] R13: 00007f203729d000 R14: 00007f203729f70e R15: 00007fff27647fbe
> >> > > > > > [  595.468980]  </TASK>
> >> > > > > > [  595.478769] Modules linked in: cifs cifs_arc4 cifs_md4 loop nfsv3
> >> > > > > > rpcsec_gss_krb5 nfsv4 dns_resolver nfs fscache netfs rpcrdma rdma_cm
> >> > > > > > iw_cm ib_cm ib_core nfsd auth_rpcgss nfs_acl lockd grace rfkill sunrpc
> >> > > > > > intel_rapl_msr intel_rapl_common sb_edac x86_pkg_temp_thermal
> >> > > > > > intel_powerclamp mgag200 coretemp i2c_algo_bit kvm_intel
> >> > > > > > drm_shmem_helper drm_kms_helper ipmi_ssif iTCO_wdt kvm
> >> > > > > > iTCO_vendor_support acpi_ipmi syscopyarea irqbypass sysfillrect
> >> > > > > > ipmi_si rapl intel_cstate ioatdma ipmi_devintf sysimgblt intel_uncore
> >> > > > > > fb_sys_fops cec lpc_ich ipmi_msghandler acpi_power_meter pcspkr dca
> >> > > > > > hpilo drm fuse xfs libcrc32c sr_mod cdrom sd_mod ata_generic t10_pi sg
> >> > > > > > ata_piix crct10dif_pclmul crc32_pclmul crc32c_intel libata serio_raw
> >> > > > > > tg3 ghash_clmulni_intel hpsa hpwdt scsi_transport_sas dm_mirror
> >> > > > > > dm_region_hash dm_log dm_mod
> >> > > > > > [  595.821049] CR2: 0000000000000000
> >> > > > >
> >> > > > >
> >> > > > >
> >> > > > > --
> >> > > > > Regards,
> >> > > > > Shyam
> >> > > >
> >> > > > This does not repro against a Windows server.
> >> > > > My suspicion is that the recent move of
> >> > > > cifs_fscache_get_super_cookie to cifs_root_iget caused this. We may be
> >> > > > trying to initialize the inode cookie when the super cookie is yet to
> >> > > > be initialized.
> >> > > >
> >> > > > The bigger point here is that there seems to be a circular dependency:
> >> > > > We need tcon->resource_id to set up the super cookie. This is populated
> >> > > > using the inode number of the root directory. Getting this inode number
> >> > > > requires opening the root dir. The open causes the inode cookie to be
> >> > > > initialized, which trips when it sees that the super cookie is still NULL.
> >> > > >
> >> > > > Steve: Do you agree with this assessment? How do we fix this? Can we
> >> > > > use some other value for resource_id, and not have to rely on the root
> >> > > > inode number? How about tcon->tid? Or a combination of tcon->tid and
> >> > > > ses->Suid?
> >> > > >
> >> > > > --
> >> > > > Regards,
> >> > > > Shyam
> >> > >
> >> > > Hi Murphy,
> >> > >
> >> > > Would you be able to test this patch as a possible fix for this issue?
> >> > >
> >> > > --
> >> > > Regards,
> >> > > Shyam
> >
> >
> >
> > --
> > Regards,
> > Shyam

-- 
Murphy



