Kernel oops+crash on repeated auditd restarts

Valentin Avram aval13 at gmail.com
Mon Mar 5 08:35:20 UTC 2012


I finally found some time and a spare server to retest the oops and list_add
corruptions I was getting with the 3.x kernels and auditd 2.1.3.

I have now tested with Gentoo's latest stable 3.2.1-gentoo-r2 and kernel.org's
3.2.9.

Both hit the oops/BUG in the same way, and after that they keep pouring out
list_add corruptions with audit_prune_tre (truncated?) and auditctl as comms.

Since this is not specific to Gentoo's kernel, I'll post the oops from 3.2.9
here and also attach some of the list_add corruptions.

3.2.9 BUG:

kernel: [  301.240011] BUG: unable to handle kernel NULL pointer dereference at   (null)
kernel: [  301.240305] IP: [<c1238dd0>] __list_del_entry+0x20/0xe0
kernel: [  301.240481] *pdpt = 0000000000000000 *pde = f000ddc8f000ddc8
kernel: [  301.240698] Oops: 0000 [#1] SMP
kernel: [  301.240910]
kernel: [  301.241030] Pid: 642, comm: fsnotify_mark Not tainted 3.2.9-drbd-version3 #1 Dell Inc. PowerEdge 2950/0CX396
kernel: [  301.241370] EIP: 0060:[<c1238dd0>] EFLAGS: 00010287 CPU: 6
kernel: [  301.241498] EIP is at __list_del_entry+0x20/0xe0
kernel: [  301.241623] EAX: f4fae544 EBX: f47cffa4 ECX: ffffffff EDX: 00000000
kernel: [  301.241751] ESI: f4fae544 EDI: f4fae508 EBP: f47cff7c ESP: f47cff64
kernel: [  301.241879]  DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068
kernel: [  301.242005] Process fsnotify_mark (pid: 642, ti=f47ce000 task=f4f47c00 task.ti=f47ce000)
kernel: [  301.242207] Stack:
kernel: [  301.242327]  c10813c0 f47cffa4 f4f47c00 f4e70888 f47cff7c f47cffa4 f47cffb8 c10f6976
kernel: [  301.242882]  ffffffc3 f4f47c00 f4f47c00 00000000 f4f47c00 c10530c0 f47cff9c f47cff9c
kernel: [  301.243438]  f4fae544 f4fae544 f4c47f58 00000000 c10f68f0 f47cffe4 c1052834 00000000
kernel: [  301.243995] Call Trace:
kernel: [  301.244119]  [<c10813c0>] ? rcu_check_callbacks+0x110/0x110
kernel: [  301.244248]  [<c10f6976>] fsnotify_mark_destroy+0x86/0x120
kernel: [  301.244377]  [<c10530c0>] ? abort_exclusive_wait+0x80/0x80
kernel: [  301.244504]  [<c10f68f0>] ? fsnotify_put_mark+0x30/0x30
kernel: [  301.244631]  [<c1052834>] kthread+0x74/0x80
kernel: [  301.244756]  [<c10527c0>] ? kthread_flush_work_fn+0x10/0x10
kernel: [  301.244885]  [<c1582ab6>] kernel_thread_helper+0x6/0xd
kernel: [  301.245011] Code: 55 f4 8b 45 f8 e9 75 ff ff ff 90 55 89 e5 53 83 ec 14 8b 08 8b 50 04 81 f9 00 01 10 00 74 24 81 fa 00 02 20 00 0f 84 8e 00 00 00 <8b> 1a 39 d8 75 62 8b 59 04 39 d8 75 35 89 51 04 89 0a 83 c4 14
kernel: [  301.248195] EIP: [<c1238dd0>] __list_del_entry+0x20/0xe0 SS:ESP 0068:f47cff64
kernel: [  301.248414] CR2: 0000000000000000
kernel: [  301.248538] ---[ end trace 15082dbfb353f84c ]---

The kernel was compiled with the following DEBUG options (the ones marked
with asterisks were requested by Gentoo's dev):
CONFIG_ARCH_SUPPORTS_DEBUG_PAGEALLOC=y
CONFIG_SLUB_DEBUG=y
CONFIG_HAVE_DMA_API_DEBUG=y
CONFIG_X86_DEBUGCTLMSR=y
CONFIG_PNP_DEBUG_MESSAGES=y
CONFIG_AIC94XX_DEBUG=y
CONFIG_USB_DEBUG=y
CONFIG_DEBUG_KERNEL=y
CONFIG_SCHED_DEBUG=y
CONFIG_DEBUG_RT_MUTEXES=y
CONFIG_DEBUG_PI_LIST=y
CONFIG_DEBUG_BUGVERBOSE=y
*CONFIG_DEBUG_INFO=y*
CONFIG_DEBUG_MEMORY_INIT=y
*CONFIG_DEBUG_LIST=y*
CONFIG_DEBUG_STACKOVERFLOW=y
CONFIG_DEBUG_RODATA=y
CONFIG_DEBUG_RODATA_TEST=y
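
A list like the one above can be pulled out of a kernel config with a small filter. The following is only a minimal sketch: the function names are made up for illustration, and it assumes the config is a plain or gzip-compressed text file in the usual `CONFIG_FOO=y` format. It does not claim to be how the list above was actually produced.

```python
import gzip
import re

# Matches built-in (=y) options whose name contains DEBUG.
ENABLED_DEBUG_RE = re.compile(r"^(CONFIG_\w*DEBUG\w*)=y$")


def enabled_debug_options(config_text: str):
    """Return the CONFIG_*DEBUG* symbols set to =y, in file order."""
    return [
        m.group(1)
        for line in config_text.splitlines()
        if (m := ENABLED_DEBUG_RE.match(line.strip()))
    ]


def read_config(path: str) -> str:
    """Read a plain or gzip-compressed kernel config file (hypothetical helper)."""
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rt") as fh:
        return fh.read()
```

Calling `enabled_debug_options(read_config(path))` on a config file would return the matching option names; lines such as `# CONFIG_FOO is not set` and modular (`=m`) options are skipped.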

I have attached the kernel config I used for 3.2.9 to generate this oops and
the warnings.

From the list_add warnings that follow: out of the 805 warnings I
processed, after masking with XXXXX the PID and next= values that changed
in every one, I got 26 distinct MD5 sums. I have also attached the relevant
files as an archive to this email.
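
The grouping described above can be sketched roughly as follows. This is not the script from the attached parse_oops.tgz (its contents are not shown here). The regular expressions and function names are illustrative assumptions, and the sketch assumes each warning has already been collected into its own string. Masking the dmesg timestamps is an extra assumption as well, since the mail only mentions the PID and next= values.

```python
import hashlib
import re
from collections import Counter

# Hypothetical patterns for the fields that differ between otherwise
# identical warnings; adjust them to the actual log format.
PID_RE = re.compile(r"\bPid: \d+")
NEXT_RE = re.compile(r"\bnext=[0-9a-fA-F]+")
# Assumption: timestamps are masked too (not stated in the mail).
STAMP_RE = re.compile(r"\[\s*\d+\.\d+\]")


def normalize(warning: str) -> str:
    """Replace the PID, next= value and timestamps with XXXXX."""
    warning = STAMP_RE.sub("[XXXXX]", warning)
    warning = PID_RE.sub("Pid: XXXXX", warning)
    return NEXT_RE.sub("next=XXXXX", warning)


def group_by_md5(warnings):
    """Count how many warnings share each MD5 of their masked text."""
    return Counter(
        hashlib.md5(normalize(w).encode("utf-8")).hexdigest()
        for w in warnings
    )
```

The number of keys in the returned counter is the number of distinct warning types, and each value is how many warnings fell into that type.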

The Gentoo bug I opened is sleeping; it seems nobody has the time to at
least test and confirm or refute the problems I'm seeing (or everybody
thinks that nobody would restart auditd so often, so the bug is not
that serious).

Thank you for your time.

On Wed, Feb 8, 2012 at 6:11 PM, Valentin Avram <aval13 at gmail.com> wrote:
-------------- next part --------------
A non-text attachment was scrubbed...
Name: parse_oops.tgz
Type: application/x-gzip
Size: 1783 bytes
Desc: not available
URL: <http://listman.redhat.com/archives/linux-audit/attachments/20120305/54bd6b73/attachment.bin>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: kernel_config.gz
Type: application/x-gzip
Size: 15572 bytes
Desc: not available
URL: <http://listman.redhat.com/archives/linux-audit/attachments/20120305/54bd6b73/attachment-0001.bin>

