[Freeipa-users] DirSrv hanging

Adam Bishop Adam.Bishop at jisc.ac.uk
Sat Jan 7 05:19:42 UTC 2017


I have a standalone FreeIPA instance that is becoming unresponsive every few hours. While in this state it will accept connections, but will not do anything with them (i.e. if you connect an ldaps client to 636, you see SYN->SYNACK->ACK->ClientHello, but a ServerHello is not returned). This system is running FreeIPA 4.4.0 currently, but this also occurred on 4.2.x. Time is synchronised correctly and this is a fairly new installation so all the PKI expiry dates are well into the future.
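The symptom above (the TCP handshake completes and a ClientHello is sent, but no ServerHello comes back) can be checked from another host with a short script. What follows is a minimal Python sketch, not part of FreeIPA: the function name `probe_ldaps` and the result strings are made up for illustration, and certificate verification is deliberately disabled because only liveness is being tested.

```python
import socket
import ssl


def probe_ldaps(host, port=636, timeout=5.0):
    """Report how far a TLS connection to host:port gets (hypothetical helper)."""
    try:
        # Stage 1: plain TCP connect (SYN -> SYN/ACK -> ACK).
        raw = socket.create_connection((host, port), timeout=timeout)
    except ConnectionRefusedError:
        return "connection_refused"
    except OSError:
        # Includes connect timeouts and unreachable hosts.
        return "tcp_connect_failed"

    # Assumption: we only care whether the server answers the ClientHello,
    # so the certificate is not validated here.
    ctx = ssl.create_default_context()
    ctx.check_hostname = False
    ctx.verify_mode = ssl.CERT_NONE

    try:
        with raw:
            # Stage 2: the TLS handshake. The socket keeps the timeout set above,
            # so a server that never sends a ServerHello ends in socket.timeout.
            with ctx.wrap_socket(raw, server_hostname=host):
                return "handshake_ok"
    except socket.timeout:
        return "tls_handshake_stalled"
    except (ssl.SSLError, OSError):
        return "tls_handshake_failed"


if __name__ == "__main__":
    # "ldap-001.domain" is the placeholder hostname used in this message.
    print(probe_ldaps("ldap-001.domain"))
```

Run periodically, a result of `tls_handshake_stalled` would correspond to the hung state described here, while `connection_refused` would instead suggest the process is no longer listening at all.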

It handles queries without complaint, right up until the point it doesn't.

Inspecting the process with strace shows it waiting on a socket:

    getpeername(7, 0x7ffeb749af70, [112])   = -1 ENOTCONN (Transport endpoint is not connected)
    poll([{fd=50, events=POLLIN}, {fd=6, events=POLLIN}, {fd=7, events=POLLIN}, {fd=8, events=POLLIN}, 
    {fd=66, events=POLLIN}, {fd=80, events=POLLIN}, {fd=79, events=POLLIN}, {fd=78, events=POLLIN}, 
    {fd=77, events=POLLIN}, {fd=76, events=POLLIN}, {fd=75, events=POLLIN}, {fd=73, events=POLLIN}, 
    {fd=71, events=POLLIN}, {fd=70, events=POLLIN}, {fd=68, events=POLLIN}], 15, 250) = 0 (Timeout)

The getpeername call is always made on fd 7:

    ls -l /proc/2428/fd
    lrwx------. 1 root root 64 Jan  6 17:16 7 -> socket:[18972]

I'm not sure if I'm understanding the meaning of the fd entry correctly, but I believe this is the entry:

    [root@ldap-001 log]# lsof -p 2428 | grep 18972
    ns-slapd 2428 dirsrv    7u  IPv6              18972      0t0      TCP *:ldaps (LISTEN)
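The same fd-to-socket lookup can be done without lsof by reading /proc directly. This is a minimal Python sketch assuming a Linux host; the function names are hypothetical, and it relies on the usual /proc/net/tcp layout, where the local address is the second column, the state is the fourth (0A meaning LISTEN) and the inode is the tenth.

```python
import os

TCP_LISTEN = "0A"  # state code for LISTEN in /proc/net/tcp and /proc/net/tcp6


def socket_inode(pid, fd):
    """Return the socket inode behind /proc/<pid>/fd/<fd>, or None if not a socket."""
    target = os.readlink(f"/proc/{pid}/fd/{fd}")
    if target.startswith("socket:[") and target.endswith("]"):
        return int(target[len("socket:["):-1])
    return None


def parse_tcp_row(row):
    """Extract local port, LISTEN flag and inode from one /proc/net/tcp* data row."""
    fields = row.split()
    # The local address looks like HEXADDR:HEXPORT; only the port is decoded here.
    local_port = int(fields[1].rsplit(":", 1)[1], 16)
    return {
        "port": local_port,
        "listening": fields[3] == TCP_LISTEN,
        "inode": int(fields[9]),
    }


def find_socket(inode, tables=("/proc/net/tcp", "/proc/net/tcp6")):
    """Search the kernel TCP tables for the row that owns the given inode."""
    for path in tables:
        try:
            with open(path) as handle:
                next(handle, None)  # skip the column header line
                for row in handle:
                    info = parse_tcp_row(row)
                    if info["inode"] == inode:
                        return info
        except FileNotFoundError:
            continue  # e.g. tcp6 is absent when IPv6 is disabled
    return None
```

With the values from this message, `find_socket(socket_inode(2428, 7))` would be expected to report a listening socket on port 636, matching the lsof line above. It needs to run as root (or as the dirsrv user) to read another process's fd directory.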

A backtrace from GDB follows at the end of this message. It shows the address struct, which just contains the source address of the last connection to port 636 before DirSrv hangs.

The server is configured to use the FreeIPA dns service as its own resolver. The DNS service is definitely still running, and resolves the query fine when executed with dig.

There is nothing in the DirSrv logs that indicates an issue. The KDC logs indicate a problem, but I don't know if DirSrv is hanging because of the KDC, or if the KDC is just reflecting that DirSrv is unresponsive.

    Jan 06 21:53:29 ldap-001.domain krb5kdc[2702](info): AS_REQ (6 etypes {18 17 16 23 25 26}) 193.63.63.108: LOOKING_UP_CLIENT: host/ldap-001.domain@DOMAIN for krbtgt/DOMAIN@DOMAIN, Server error
    Jan 06 21:53:29 ldap-001.domain krb5kdc[2702](info): closing down fd 12

sssd reports an issue too, but that is almost certainly due to an unresponsive DirSrv:

    (Sat Jan  7 03:16:08 2017) [sssd[nss]] [sss_dp_get_reply] (0x0010): The Data Provider returned an error [org.freedesktop.sssd.Error.DataProvider.Offline]

I'm not really sure what to check next - all the individual components seem to be working, but not together.

Any suggestions are appreciated.

Regards,

Adam Bishop

  gpg: E75B 1F92 6407 DFDF 9F1C  BF10 C993 2504 6609 D460

jisc.ac.uk

---

[root@ldap-001 log]# gdb -p 2428
GNU gdb (GDB) Red Hat Enterprise Linux 7.6.1-94.el7
Copyright (C) 2013 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-redhat-linux-gnu".
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Attaching to process 2428
0x00007fc80bf4fdfd in poll () at ../sysdeps/unix/syscall-template.S:81
81	T_PSEUDO (SYSCALL_SYMBOL, SYSCALL_NAME, SYSCALL_NARGS)
Missing separate debuginfos, use: debuginfo-install ipa-server-4.4.0-14.el7.centos.1.1.x86_64
(gdb) break getpeername
Breakpoint 1 at 0x7fc80bf5b4b0: file ../sysdeps/unix/syscall-template.S, line 81.
(gdb) cont
Continuing.

Breakpoint 1, getpeername () at ../sysdeps/unix/syscall-template.S:81
81	T_PSEUDO (SYSCALL_SYMBOL, SYSCALL_NAME, SYSCALL_NARGS)
(gdb) bt full
#0  getpeername () at ../sysdeps/unix/syscall-template.S:81
No locals.
#1  0x00007fc80c888389 in pt_GetPeerName (fd=0x7fc810d92010, addr=0x7ffeb749af70) at ../../../nspr/pr/src/pthreads/ptio.c:2795
        rv = -1
        addr_len = 112
#2  0x00007fc80d3fec23 in ssl_Poll (fd=0x7fc810b69260, how_flags=<optimized out>, p_out_flags=0x7ffeb749b06c) at sslsock.c:2639
        ss = 0x7fc810d94f30
        new_flags = 1
        addr = {raw = {family = 0, data = '\000' <repeats 13 times>}, inet = {family = 0, port = 0, ip = 0, pad = "\000\000\000\000\000\000\000"}, ipv6 = {family = 0, port = 0, flowinfo = 0,
            ip = {_S6_un = {_S6_u8 = '\000' <repeats 15 times>, _S6_u16 = {0, 0, 0, 0, 0, 0, 0, 0}, _S6_u32 = {0, 0, 0, 0}, _S6_u64 = {0, 0}}}, scope_id = 0}, local = {family = 0,
            path = '\000' <repeats 30 times>, "\061\071\063.63.63.108\000\000\000`\327!\f\310\177\000\000\017\000\000\000\000\000\000\000p\260I\267\376\177\000\000\000\000\000\000\000\000\000\000\372", '\000' <repeats 15 times>, "\372\000\000\000\000\000\000\000\215", <incomplete sequence \343>}}
#3  0x00007fc80c887a45 in _pr_poll_with_poll (pds=0x7fc811256b40, npds=15, timeout=timeout@entry=250) at ../../../nspr/pr/src/pthreads/ptio.c:3812
        in_flags_read = 0
        in_flags_write = 0
        out_flags_read = 0
        out_flags_write = 0
        stack_syspoll = {{fd = 50, events = 1, revents = 0}, {fd = 6, events = 1, revents = 0}, {fd = 7, events = 1, revents = 0}, {fd = 8, events = 1, revents = 0}, {fd = 66, events = 1,
            revents = 0}, {fd = 80, events = 1, revents = 0}, {fd = 79, events = 1, revents = 0}, {fd = 78, events = 1, revents = 0}, {fd = 77, events = 1, revents = 0}, {fd = 76, events = 1,
            revents = 0}, {fd = 75, events = 1, revents = 0}, {fd = 73, events = 1, revents = 0}, {fd = 71, events = 1, revents = 0}, {fd = 70, events = 1, revents = 0}, {fd = 68, events = 1,
            revents = 0}, {fd = 0, events = 0, revents = 0}, {fd = 1219907217, events = -32767, revents = -1}, {fd = 2, events = 32766, revents = 0}, {fd = 0, events = 0, revents = 0}, {fd = 0,
            events = 0, revents = 0}, {fd = 48, events = 91, revents = 0}, {fd = -1219907216, events = 32766, revents = 0}, {fd = 0, events = 0, revents = 0}, {fd = 0, events = 0, revents = 0}, {
            fd = 110, events = 119, revents = 0}, {fd = 0, events = 0, revents = 0}, {fd = -1219907217, events = 32766, revents = 0}, {fd = 0, events = 0, revents = 0}, {fd = -1219907201,
            events = 32766, revents = 0}, {fd = 203544416, events = 32712, revents = 0}, {fd = 124, events = 0, revents = 0}, {fd = 2560, events = 0, revents = 0}, {fd = 1219907089,
            events = -32767, revents = -1}, {fd = 3, events = 32712, revents = 0}, {fd = 0, events = 0, revents = 0}, {fd = 0, events = 0, revents = 0}, {fd = 48, events = 91, revents = 0}, {
            fd = -1219907088, events = 32766, revents = 0}, {fd = 0, events = 0, revents = 0}, {fd = 0, events = 0, revents = 0}, {fd = 110, events = 119, revents = 0}, {fd = 0, events = 0,
            revents = 0}, {fd = -1219907089, events = 32766, revents = 0}, {fd = 210264088, events = 32712, revents = 0}, {fd = 1, events = 0, revents = 0}, {fd = 287047696, events = 32712,
            revents = 0}, {fd = -1, events = 0, revents = 0}, {fd = 0, events = 0, revents = 0}, {fd = 0, events = 0, revents = 0}, {fd = 0, events = 0, revents = 0}, {fd = 0, events = 0,
            revents = 0}, {fd = 0, events = 0, revents = 0}, {fd = 0, events = 0, revents = 0}, {fd = 0, events = 0, revents = 0}, {fd = 287320512, events = 32712, revents = 0}, {fd = 210265391,
            events = 32712, revents = 0}, {fd = 0, events = 0, revents = 0}, {fd = 281542400, events = 32712, revents = 0}, {fd = 287320512, events = 32712, revents = 0}, {fd = -133551240,
            events = 32711, revents = 0}, {fd = 0, events = 0, revents = 0}, {fd = 246979857, events = 32712, revents = 0}, {fd = 5, events = 15, revents = 0}, {fd = -1219906728, events = 32766,
            revents = 0}}
        syspoll = 0x7ffeb749b070
        index = 2
        msecs = <optimized out>
        ready = 0
        start = <optimized out>
        elapsed = <optimized out>
        remaining = <optimized out>
#4  0x00007fc80c88a655 in PR_Poll (pds=<optimized out>, npds=<optimized out>, timeout=timeout@entry=250) at ../../../nspr/pr/src/pthreads/ptio.c:4324
No locals.
#5  0x00007fc80eb8d789 in slapd_daemon (ports=ports@entry=0x7ffeb749b630) at ldap/servers/slapd/daemon.c:1242
        select_return = 0
        prerr = <optimized out>
        n_tcps = 0x7fc810b6db30
        s_tcps = 0x7fc810b6da30
        i_unix = 0x7fc810b6da10
        fdesp = 0x0
        num_poll = 15
        pr_timeout = 250
        time_thread_p = 0x7fc8111ff350
        threads = <optimized out>
        in_referral_mode = 0
        tp = 0x0
        tp_config = {init_flag = 1219906497, initial_threads = -32767, max_threads = 9, stacksize = 0, event_queue_size = 2, work_queue_size = 0, log_fct = 0x0,
          log_start_fct = 0xffff800148b64ba1, log_close_fct = 0x7ffe0000000a, malloc_fct = 0x2, calloc_fct = 0x0, realloc_fct = 0x5b00000032, free_fct = 0x7ffeb749b460}
#6  0x00007fc80eb7f253 in main (argc=5, argv=0x7ffeb749bc68) at ldap/servers/slapd/main.c:1143
        return_value = 0
        slapdFrontendConfig = <optimized out>
        ports_info = {n_port = 389, s_port = 636, n_listenaddr = 0x7fc810b6dc40, s_listenaddr = 0x7fc810b6dba0, n_socket = 0x7fc810b6db30, i_listenaddr = 0x7fc810b6db50, i_port = 1,
          i_socket = 0x7fc810b6da10, s_socket = 0x7fc810b6da30}
        m = <optimized out>
        notify = <optimized out>
