[Freeipa-users] General status of my FreeIPA servers - is there a method for cleaning them?

Rich Megginson rmeggins at redhat.com
Fri Apr 13 19:24:36 UTC 2012


On 04/13/2012 01:03 PM, Dan Scott wrote:
> Thanks for the quick response.
>
> Simo: Thanks - I'd prefer to clean it up properly rather than start
> from scratch. I haven't changed the LDAP schema at all. All I've done
> is use the IPA tools for user admin and add/remove replicas.
>
> I just felt like I've been emailing this list once a week or so for
> the past few months - I was beginning to think that it was beyond
> repair! :)
>
> On Fri, Apr 13, 2012 at 14:38, Rich Megginson <rmeggins at redhat.com> wrote:
>> On 04/13/2012 12:22 PM, Dan Scott wrote:
>>> On Fri, Apr 13, 2012 at 13:43, Rich Megginson <rmeggins at redhat.com> wrote:
>>>> On 04/13/2012 11:39 AM, Dan Scott wrote:
>>>>> I'm convinced that my LDAP directories contain lots of cruft which has
>>>>> built up and is causing problems on my system. There may even be some
>>>>> corruption since there's an entry which I'm unable to remove - this
>>>>> entry does not get replicated to the other servers.
>>>>
>>>> What version of 389-ds-base is this?  Do you get any errors?  It just
>>>> silently fails to delete this particular entry?
>>> [root at fileserver1 ~]# rpm -qa|grep 389
>>> 389-ds-base-libs-1.2.10.4-2.fc16.x86_64
>>> 389-ds-base-1.2.10.4-2.fc16.x86_64
>>> [root at fileserver1 ~]# ldapmodify -f rmfileserver5.ldif -D 'cn=directory
>>> manager' -W
>>> Enter LDAP Password:
>>> deleting entry
>>> "cn=fileserver5.ecg.mit.edu,cn=masters,cn=ipa,cn=etc,dc=ecg,dc=mit,dc=edu"
>>> ldap_delete: Operation not allowed on non-leaf (66)
>>>
>>> [root at fileserver1 ~]# ldapsearch -D 'cn=directory manager' -W -v -b
>>> 'cn=fileserver5.ecg.mit.edu,cn=masters,cn=ipa,cn=etc,dc=ecg,dc=mit,dc=edu'
>>> '(objectclass=*)'
>>> ldap_initialize(<DEFAULT>    )
>>> Enter LDAP Password:
>>> filter: (objectclass=*)
>>> requesting: All userApplication attributes
>>> # extended LDIF
>>> #
>>> # LDAPv3
>>> #
>>> base<cn=fileserver5.ecg.mit.edu,cn=masters,cn=ipa,cn=etc,dc=ecg,dc=mit,dc=edu>
>>> with scope subtree
>>> # filter: (objectclass=*)
>>> # requesting: ALL
>>> #
>>>
>>> # fileserver5.ecg.mit.edu, masters, ipa, etc, ecg.mit.edu
>>> dn:
>>> cn=fileserver5.ecg.mit.edu,cn=masters,cn=ipa,cn=etc,dc=ecg,dc=mit,dc=edu
>>> cn: fileserver5.ecg.mit.edu
>>> objectClass: top
>>> objectClass: nsContainer
>>>
>>> # search result
>>> search: 2
>>> result: 0 Success
>>>
>>> # numResponses: 2
>>> # numEntries: 1
>>> [root at fileserver1 ~]#
>>>
>>> If I'm interpreting this correctly, it can't be deleted because it's
>>> not a leaf node, but it doesn't have any sub-entries that I can delete
>>> first.
>>
>> You are correct.  Try this:
>>
>> ldapsearch -D 'cn=directory manager' -W -v -b
>> 'cn=fileserver5.ecg.mit.edu,cn=masters,cn=ipa,cn=etc,dc=ecg,dc=mit,dc=edu'
>> '(|(objectclass=nstombstone)(objectclass=*))'
> Ahh, so there are some 'child' entries:
>
> [root at fileserver1 ~]# ldapsearch -D 'cn=directory manager' -W -b
> 'cn=fileserver5.ecg.mit.edu,cn=masters,cn=ipa,cn=etc,dc=ecg,dc=mit,dc=edu'
> '(|(objectclass=nstombstone)(objectclass=*))'
> Enter LDAP Password:
> # extended LDIF
> #
> # LDAPv3
> # base<cn=fileserver5.ecg.mit.edu,cn=masters,cn=ipa,cn=etc,dc=ecg,dc=mit,dc=edu>
> with scope subtree
> # filter: (|(objectclass=nstombstone)(objectclass=*))
> # requesting: ALL
> #
>
> # aaa2c704-63cf11e1-ac8dadbd-35182efb, fileserver5.ecg.mit.edu, masters, ipa,
>    etc, ecg.mit.edu
> dn: nsuniqueid=aaa2c704-63cf11e1-ac8dadbd-35182efb,cn=fileserver5.ecg.mit.edu,
>   cn=masters,cn=ipa,cn=etc,dc=ecg,dc=mit,dc=edu
> objectClass: top
> objectClass: nsContainer
> objectClass: nsTombstone
> cn: fileserver5.ecg.mit.edu
> nsParentUniqueId: 4fff591e-e48611e0-bf3681aa-d1a3957d
>
> # 17708e04-63dd11e1-9b079095-05c635b0, fileserver5.ecg.mit.edu, masters, ipa,
>    etc, ecg.mit.edu
> dn: nsuniqueid=17708e04-63dd11e1-9b079095-05c635b0,cn=fileserver5.ecg.mit.edu,
>   cn=masters,cn=ipa,cn=etc,dc=ecg,dc=mit,dc=edu
> objectClass: top
> objectClass: nsContainer
> objectClass: nsTombstone
> cn: fileserver5.ecg.mit.edu
> nsParentUniqueId: 4fff591e-e48611e0-bf3681aa-d1a3957d
>
> # 5ceb8604-63f211e1-bc108552-1fbf39e2, fileserver5.ecg.mit.edu, masters, ipa,
>    etc, ecg.mit.edu
> dn: nsuniqueid=5ceb8604-63f211e1-bc108552-1fbf39e2,cn=fileserver5.ecg.mit.edu,
>   cn=masters,cn=ipa,cn=etc,dc=ecg,dc=mit,dc=edu
> objectClass: top
> objectClass: nsContainer
> objectClass: nsTombstone
> cn: fileserver5.ecg.mit.edu
> nsParentUniqueId: 4fff591e-e48611e0-bf3681aa-d1a3957d
>
> # fileserver5.ecg.mit.edu, masters, ipa, etc, ecg.mit.edu
> dn: cn=fileserver5.ecg.mit.edu,cn=masters,cn=ipa,cn=etc,dc=ecg,dc=mit,dc=edu
> cn: fileserver5.ecg.mit.edu
> objectClass: top
> objectClass: nsContainer
>
> # c480f184-83f011e1-90d1df13-bba55eff, HTTP, fileserver5.ecg.mit.edu, masters
>   , ipa, etc, ecg.mit.edu
> dn: nsuniqueid=c480f184-83f011e1-90d1df13-bba55eff,cn=HTTP,cn=fileserver5.ecg.
>   mit.edu,cn=masters,cn=ipa,cn=etc,dc=ecg,dc=mit,dc=edu
> objectClass: nsContainer
> objectClass: ipaConfigObject
> objectClass: top
> objectClass: nsTombstone
> ipaConfigString: enabledService
> ipaConfigString: startOrder 40
> cn: HTTP
> nsParentUniqueId: 1eba8a03-642311e1-9b95afe9-fc1b53ef
>
> # search result
> search: 2
> result: 0 Success
>
> # numResponses: 6
> # numEntries: 5
>
> Is it safe to delete them?
Yes.
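
To avoid copying the folded DNs by hand, something like the following could
pull them out of the search output. This is only a sketch: the function name
tombstone_dns is made up for illustration, it assumes plain (not base64
"dn::") DNs, and it assumes tombstones can be recognized by a DN starting
with "nsuniqueid=", as in the output above.

```shell
# Print the DN of every tombstone entry found in LDIF read from stdin.
# LDIF folds long lines; a continuation line starts with a single space.
tombstone_dns() {
  awk '
    function emit() { if (dn ~ /^nsuniqueid=/) print dn }
    /^ /    { if (indn) dn = dn substr($0, 2); next }   # unfold a wrapped DN
            { if (indn) emit(); indn = 0 }              # DN is complete
    /^dn: / { dn = substr($0, 5); indn = 1 }            # start of a new DN
    END     { if (indn) emit() }
  '
}
```

Pipe the ldapsearch above into tombstone_dns and review the list before
deleting anything. The reviewed list could then be fed to ldapdelete, which
reads DNs from standard input when none are given on the command line.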
>
>>>>> I also see
>>>>> inconsistent replication states on the servers. i.e. server1 shows
>>>>> that it's replicating with server2 but server2 does not show that it's
>>>>> replicating with server1.
>>>>
>>>> Do you have errors in the server2 log showing that it is attempting to
>>>> replicate with server1 but failing with some error?
>>> [root at fileserver1 ~]# ipa-csreplica-manage list -v fileserver1.ecg.mit.edu
>>> Directory Manager password:
>>>
>>> fileserver2.ecg.mit.edu
>>>    last init status: None
>>>    last init ended: None
>>>    last update status: 0 Replica acquired successfully: Incremental
>>> update succeeded
>>>    last update ended: 2012-04-13 17:57:39+00:00
>>> [root at fileserver1 ~]# ipa-csreplica-manage list -v fileserver2.ecg.mit.edu
>>> Directory Manager password:
>>>
>>> fileserver1.ecg.mit.edu
>>>    last init status: None
>>>    last init ended: None
>>>    last update status: 0 Replica acquired successfully: Incremental
>>> update succeeded
>>>    last update ended: 2012-04-13 17:57:41+00:00
>>> fileserver3.ecg.mit.edu
>>>    last init status: None
>>>    last init ended: None
>>>    last update status: 0 Replica acquired successfully: Incremental
>>> update succeeded
>>>    last update ended: 2012-04-13 17:57:41+00:00
>>> [root at fileserver1 ~]# ipa-csreplica-manage list -v fileserver3.ecg.mit.edu
>>> Directory Manager password:
>>>
>>> fileserver2.ecg.mit.edu
>>>    last init status: None
>>>    last init ended: None
>>>    last update status: 0 Replica acquired successfully: Incremental
>>> update succeeded
>>>    last update ended: 2012-04-13 17:57:44+00:00
>>> fileserver1.ecg.mit.edu
>>>    last init status: None
>>>    last init ended: None
>>>    last update status: 0 Replica acquired successfully: Incremental
>>> update succeeded
>>>    last update ended: 2012-04-13 17:57:43+00:00
>>> [root at fileserver1 ~]#
>>>
>>> fileserver1's (and fileserver2's) /var/log/dirsrv/slapd-PKI-IPA/errors
>>> contains lots of:
>>> [13/Apr/2012:13:57:43 -0400] NSMMReplicationPlugin -
>>> repl_set_mtn_referrals: could not set referrals for replica o=ipaca:
>>> 20
>>
>> This error usually means a replica was deleted and the RUV needs to be
>> cleaned.
>> see http://port389.org/wiki/Howto:CLEANRUV
>> and
>> https://fedorahosted.org/freeipa/ticket/2303
>> and
>> https://fedorahosted.org/389/ticket/337
> OK, I've seen this before - is it important to remove them? I've had
> to add and remove replicas so much that I don't really want to do it
> unless it's necessary. I'm happy to live with them if it's not a
> problem.

It's not a problem until it's a problem :-)  I would go ahead and run 
CLEANRUV.
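
As a rough sketch of what the CLEANRUV howto describes: the task is started
by writing nsds5task: CLEANRUV<rid> to the replica configuration entry. The
helper below only prints that LDIF. The name cleanruv_ldif is made up for
illustration, and both the replica config DN and the replica ID are values
you have to look up on your own server; nothing here guesses them.

```shell
# Print the LDIF modify that asks the server to run CLEANRUV.
#   $1 = DN of the replica config entry (cn=replica,...,cn=mapping tree,cn=config)
#   $2 = replica ID of the deleted replica
cleanruv_ldif() {
  replica_dn=$1
  rid=$2
  printf 'dn: %s\nchangetype: modify\nreplace: nsds5task\nnsds5task: CLEANRUV%s\n' \
    "$replica_dn" "$rid"
}
```

Check the printed LDIF, then pipe it to ldapmodify as directory manager
against the instance that logs the error. The replica IDs still present in
the RUV can be read from the nsds50ruv attribute of the RUV tombstone entry.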

>
>>> fileserver3's /var/log/dirsrv/slapd-PKI-IPA/errors contains lots of:
>>> [13/Apr/2012:13:52:50 -0400] slapi_ldap_bind - Error: could not send
>>> startTLS request: error -1 (Can't contact LDAP server) errno 107
>>> (Transport endpoint is not connected)
>>
>> This is a real connection error - could be cert or hostname lookup related.
> How do I find out if it's cert or hostname lookup? Which hostname?
> Fileserver3 runs DNS, and it seems to be working fine.

Try ldapsearch - on server3

LDAPTLS_CACERTDIR=/etc/dirsrv/slapd-PKI-IPA ldapsearch -x -ZZ -H 
ldap://server2.fqdn -D "cn=directory manager" -W -s base -b ""

If that works, check to make sure the replication agreement has the 
correct server2.fqdn

If that doesn't work, use ldapsearch -d 1 -x ..... to get further 
debugging information.

>
>>> [13/Apr/2012:13:57:39 -0400] NSMMReplicationPlugin -
>>> repl_set_mtn_referrals: could not set referrals for replica o=ipaca:
>>> 20
>>>
>>> fileserver2's non-PKI replication agreements to both fileserver1 and 3
>>> are in place, but both say: Incremental update has failed and requires
>>> administrator actionSystem error.
>>
>>
>>> When I try to re-initialize:
>>>
>>> [root at fileserver2 ~]# ipa-replica-manage re-initialize --from
>>> fileserver3.ecg.mit.edu
>>> Directory Manager password:
>>>
>>> [fileserver3.ecg.mit.edu] reports: Replica Busy! Status: [1
>>> Replication error acquiring replica: replica busy]
>>
>> This is a transient condition.
> Fileserver2 is busy?

Yes.

> The /var/log/dirsrv/slapd-ECG-MIT-EDU/errors is
> now full of:
>
> [13/Apr/2012:14:59:19 -0400] NSMMReplicationPlugin - conn=1 op=571
> csn=4f70a9e5000100060000: Can't created glue entry
> cn=fileserver4.ecg.mit.edu,cn=masters,cn=ipa,cn=etc,dc=ecg,dc=mit,dc=edu
> uniqueid=6949d104-775b11e1-abce82a1-a45dd3c3, error 68
>
> Should I delete the LDAP entry which is trying to replicate
> fileserver2 with fileserver4?

Yes.  And it may be due to the fact that the entry it is trying to 
delete has those tombstone children that have to be deleted too.

>
>>> this command has been running for 1/2hr and produced no more output
>>> (fileserver2 is the remaining server running Fedora 15, the others are
>>> Fedora 16 with latest updates).
>>
>> Not sure how ipa-replica-manage handles busy - does it keep trying until it
>> is not busy?
>>
>>
>>>>> Is there some way that I can refresh/clean my LDAP directories and
>>>>> ensure that everything's running correctly.
>>>> We first need to find out what's going on and why you are seeing these
>>>> failures before we can recommend a particular course of action.  There is
>>>> currently no "find all of my problems and fix them" command.
>>> :) Wish there was. It's just that I've been having lots of problems
>>> recently and I was thinking that there is something fundamentally
>>> wrong with my installation. I keep having to ask you guys for help.
>>
>> I think some of these problems were due to the fact that an alpha version of
>> 389 got pushed to the Stable repo in F-16, and in between that alpha version
>> and the real "Stable" version we were forced to change the database format
>> to fix a serious issue, and that introduced some inconsistencies into the
>> database upon upgrade.
> Yeah, I think most of my troubles have started since that version.
> Hope I can get it fixed! :)
>
>>> An additional problem, which Rob Crittenden is helping with, is that
>>> I'm trying to install another replica (fileserver4) which fails when
>>> setting up the CA:
>>>
>>> 2012-04-11 11:30:47,289 CRITICAL failed to configure ca instance
>>> Command '/usr/bin/perl /usr/bin/pkisilent 'ConfigureCA' '-cs_hostname'
>>> 'fileserver4.ecg.mit.edu' '-cs_port' '9445' '-client_certdb_dir'
>>> '/tmp/tmp-JJIkrk' '-client_certdb_pwd' XXXXXXXX '-preop_pin'
>>> 'LI1En8UwjZ2BYDcnu8nJ' '-domain_name' 'IPA' '-admin_user' 'admin'
>>> '-admin_email' 'root at localhost' '-admin_password' XXXXXXXX
>>> '-agent_name' 'ipa-ca-agent' '-agent_key_size' '2048'
>>> '-agent_key_type' 'rsa' '-agent_cert_subject'
>>> 'CN=ipa-ca-agent,O=ECG.MIT.EDU' '-ldap_host' 'fileserver4.ecg.mit.edu'
>>> '-ldap_port' '7389' '-bind_dn' 'cn=Directory Manager' '-bind_password'
>>> XXXXXXXX '-base_dn' 'o=ipaca' '-db_name' 'ipaca' '-key_size' '2048'
>>> '-key_type' 'rsa' '-key_algorithm' 'SHA256withRSA' '-save_p12' 'true'
>>> '-backup_pwd' XXXXXXXX '-subsystem_name' 'pki-cad' '-token_name'
>>> 'internal' '-ca_subsystem_cert_subject_name' 'CN=CA
>>> Subsystem,O=ECG.MIT.EDU' '-ca_ocsp_cert_subject_name' 'CN=OCSP
>>> Subsystem,O=ECG.MIT.EDU' '-ca_server_cert_subject_name'
>>> 'CN=fileserver4.ecg.mit.edu,O=ECG.MIT.EDU'
>>> '-ca_audit_signing_cert_subject_name' 'CN=CA Audit,O=ECG.MIT.EDU'
>>> '-ca_sign_cert_subject_name' 'CN=Certificate Authority,O=ECG.MIT.EDU'
>>> '-external' 'false' '-clone' 'true' '-clone_p12_file' 'ca.p12'
>>> '-clone_p12_password' XXXXXXXX '-sd_hostname'
>>> 'fileserver3.ecg.mit.edu' '-sd_admin_port' '443' '-sd_admin_name'
>>> 'admin' '-sd_admin_password' XXXXXXXX '-clone_start_tls' 'true'
>>> '-clone_uri' 'https://fileserver3.ecg.mit.edu:443'' returned non-zero
>>> exit status 255
>>>
>>> Sorry to dump a tonne of problems in one go, but you can see why I
>>> think there's something (probably several things) badly wrong with my
>>> installation. I guess I was looking for a few very basic things to
>>> check to ensure that the servers are fundamentally configured
>>> properly.
>>
>> Unfortunately, it appears that some of your problems are unexpected and/or
>> have not been seen before.
> Hopefully I can fix them, as long as you don't mind my endless emails
> to the list.... :)

At some point, you may run into diminishing returns trying to fix your 
current broken installation - that is, the time spent playing 
whack-a-mole with these problems might be better spent starting over 
from scratch . . .

>
> Thanks,
>
> Dan



