[Freeipa-users] General status of my FreeIPA servers - is there a method for cleaning them?

Dan Scott danieljamesscott at gmail.com
Fri Apr 13 19:03:20 UTC 2012


Thanks for the quick response.

Simo: Thanks - I'd prefer to clean it up properly rather than start
from scratch. I haven't changed the LDAP schema at all. All I've done
is use the IPA tools for user administration and to add and remove
replicas.

I just felt like I've been emailing this list once a week or so for
the past few months - I was beginning to think that it was beyond
repair! :)

On Fri, Apr 13, 2012 at 14:38, Rich Megginson <rmeggins at redhat.com> wrote:
> On 04/13/2012 12:22 PM, Dan Scott wrote:
>>
>> On Fri, Apr 13, 2012 at 13:43, Rich Megginson <rmeggins at redhat.com> wrote:
>>>
>>> On 04/13/2012 11:39 AM, Dan Scott wrote:
>>>>
>>>> I'm convinced that my LDAP directories contain lots of cruft which has
>>>> built up and is causing problems on my system. There may even be some
>>>> corruption since there's an entry which I'm unable to remove - this
>>>> entry does not get replicated to the other servers.
>>>
>>>
>>> What version of 389-ds-base is this?  Do you get any errors?  It just
>>> silently fails to delete this particular entry?
>>
>> [root@fileserver1 ~]# rpm -qa|grep 389
>> 389-ds-base-libs-1.2.10.4-2.fc16.x86_64
>> 389-ds-base-1.2.10.4-2.fc16.x86_64
>> [root@fileserver1 ~]# ldapmodify -f rmfileserver5.ldif -D 'cn=directory
>> manager' -W
>> Enter LDAP Password:
>> deleting entry
>> "cn=fileserver5.ecg.mit.edu,cn=masters,cn=ipa,cn=etc,dc=ecg,dc=mit,dc=edu"
>> ldap_delete: Operation not allowed on non-leaf (66)
>>
>> [root@fileserver1 ~]# ldapsearch -D 'cn=directory manager' -W -v -b
>> 'cn=fileserver5.ecg.mit.edu,cn=masters,cn=ipa,cn=etc,dc=ecg,dc=mit,dc=edu'
>> '(objectclass=*)'
>> ldap_initialize( <DEFAULT> )
>> Enter LDAP Password:
>> filter: (objectclass=*)
>> requesting: All userApplication attributes
>> # extended LDIF
>> #
>> # LDAPv3
>> # base <cn=fileserver5.ecg.mit.edu,cn=masters,cn=ipa,cn=etc,dc=ecg,dc=mit,dc=edu> with scope subtree
>> # filter: (objectclass=*)
>> # requesting: ALL
>> #
>>
>> # fileserver5.ecg.mit.edu, masters, ipa, etc, ecg.mit.edu
>> dn: cn=fileserver5.ecg.mit.edu,cn=masters,cn=ipa,cn=etc,dc=ecg,dc=mit,dc=edu
>> cn: fileserver5.ecg.mit.edu
>> objectClass: top
>> objectClass: nsContainer
>>
>> # search result
>> search: 2
>> result: 0 Success
>>
>> # numResponses: 2
>> # numEntries: 1
>> [root@fileserver1 ~]#
>>
>> If I'm interpreting this correctly, it can't be deleted because it's
>> not a leaf node, but it doesn't have any sub-entries that I can delete
>> first.
>
>
> You are correct.  Try this:
>
> ldapsearch -D 'cn=directory manager' -W -v -b
> 'cn=fileserver5.ecg.mit.edu,cn=masters,cn=ipa,cn=etc,dc=ecg,dc=mit,dc=edu'
> '(|(objectclass=nstombstone)(objectclass=*))'

Ahh, so there are some 'child' entries:

[root@fileserver1 ~]# ldapsearch -D 'cn=directory manager' -W -b
'cn=fileserver5.ecg.mit.edu,cn=masters,cn=ipa,cn=etc,dc=ecg,dc=mit,dc=edu'
'(|(objectclass=nstombstone)(objectclass=*))'
Enter LDAP Password:
# extended LDIF
#
# LDAPv3
# base <cn=fileserver5.ecg.mit.edu,cn=masters,cn=ipa,cn=etc,dc=ecg,dc=mit,dc=edu> with scope subtree
# filter: (|(objectclass=nstombstone)(objectclass=*))
# requesting: ALL
#

# aaa2c704-63cf11e1-ac8dadbd-35182efb, fileserver5.ecg.mit.edu, masters, ipa,
  etc, ecg.mit.edu
dn: nsuniqueid=aaa2c704-63cf11e1-ac8dadbd-35182efb,cn=fileserver5.ecg.mit.edu,
 cn=masters,cn=ipa,cn=etc,dc=ecg,dc=mit,dc=edu
objectClass: top
objectClass: nsContainer
objectClass: nsTombstone
cn: fileserver5.ecg.mit.edu
nsParentUniqueId: 4fff591e-e48611e0-bf3681aa-d1a3957d

# 17708e04-63dd11e1-9b079095-05c635b0, fileserver5.ecg.mit.edu, masters, ipa,
  etc, ecg.mit.edu
dn: nsuniqueid=17708e04-63dd11e1-9b079095-05c635b0,cn=fileserver5.ecg.mit.edu,
 cn=masters,cn=ipa,cn=etc,dc=ecg,dc=mit,dc=edu
objectClass: top
objectClass: nsContainer
objectClass: nsTombstone
cn: fileserver5.ecg.mit.edu
nsParentUniqueId: 4fff591e-e48611e0-bf3681aa-d1a3957d

# 5ceb8604-63f211e1-bc108552-1fbf39e2, fileserver5.ecg.mit.edu, masters, ipa,
  etc, ecg.mit.edu
dn: nsuniqueid=5ceb8604-63f211e1-bc108552-1fbf39e2,cn=fileserver5.ecg.mit.edu,
 cn=masters,cn=ipa,cn=etc,dc=ecg,dc=mit,dc=edu
objectClass: top
objectClass: nsContainer
objectClass: nsTombstone
cn: fileserver5.ecg.mit.edu
nsParentUniqueId: 4fff591e-e48611e0-bf3681aa-d1a3957d

# fileserver5.ecg.mit.edu, masters, ipa, etc, ecg.mit.edu
dn: cn=fileserver5.ecg.mit.edu,cn=masters,cn=ipa,cn=etc,dc=ecg,dc=mit,dc=edu
cn: fileserver5.ecg.mit.edu
objectClass: top
objectClass: nsContainer

# c480f184-83f011e1-90d1df13-bba55eff, HTTP, fileserver5.ecg.mit.edu, masters
 , ipa, etc, ecg.mit.edu
dn: nsuniqueid=c480f184-83f011e1-90d1df13-bba55eff,cn=HTTP,cn=fileserver5.ecg.
 mit.edu,cn=masters,cn=ipa,cn=etc,dc=ecg,dc=mit,dc=edu
objectClass: nsContainer
objectClass: ipaConfigObject
objectClass: top
objectClass: nsTombstone
ipaConfigString: enabledService
ipaConfigString: startOrder 40
cn: HTTP
nsParentUniqueId: 1eba8a03-642311e1-9b95afe9-fc1b53ef

# search result
search: 2
result: 0 Success

# numResponses: 6
# numEntries: 5

Is it safe to delete them?
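
For anyone following along, here is a minimal Python sketch of how the
search output above could be split into live entries and tombstones. It
only reads saved ldapsearch text; it does not talk to the server and it
says nothing about whether deleting anything is safe. The function names
and the SAMPLE text (trimmed from the output above) are my own, and I'm
assuming plain "attr: value" lines (base64 "attr:: value" lines are not
decoded).

```python
def unfold_ldif(text):
    """Join LDIF continuation lines (those starting with one space) to the previous line."""
    lines = []
    for raw in text.splitlines():
        if raw.startswith(" ") and lines:
            lines[-1] += raw[1:]
        else:
            lines.append(raw)
    return lines


def split_entries(text):
    """Return a list of entries, each a list of (attribute, value) pairs."""
    entries, current = [], []
    for line in unfold_ldif(text):
        if not line.strip():
            # A blank line ends the current entry.
            if current:
                entries.append(current)
                current = []
            continue
        if line.startswith("#"):
            continue  # ldapsearch comment line
        attr, sep, value = line.partition(":")
        if sep:
            current.append((attr.strip(), value.strip()))
    if current:
        entries.append(current)
    return entries


def classify(text):
    """Return (live_dns, tombstone_dns) based on the nsTombstone object class."""
    live, tombstones = [], []
    for entry in split_entries(text):
        dn = next((v for a, v in entry if a.lower() == "dn"), None)
        if dn is None:
            continue  # e.g. the trailing "search:" / "result:" block
        classes = {v.lower() for a, v in entry if a.lower() == "objectclass"}
        (tombstones if "nstombstone" in classes else live).append(dn)
    return live, tombstones


# Trimmed copy of two entries from the output above (illustrative only).
SAMPLE = """\
# extended LDIF
dn: nsuniqueid=aaa2c704-63cf11e1-ac8dadbd-35182efb,cn=fileserver5.ecg.mit.edu,
 cn=masters,cn=ipa,cn=etc,dc=ecg,dc=mit,dc=edu
objectClass: top
objectClass: nsContainer
objectClass: nsTombstone

dn: cn=fileserver5.ecg.mit.edu,cn=masters,cn=ipa,cn=etc,dc=ecg,dc=mit,dc=edu
cn: fileserver5.ecg.mit.edu
objectClass: top
objectClass: nsContainer

search: 2
result: 0 Success
"""

if __name__ == "__main__":
    live, tombstones = classify(SAMPLE)
    print("live:", live)
    print("tombstones:", tombstones)
```

Run against the full output above, this should report one live entry
(the fileserver5 container) and four tombstones, which matches the
"numEntries: 5" line.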

>>>> I also see
>>>> inconsistent replication states on the servers. i.e. server1 shows
>>>> that it's replicating with server2 but server2 does not show that it's
>>>> replicating with server1.
>>>
>>>
>>> Do you have errors in the server2 log showing that it is attempting to
>>> replicate with server1 but failing with some error?
>>
>> [root@fileserver1 ~]# ipa-csreplica-manage list -v fileserver1.ecg.mit.edu
>> Directory Manager password:
>>
>> fileserver2.ecg.mit.edu
>>   last init status: None
>>   last init ended: None
>>   last update status: 0 Replica acquired successfully: Incremental
>> update succeeded
>>   last update ended: 2012-04-13 17:57:39+00:00
>> [root@fileserver1 ~]# ipa-csreplica-manage list -v fileserver2.ecg.mit.edu
>> Directory Manager password:
>>
>> fileserver1.ecg.mit.edu
>>   last init status: None
>>   last init ended: None
>>   last update status: 0 Replica acquired successfully: Incremental
>> update succeeded
>>   last update ended: 2012-04-13 17:57:41+00:00
>> fileserver3.ecg.mit.edu
>>   last init status: None
>>   last init ended: None
>>   last update status: 0 Replica acquired successfully: Incremental
>> update succeeded
>>   last update ended: 2012-04-13 17:57:41+00:00
>> [root@fileserver1 ~]# ipa-csreplica-manage list -v fileserver3.ecg.mit.edu
>> Directory Manager password:
>>
>> fileserver2.ecg.mit.edu
>>   last init status: None
>>   last init ended: None
>>   last update status: 0 Replica acquired successfully: Incremental
>> update succeeded
>>   last update ended: 2012-04-13 17:57:44+00:00
>> fileserver1.ecg.mit.edu
>>   last init status: None
>>   last init ended: None
>>   last update status: 0 Replica acquired successfully: Incremental
>> update succeeded
>>   last update ended: 2012-04-13 17:57:43+00:00
>> [root@fileserver1 ~]#
>>
>> fileserver1's (and fileserver2's) /var/log/dirsrv/slapd-PKI-IPA/errors
>> contains lots of:
>> [13/Apr/2012:13:57:43 -0400] NSMMReplicationPlugin -
>> repl_set_mtn_referrals: could not set referrals for replica o=ipaca:
>> 20
>
>
> This error usually means a replica was deleted and the RUV needs to be
> cleaned.
> see http://port389.org/wiki/Howto:CLEANRUV
> and
> https://fedorahosted.org/freeipa/ticket/2303
> and
> https://fedorahosted.org/389/ticket/337

OK, I've seen this before - is it important to remove them? I've had
to add and remove replicas so much that I don't really want to do it
unless it's necessary. I'm happy to live with them if it's not a
problem.
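
To make the asymmetry in the ipa-csreplica-manage listings quoted above
concrete, here is a small Python sketch. The AGREEMENTS table is
transcribed by hand from those three listings (it is not produced by any
IPA tool), and the function name is my own. It simply reports every
agreement that is listed in one direction only.

```python
# Peers listed by "ipa-csreplica-manage list -v <host>" in the output
# quoted above, transcribed by hand.
AGREEMENTS = {
    "fileserver1.ecg.mit.edu": {"fileserver2.ecg.mit.edu"},
    "fileserver2.ecg.mit.edu": {"fileserver1.ecg.mit.edu",
                                "fileserver3.ecg.mit.edu"},
    "fileserver3.ecg.mit.edu": {"fileserver2.ecg.mit.edu",
                                "fileserver1.ecg.mit.edu"},
}


def one_way_agreements(agreements):
    """Return sorted (source, target) pairs where source lists target
    but target does not list source."""
    missing = []
    for source, targets in agreements.items():
        for target in targets:
            if source not in agreements.get(target, set()):
                missing.append((source, target))
    return sorted(missing)


if __name__ == "__main__":
    for source, target in one_way_agreements(AGREEMENTS):
        print(f"{source} lists {target}, but not the other way round")
```

With the table above, the only one-way pair is fileserver3 listing
fileserver1 while fileserver1 does not list fileserver3.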

>> fileserver3's /var/log/dirsrv/slapd-PKI-IPA/errors contains lots of:
>> [13/Apr/2012:13:52:50 -0400] slapi_ldap_bind - Error: could not send
>> startTLS request: error -1 (Can't contact LDAP server) errno 107
>> (Transport endpoint is not connected)
>
>
> This is a real connection error - could be cert or hostname lookup related.

How do I find out if it's cert or hostname lookup? Which hostname?
Fileserver3 runs DNS, and it seems to be working fine.
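
One way I could at least narrow down the hostname side is a quick check
from fileserver3 against each peer. This is only a sketch using Python's
standard socket module: the function names are mine, port 7389 is taken
from the pkisilent command further down (I'm assuming that is the port
the PKI-IPA instance replicates on), and it says nothing about
certificates, since it never performs the startTLS step.

```python
import socket


def forward_reverse_match(hostname,
                          resolve=socket.gethostbyname,
                          reverse=socket.gethostbyaddr):
    """Resolve hostname to an address and back again.

    Returns (address, reverse_name, matches). The resolver functions are
    parameters so the check can be exercised without real DNS.
    """
    address = resolve(hostname)
    reverse_name = reverse(address)[0]
    matches = reverse_name.lower().rstrip(".") == hostname.lower().rstrip(".")
    return address, reverse_name, matches


def can_connect(host, port, timeout=3.0):
    """Return True if a plain TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # Peer names from this thread; the port is an assumption (see above).
    for peer in ("fileserver1.ecg.mit.edu", "fileserver2.ecg.mit.edu"):
        try:
            print(peer, forward_reverse_match(peer), can_connect(peer, 7389))
        except OSError as exc:
            print(peer, "lookup failed:", exc)
```

If the forward and reverse names agree and the TCP connection succeeds,
that would point away from hostname lookup and towards the certificate
side, though it would not prove it.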

>> [13/Apr/2012:13:57:39 -0400] NSMMReplicationPlugin -
>> repl_set_mtn_referrals: could not set referrals for replica o=ipaca:
>> 20
>>
>> fileserver2's non-PKI replication agreements to both fileserver1 and 3
>> are in place, but both say: Incremental update has failed and requires
>> administrator actionSystem error.
>
>
>
>> When I try to re-initialize:
>>
>> [root@fileserver2 ~]# ipa-replica-manage re-initialize --from
>> fileserver3.ecg.mit.edu
>> Directory Manager password:
>>
>> [fileserver3.ecg.mit.edu] reports: Replica Busy! Status: [1
>> Replication error acquiring replica: replica busy]
>
>
> This is a transient condition.

Fileserver2 is busy? The /var/log/dirsrv/slapd-ECG-MIT-EDU/errors is
now full of:

[13/Apr/2012:14:59:19 -0400] NSMMReplicationPlugin - conn=1 op=571
csn=4f70a9e5000100060000: Can't created glue entry
cn=fileserver4.ecg.mit.edu,cn=masters,cn=ipa,cn=etc,dc=ecg,dc=mit,dc=edu
uniqueid=6949d104-775b11e1-abce82a1-a45dd3c3, error 68

Should I delete the LDAP entry which is trying to replicate
fileserver2 with fileserver4?
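
Since these logs are long and repetitive, here is a rough Python sketch
for summarising an errors file by reporting component and trailing
numeric code. The regular expressions and names are my own, and I'm
assuming the trailing numbers (20 and 68 here) are LDAP result codes;
the names in the table are the standard LDAP ones for 20, 66 and 68. The
sample lines are the ones quoted in this thread, joined back onto single
lines.

```python
import re
from collections import Counter

# Standard LDAP result code names for the codes seen in this thread.
LDAP_RESULT_NAMES = {
    20: "attributeOrValueExists",
    66: "notAllowedOnNonLeaf",
    68: "entryAlreadyExists",
}

# "[timestamp] component - message"
LINE_RE = re.compile(r"^\[(?P<ts>[^\]]+)\]\s+(?P<source>\S+)\s+-\s+(?P<msg>.*)$")
# A number at the very end of the message, after "error" or a colon.
CODE_RE = re.compile(r"(?:\berror|:)\s+(\d+)\s*$")


def summarize(log_text):
    """Count log lines per (component, trailing code); code is None if absent."""
    counts = Counter()
    for line in log_text.splitlines():
        m = LINE_RE.match(line)
        if not m:
            continue
        code_match = CODE_RE.search(m.group("msg"))
        code = int(code_match.group(1)) if code_match else None
        counts[(m.group("source"), code)] += 1
    return counts


SAMPLE_LOG = """\
[13/Apr/2012:13:57:43 -0400] NSMMReplicationPlugin - repl_set_mtn_referrals: could not set referrals for replica o=ipaca: 20
[13/Apr/2012:13:57:39 -0400] NSMMReplicationPlugin - repl_set_mtn_referrals: could not set referrals for replica o=ipaca: 20
[13/Apr/2012:14:59:19 -0400] NSMMReplicationPlugin - conn=1 op=571 csn=4f70a9e5000100060000: Can't created glue entry cn=fileserver4.ecg.mit.edu,cn=masters,cn=ipa,cn=etc,dc=ecg,dc=mit,dc=edu uniqueid=6949d104-775b11e1-abce82a1-a45dd3c3, error 68
"""

if __name__ == "__main__":
    for (source, code), n in summarize(SAMPLE_LOG).most_common():
        name = LDAP_RESULT_NAMES.get(code, "unknown")
        print(f"{n:4d}  {source}  code={code} ({name})")
```

If the assumption about result codes holds, error 68 on the glue entry
would mean the server believes an entry with that DN already exists.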

>> This command has been running for half an hour and has produced no
>> more output (fileserver2 is the remaining server running Fedora 15;
>> the others are Fedora 16 with the latest updates).
>
>
> Not sure how ipa-replica-manage handles busy - does it keep trying until it
> is not busy?
>
>
>>
>>>> Is there some way that I can refresh/clean my LDAP directories and
>>>> ensure that everything's running correctly?
>>>
>>> We first need to find out what's going on and why you are seeing these
>>> failures before we can recommend a particular course of action.  There is
>>> currently no "find all of my problems and fix them" command.
>>
>> :) Wish there was. It's just that I've been having lots of problems
>> recently and I was thinking that there is something fundamentally
>> wrong with my installation. I keep having to ask you guys for help.
>
>
> I think some of these problems were due to the fact that an alpha version of
> 389 got pushed to the Stable repo in F-16, and in between that alpha version
> and the real "Stable" version we were forced to change the database format
> to fix a serious issue, and that introduced some inconsistencies into the
> database upon upgrade.

Yeah, I think most of my troubles have started since that version.
Hope I can get it fixed! :)

>> An additional problem, which Rob Crittenden is helping with, is that
>> I'm trying to install another replica (fileserver4) which fails when
>> setting up the CA:
>>
>> 2012-04-11 11:30:47,289 CRITICAL failed to configure ca instance
>> Command '/usr/bin/perl /usr/bin/pkisilent 'ConfigureCA' '-cs_hostname'
>> 'fileserver4.ecg.mit.edu' '-cs_port' '9445' '-client_certdb_dir'
>> '/tmp/tmp-JJIkrk' '-client_certdb_pwd' XXXXXXXX '-preop_pin'
>> 'LI1En8UwjZ2BYDcnu8nJ' '-domain_name' 'IPA' '-admin_user' 'admin'
>> '-admin_email' 'root@localhost' '-admin_password' XXXXXXXX
>> '-agent_name' 'ipa-ca-agent' '-agent_key_size' '2048'
>> '-agent_key_type' 'rsa' '-agent_cert_subject'
>> 'CN=ipa-ca-agent,O=ECG.MIT.EDU' '-ldap_host' 'fileserver4.ecg.mit.edu'
>> '-ldap_port' '7389' '-bind_dn' 'cn=Directory Manager' '-bind_password'
>> XXXXXXXX '-base_dn' 'o=ipaca' '-db_name' 'ipaca' '-key_size' '2048'
>> '-key_type' 'rsa' '-key_algorithm' 'SHA256withRSA' '-save_p12' 'true'
>> '-backup_pwd' XXXXXXXX '-subsystem_name' 'pki-cad' '-token_name'
>> 'internal' '-ca_subsystem_cert_subject_name' 'CN=CA
>> Subsystem,O=ECG.MIT.EDU' '-ca_ocsp_cert_subject_name' 'CN=OCSP
>> Subsystem,O=ECG.MIT.EDU' '-ca_server_cert_subject_name'
>> 'CN=fileserver4.ecg.mit.edu,O=ECG.MIT.EDU'
>> '-ca_audit_signing_cert_subject_name' 'CN=CA Audit,O=ECG.MIT.EDU'
>> '-ca_sign_cert_subject_name' 'CN=Certificate Authority,O=ECG.MIT.EDU'
>> '-external' 'false' '-clone' 'true' '-clone_p12_file' 'ca.p12'
>> '-clone_p12_password' XXXXXXXX '-sd_hostname'
>> 'fileserver3.ecg.mit.edu' '-sd_admin_port' '443' '-sd_admin_name'
>> 'admin' '-sd_admin_password' XXXXXXXX '-clone_start_tls' 'true'
>> '-clone_uri' 'https://fileserver3.ecg.mit.edu:443'' returned non-zero
>> exit status 255
>>
>> Sorry to dump a tonne of problems in one go, but you can see why I
>> think there's something (probably several things) badly wrong with my
>> installation. I guess I was looking for a few very basic things to
>> check to ensure that the servers are fundamentally configured
>> properly.
>
>
> Unfortunately, it appears that some of your problems are unexpected and/or
> have not been seen before.

Hopefully I can fix them, as long as you don't mind my endless emails
to the list.... :)

Thanks,

Dan



