[Freeipa-users] Antwort: Re: Haunted servers?

thierry bordaz tbordaz at redhat.com
Fri May 29 07:41:28 UTC 2015


On 05/29/2015 08:16 AM, Christoph Kaminski wrote:
> freeipa-users-bounces at redhat.com wrote on 28.05.2015 13:23:26:
>
> > From: Alexander Frolushkin <Alexander.Frolushkin at megafon.ru>
> > To: "'thierry bordaz'" <tbordaz at redhat.com>
> > Cc: "freeipa-users at redhat.com" <freeipa-users at redhat.com>
> > Date: 28.05.2015 13:24
> > Subject: Re: [Freeipa-users] Haunted servers?
> > Sent by: freeipa-users-bounces at redhat.com
> >
> > Unfortunately, after a couple of minutes, the error comes back on two
> > of the three servers in a slightly changed form:
> > # ipa-replica-manage list-ruv
> > unable to decode: {replica 16}
> > ....
> >
> > Before cleanruv it looked like:
> > # ipa-replica-manage list-ruv
> > unable to decode: {replica 16} 548a8126000000100000 548a8126000000100000
> > ....
> >
> > And one server seems to be fixed completely.
> >
> > WBR,
> > Alexander Frolushkin
> >
> >
>
> We had the same problem (and some more), and yesterday we
> successfully cleaned the ghost RIDs.
>
> our fix:

Hi Christoph,

Thanks for sharing this procedure. This bug is difficult to work around,
and it is a good idea to write it down.

>
> 1. Stop all cleanallruv tasks with ipa-replica-manage abort-clean-ruv,
> if that works. It did not work here, so we did it manually on ALL
> replicas:
>         a) replica stop
>         b) delete all nsds5ReplicaClean from 
> /etc/dirsrv/slapd-HSO/dse.ldif
>         c) replica start
>
Yes, the ability to abort a clean RUV hits the same retry issue as
cleanallruv. It has been addressed with
https://fedorahosted.org/389/ticket/48154
> 2. Prepare on EACH IPA server a cleanruv LDIF file with ALL ghost RIDs
> inside (really ALL from all IPA replicas; we had some RIDs only on some
> replicas...)
> Example:
>
> dn: cn=replica,cn=dc\3Dexample,cn=mapping tree,cn=config
> changetype: modify
> replace: nsds5task
> nsds5task:CLEANRUV11
>
> dn: cn=replica,cn=dc\3Dexample,cn=mapping tree,cn=config
> changetype: modify
> replace: nsds5task
> nsds5task:CLEANRUV22
>
> dn: cn=replica,cn=dc\3Dexample,cn=mapping tree,cn=config
> changetype: modify
> replace: nsds5task
> nsds5task:CLEANRUV37
> ...

It should work, but I would prefer to do it in a different order.
We need to clean a specific RID on all replicas at the same time. We do
not need to clean all RIDs at the same time.
Running several CLEANRUV tasks in parallel for different RIDs should work,
but I do not know how much it has been tested that way.

So I would recommend cleaning RID 11 in parallel on all replicas. Then,
when it is completed, RID 22. Then RID 37.
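
To make the ordering concrete, here is a minimal Python sketch of the schedule, not a tested tool. It only builds the plan: one round per RID, and in each round the same ldapmodify command (options taken from step 3 of the quoted procedure) for every replica. The replica names are hypothetical, and the per-RID file names assume the LDIF has been split into files named like cleanruv-11-file.ldif. It does not check that a round has really completed; that still has to be verified by hand before starting the next one.

```python
def ldapmodify_argv(ldif_file):
    """ldapmodify invocation from step 3 of the quoted procedure (run locally on a replica)."""
    return ["ldapmodify", "-h", "127.0.0.1", "-D", "cn=Directory Manager",
            "-W", "-x", "-f", ldif_file]


def cleanup_rounds(rids, replicas):
    """Yield one round per RID; each round lists the command to run on every replica."""
    for rid in rids:
        # Assumed naming scheme: one LDIF file per RID.
        argv = ldapmodify_argv(f"cleanruv-{rid}-file.ldif")
        yield rid, [(replica, argv) for replica in replicas]


if __name__ == "__main__":
    # Hypothetical replica names, for illustration only.
    replicas = ["ipa1.example", "ipa2.example", "ipa3.example"]
    for rid, commands in cleanup_rounds([11, 22, 37], replicas):
        print(f"Round for RID {rid}: run on all replicas at the same time")
        for replica, argv in commands:
            print(f"  {replica}: {' '.join(argv)}")
        print("  (wait until this RID is cleaned everywhere before continuing)")
```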

>
> 3. Run "ldapmodify -h 127.0.0.1 -D "cn=Directory Manager" -W -x -f
> $your-cleanruv-file.ldif" on all replicas AT THE SAME TIME :) We used
> terminator for it (https://launchpad.net/terminator). You can open
> multiple shell windows inside one window and send the same commands
> to all of them at the same time...

Same remark: I would split your-cleanruv-file.ldif into three files,
cleanruv-11-file.ldif,...
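
As an illustration of that split, a small Python sketch that writes one LDIF file per RID. The replica DN is copied from the example quoted above and must be adjusted to the real suffix; the function names are made up for this sketch, and the file naming simply follows cleanruv-11-file.ldif.

```python
from pathlib import Path

# DN copied from the example in this thread; adjust it to your own suffix.
REPLICA_DN = r"cn=replica,cn=dc\3Dexample,cn=mapping tree,cn=config"


def cleanruv_ldif(rid, replica_dn=REPLICA_DN):
    """Return an LDIF modify that sets a CLEANRUV task for a single RID."""
    return (
        f"dn: {replica_dn}\n"
        "changetype: modify\n"
        "replace: nsds5task\n"
        f"nsds5task: CLEANRUV{rid}\n"
    )


def write_cleanruv_files(rids, directory="."):
    """Write one cleanruv-<rid>-file.ldif per RID and return the paths."""
    paths = []
    for rid in rids:
        path = Path(directory) / f"cleanruv-{rid}-file.ldif"
        path.write_text(cleanruv_ldif(rid))
        paths.append(path)
    return paths
```

Calling write_cleanruv_files([11, 22, 37]) would leave three files in the current directory, each holding a single modify operation, so that each RID can be handled on its own.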
>
> 4. We re-initialized each IPA server from our first master.

Do you mean a total init? I do not see a real need for that.
If you are ready to reinit all replicas, there is a very simple way to 
get rid of all these ghost RIDs.

  * Select the "good" master that has all the updates
  * Do an LDIF export without the replication data
  * Do an LDIF import of the exported file
  * Do an online reinit of the full topology, cascading from the "good"
    master down to the "consumers"

Most of the time we try to avoid asking for a full reinit of the topology
because the databases are large.
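
For the cascading reinit, the order matters: a replica should only be re-initialized from a supplier that has already been re-initialized itself (or from the "good" master). A minimal Python sketch of that ordering follows, assuming the replication agreements are written down by hand as a supplier-to-consumers map. The host names are hypothetical, and the sketch only computes the order; it does not run any reinit.

```python
from collections import deque


def reinit_order(agreements, good_master):
    """Return (supplier, consumer) pairs in the order the consumers should be re-initialized."""
    order = []
    done = {good_master}          # servers that already hold the clean data
    queue = deque([good_master])
    while queue:
        supplier = queue.popleft()
        for consumer in agreements.get(supplier, []):
            if consumer not in done:
                done.add(consumer)
                order.append((supplier, consumer))
                queue.append(consumer)
    return order


if __name__ == "__main__":
    # Hypothetical topology: m1 is the "good" master.
    agreements = {
        "m1": ["m2", "r1"],
        "m2": ["m1", "r2"],
        "r1": ["r2"],
    }
    for supplier, consumer in reinit_order(agreements, "m1"):
        print(f"reinit {consumer} from {supplier}")
```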

>
> 5. restart of all replicas
>
> We are not sure about points 3 and 4. Maybe they are not necessary,
> but we did them.
>
> If something fails, look for defective LDAP entries in the whole LDAP
> tree; we had some entries with 'nsunique-$HASH' after the 'normal' name.
> We deleted them.
Do you mean entries with the 'nsuniqueid' attribute in the RDN? These can
be created during replication conflicts, when updates are received in
parallel on different replicas.


thanks
thierry
>
> Best regards,
> Christoph Kaminski
>
>
