[Open-scap] RPM based remediation hangs on SUSE.

Šimon Lukašík slukasik at redhat.com
Fri Apr 1 15:07:35 UTC 2016


Hello Gautam,

First, let me express my gratitude for the in-depth analysis. I applaud
the engineering effort. Well done!

I would approach this as follows: I would ask the RPM maintainers at
SUSE why rpm needs to take an exclusive lock there.

I am not equipped to claim which solution (rpm in RHEL vs. rpm in SUSE)
is "better". However, the weaker constraint seems preferable as long as
there is no risk of data corruption. I was QE for RPM in RHEL many
years ago, and I don't remember any issues with weak locks.
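
To make the lock interaction concrete, here is a minimal sketch (not
OpenSCAP or rpm code) of a shared fcntl lock held by one process making
a non-blocking exclusive lock fail in another. The temporary file only
stands in for /var/lib/rpm/Packages, and the names try_exclusive_lock
and demo are made up for illustration.

```python
import errno
import fcntl
import os
import tempfile


def try_exclusive_lock(path):
    """Return True if a non-blocking write lock on `path` can be taken."""
    with open(path, "r+b") as f:
        try:
            # Roughly fcntl(fd, F_SETLK, {type=F_WRLCK, start=0, len=0}).
            fcntl.lockf(f, fcntl.LOCK_EX | fcntl.LOCK_NB)
        except OSError as exc:
            if exc.errno in (errno.EAGAIN, errno.EACCES):
                return False
            raise
        return True  # the lock is dropped when the file is closed


def demo():
    """Hold a read lock in a child process and probe it from the parent."""
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(b"stand-in for /var/lib/rpm/Packages")
        path = tmp.name

    ready_r, ready_w = os.pipe()  # child -> parent: "read lock is held"
    done_r, done_w = os.pipe()    # parent -> child: "you may exit"

    pid = os.fork()
    if pid == 0:
        # Child: plays the role of the long-lived probe.
        os.close(ready_r)
        os.close(done_w)
        with open(path, "rb") as f:
            fcntl.lockf(f, fcntl.LOCK_SH)
            os.write(ready_w, b"1")
            os.read(done_r, 1)    # keep the lock until told to stop
        os._exit(0)

    os.close(ready_w)
    os.close(done_r)
    os.read(ready_r, 1)           # wait until the child holds the lock
    while_held = try_exclusive_lock(path)
    os.write(done_w, b"1")
    os.waitpid(pid, 0)            # child exit releases its read lock
    after_exit = try_exclusive_lock(path)
    os.close(ready_r)
    os.close(done_w)
    os.unlink(path)
    return while_held, after_exit


if __name__ == "__main__":
    print(demo())
```

The first value reports whether the exclusive lock could be taken while
the reader was alive, the second whether it could be taken once the
reader had exited.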


As for OpenSCAP, we start a probe whenever it is needed and stop it
only when the scan completes. That has performance benefits (I have yet
to see a scanner that performs faster than OpenSCAP :)). Each probe
also holds a cache of things that have already been queried. So
killing the probe is out of the question in general.

However, in this specific case we already drop the whole cache when we do
remediation (because the cache would still return pre-remediation values).
So perhaps you can write code to stop all probes before remediation starts.
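
As a rough illustration of that ordering only (this is not the OpenSCAP
API; the real change would be in C, next to the cache cleanup), the
sketch below terminates helper processes with SIGTERM and waits for
them before running the fix command. The names stop_helpers and
remediate, and the timeout value, are hypothetical.

```python
import signal
import subprocess
import sys


def stop_helpers(helpers, timeout=5.0):
    """Ask each helper to exit via SIGTERM, then wait for it to go away."""
    for proc in helpers:
        if proc.poll() is None:
            proc.send_signal(signal.SIGTERM)
    for proc in helpers:
        try:
            proc.wait(timeout=timeout)
        except subprocess.TimeoutExpired:
            # Helper ignored SIGTERM; fall back to SIGKILL.
            proc.kill()
            proc.wait()


def remediate(helpers, fix_cmd):
    """Stop all helpers first, so none holds a lock while the fix runs."""
    stop_helpers(helpers)
    return subprocess.run(fix_cmd).returncode


if __name__ == "__main__":
    # A sleeping child stands in for a probe; a no-op stands in for the fix.
    helper = subprocess.Popen([sys.executable, "-c", "import time; time.sleep(60)"])
    print(remediate([helper], [sys.executable, "-c", "pass"]))
```

Since the cache is dropped at remediation anyway, restarting the
helpers afterwards should cost nothing beyond the process startup.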

The cache cleanup is here:
https://github.com/OpenSCAP/openscap/blob/78c8706d961270f1878d0639bbceee3f3fb7623f/src/XCCDF/xccdf_session.c#L1413

Good luck!
~š.


On 03/31/2016 08:32 AM, S, Gautam wrote:
> Hello folks,
> 
>  
> 
> I am looking at a rather obscure issue related to the way the RPM
> library behaves on SUSE. While trying to run remediation for a rule
> related to RPMs, the process hangs until it finally times out.
> 
>  
> 
> # oscap xccdf eval --profile test --remediate sles11-xccdf.xml
> 
> Title   Uninstall bind Package
> 
> Rule    package_bind_removed
> 
> Ident   CCE-27030-6
> 
> Result  fail
> 
>  
> 
>  
> 
> --- Starting Remediation ---
> 
>  
> 
> Title   Uninstall bind Package
> 
> Rule    package_bind_removed
> 
> Ident   CCE-27030-6
> 
>  
> 
> <<Hangs for a while here>>
> 
>  
> 
> Result  error
> 
>  
> 
> I have collected openscap verbose logs and RPM logs and it looks like
> there is a deadlock.
> 
>  
> 
> From the RPM verbose-mode logs while the fix is running the "rpm -e" operation:
> 
>  
> 
> D: opening  db environment /var/lib/rpm/Packages create:cdb:mpool:private
> 
> D: opening  db index       /var/lib/rpm/Packages create mode=0x42
> 
> warning: waiting for exclusive lock on /var/lib/rpm/Packages
> 
> error: cannot get exclusive lock on /var/lib/rpm/Packages
> 
> D: closed   db index       /var/lib/rpm/Packages
> 
> D: closed   db environment /var/lib/rpm/Packages
> 
> error: cannot open Packages index using db3 - Operation not permitted (1)
> 
>  
> 
> Using strace, we can see that it attempts to acquire this WR lock
> multiple times until it finally times out and returns failure:
> 
>  
> 
> fcntl(3, F_SETLK, {type=F_WRLCK, whence=SEEK_SET, start=0, len=0}) = -1
> EAGAIN (Resource temporarily unavailable)
> 
>  
> 
> RPM does not get exclusive mode lock because Openscap seems to be
> acquiring the lock in read mode:
> 
>  
> 
> # lsof /var/lib/rpm/Packages [Command run while the hang is seen]
> 
> COMMAND     PID USER   FD   TYPE DEVICE SIZE/OFF   NODE NAME
> 
> probe_rpm 16875 root    3rR  REG    8,2 53420032 352263
> /var/lib/rpm/Packages
> 
> rpm       16891 root    3u   REG    8,2 53420032 352263
> /var/lib/rpm/Packages
> 
>  
> 
> # ps -a [Command run while the hang is seen]
> 
>   PID TTY          TIME CMD
> 
> 16859 pts/0    00:00:00 oscap
> 
> 16865 pts/0    00:00:00 probe_system_in
> 
> 16870 pts/0    00:00:00 probe_family
> 
> 16875 pts/0    00:00:00 probe_rpminfo
> 
> 16885 pts/0    00:00:00 probe_system_in
> 
> 16890 pts/0    00:00:00 bash
> 
> 16891 pts/0    00:00:00 rpm
> 
> 16934 pts/2    00:00:00 ps
> 
> #
> 
>  
> 
> The RPM library call made in function
> src/OVAL/probes/unix/linux/rpminfo.c:probe_init, rpmtsCreate() and the
> subsequent rpmtsInitIterator() result in a read lock being taken on the
> file /var/lib/rpm/Packages.
> 
> This read lock seems to be released when rpmtsFree() call is made from
> probe_fini.
> 
>  
> 
> The particular probe_rpminfo in question is not related to the OVAL
> check of the rule but related to the CPE platform check.
> 
>  
> 
> From the additional traces I added to openscap:
> 
> D: probe_rpminfo: ("seap.msg" ":id" 0 (("rpminfo_object" ":id"
> "oval:org.open-scap.cpe.sles-release:obj:1" ":oval_version" "5.10.1" )
> (("name" ":operation" 5 ":var_check" 1 ) "sles-release" ) ) )
> [probe_rpminfo(7045):Thream
> Name(7f901b36c700):seap-packet.c:904:SEAP_packet_recv]
> 
>
> I: probe_rpminfo: gautam : probe_init: Init 1 rpmts
> [probe_rpminfo(7050):Thream Name(7f45fcd527c0):rpminfo.c:325:probe_init]
> 
> <<Fix runs between the rpmts create and free, i.e. between lock acquire and release>>
> 
> I: probe_rpminfo: gautam : probe_fini: Free 1 rpmts
> [probe_rpminfo(7045):Thream Name(7f90215a37c0):rpminfo.c:343:probe_fini]
> 
>  
> 
> Probe_fini is called in this thread only upon receiving SIGTERM.
> 
>  
> 
> *_Why does this work on RHEL?_*
> 
> On RHEL, the implementation of the RPM command does not seem to be
> trying to acquire an exclusive mode lock, it also locks in read mode only:
> 
>  
> 
> D: opening  db environment /var/lib/rpm cdb:mpool:joinenv
> 
> D: opening  db index       /var/lib/rpm/Packages create mode=0x42
> 
> D: sanity checking 1 elements                                     >> No
> message about lock acquisition here!
> 
> D: running pre-transaction scripts
> 
>  
> 
> # lsof /var/lib/rpm/Packages
> 
> COMMAND     PID USER   FD   TYPE DEVICE SIZE/OFF    NODE NAME
> 
> probe_rpm 33539 root    3rR  REG    8,3 75472896 1048584
> /var/lib/rpm/Packages
> 
> rpm       33555 root    3uR  REG    8,3 75472896 1048584
> /var/lib/rpm/Packages
> 
>  
> 
> Is there some design document that explains how openscap spawns the CPE
> platform check? I am trying to understand if the probe can relinquish
> the lock by freeing the rpmts structure before executing the remediation
> script.
> 
>  
> 
> Thank you.
> 
>  
> 
> Regards,
> 
> Gautam.
> 
> 
> 
> _______________________________________________
> Open-scap-list mailing list
> Open-scap-list at redhat.com
> https://www.redhat.com/mailman/listinfo/open-scap-list
> 

