[Open-scap] RPM based remediation hangs on SUSE.

Šimon Lukašík slukasik at redhat.com
Fri Apr 1 15:16:39 UTC 2016


On 04/01/2016 05:07 PM, Šimon Lukašík wrote:
> Hello Gautam,
> 
> Firstly, let me just express gratitude for in depth analysis. I applaud
> to engineering virtue. Well done!
> 
> I would approach this thing as follows. I would connect with RPM
> maintainers in SuSE, why it needs to get exclusive lock.
> 
> I am not equipped to claim which solution (rpm in RHEL, vs. rpm in SuSE)
> is "better". However, having the lesser constraint seems better until
> there is risk for data corruption. I have been QE of RPM in RHEL many
> years ago, and I don't remember any issues with weak locks.
> 
> 
> Then, for OpenSCAP, we run the probe any time it is needed and we stop
> it only at the scan completion. That has performance benefits (I still
> have to see scanner that performs quicker than OpenSCAP :)). And then
> each probe holds cache of things that have been already queried. So,
> killing the probe is out of question in general.
> 
> However, in this specific case. We already drop all the cache when we do
> remediation (because cache would still return pre-remediated  values).
> So perhaps you can write code to stop all probes before remediation starts.
> 

Ah, this paragraph is void.

The killing is already done. :)

The problem is that when we decide whether the specific <fix> element is
applicable, we run the rpminfo probe again.
https://github.com/OpenSCAP/openscap/blob/78c8706d961270f1878d0639bbceee3f3fb7623f/src/XCCDF_POLICY/xccdf_policy_remediate.c#L135


So, perhaps you can create SuSE specific patch to rpminfo probe to
return the lock after each request&response?

Best,
~š.

> The clean of the cache is here
> https://github.com/OpenSCAP/openscap/blob/78c8706d961270f1878d0639bbceee3f3fb7623f/src/XCCDF/xccdf_session.c#L1413
> 
> Good luck!
> ~š.
> 
> 
> On 03/31/2016 08:32 AM, S, Gautam wrote:
>> Hello folks,
>>
>>  
>>
>> I am looking at a rather obscure issue related to the way the RPM
>> library behaves on SUSE. While trying to run remediation for a rule
>> related to RPMs, the process hangs until it finally times out.
>>
>>  
>>
>> # oscap xccdf eval --profile test --remediate sles11-xccdf.xml
>>
>> Title   Uninstall bind Package
>>
>> Rule    package_bind_removed
>>
>> Ident   CCE-27030-6
>>
>> Result  fail
>>
>>  
>>
>>  
>>
>> --- Starting Remediation ---
>>
>>  
>>
>> Title   Uninstall bind Package
>>
>> Rule    package_bind_removed
>>
>> Ident   CCE-27030-6
>>
>>  
>>
>> <<Hangs for a while here>>
>>
>>  
>>
>> Result  error
>>
>>  
>>
>> I have collected openscap verbose logs and RPM logs and it looks like
>> there is a deadlock.
>>
>>  
>>
>> From RPM verbose mode logs while the fix is running the “rpm –e” operation:
>>
>>  
>>
>> D: opening  db environment /var/lib/rpm/Packages create:cdb:mpool:private
>>
>> D: opening  db index       /var/lib/rpm/Packages create mode=0x42
>>
>> warning: waiting for exclusive lock on /var/lib/rpm/Packages
>>
>> error: cannot get exclusive lock on /var/lib/rpm/Packages
>>
>> D: closed   db index       /var/lib/rpm/Packages
>>
>> D: closed   db environment /var/lib/rpm/Packages
>>
>> error: cannot open Packages index using db3 - Operation not permitted (1)
>>
>>  
>>
>> Using strace, we can see that it attempts to acquire this WR lock
>> multiple times until it finally times out and returns failure:
>>
>>  
>>
>> fcntl(3, F_SETLK, {type=F_WRLCK, whence=SEEK_SET, start=0, len=0}) = -1
>> EAGAIN (Resource temporarily unavailable)
>>
>>  
>>
>> RPM does not get exclusive mode lock because Openscap seems to be
>> acquiring the lock in read mode:
>>
>>  
>>
>> # lsof /var/lib/rpm/Packages [Command run while the hang is seen]
>>
>> COMMAND     PID USER   FD   TYPE DEVICE SIZE/OFF   NODE NAME
>>
>> probe_rpm 16875 root    3rR  REG    8,2 53420032 352263
>> /var/lib/rpm/Packages
>>
>> rpm       16891 root    3u   REG    8,2 53420032 352263
>> /var/lib/rpm/Packages
>>
>>  
>>
>> # ps –a [Command run while the hang is seen]
>>
>>   PID TTY          TIME CMD
>>
>> 16859 pts/0    00:00:00 oscap
>>
>> 16865 pts/0    00:00:00 probe_system_in
>>
>> 16870 pts/0    00:00:00 probe_family
>>
>> 16875 pts/0    00:00:00 probe_rpminfo
>>
>> 16885 pts/0    00:00:00 probe_system_in
>>
>> 16890 pts/0    00:00:00 bash
>>
>> 16891 pts/0    00:00:00 rpm
>>
>> 16934 pts/2    00:00:00 ps
>>
>> #
>>
>>  
>>
>> The RPM library call made in function
>> src/OVAL/probes/unix/linux/rpminfo.c:probe_init, rpmtsCreate() and the
>> subsequent rpmtsInitIterator() result in a read lock being taken on the
>> file /var/lib/rpm/Packages.
>>
>> This read lock seems to be released when rpmtsFree() call is made from
>> probe_fini.
>>
>>  
>>
>> The particular probe_rpminfo in question is not related to the OVAL
>> check of the rule but related to the CPE platform check.
>>
>>  
>>
>> From the additional traces I added to openscap:
>>
>> D: probe_rpminfo: ("seap.msg" ":id" 0 (("rpminfo_object" ":id"
>> "oval:org.open-scap.cpe.sles-release:obj:1" ":oval_version" "5.10.1" )
>> (("name" ":operation" 5 ":var_check" 1 ) "sles-release" ) ) )
>> [probe_rpminfo(7045):Thream
>> Name(7f901b36c700):seap-packet.c:904:SEAP_packet_recv]
>>
>>>>
>> I: probe_rpminfo: gautam : probe_init: Init 1 rpmts
>> [probe_rpminfo(7050):Thream Name(7f45fcd527c0):rpminfo.c:325:probe_init]
>>
>> <<Fix runs between the rpmts create and free .i.e lock acquire and release>>
>>
>> I: probe_rpminfo: gautam : probe_fini: Free 1 rpmts
>> [probe_rpminfo(7045):Thream Name(7f90215a37c0):rpminfo.c:343:probe_fini]
>>
>>  
>>
>> Probe_fini is called in this thread only upon receiving SIGTERM.
>>
>>  
>>
>> *_Why does this work on RHEL?_*
>>
>> On RHEL, the implementation of the RPM command does not seem to be
>> trying to acquire an exclusive mode lock, it also locks in read mode only:
>>
>>  
>>
>> D: opening  db environment /var/lib/rpm cdb:mpool:joinenv
>>
>> D: opening  db index       /var/lib/rpm/Packages create mode=0x42
>>
>> D: sanity checking 1 elements                                     >> No
>> message about lock acquisition here!
>>
>> D: running pre-transaction scripts
>>
>>  
>>
>> # lsof /var/lib/rpm/Packages
>>
>> COMMAND     PID USER   FD   TYPE DEVICE SIZE/OFF    NODE NAME
>>
>> probe_rpm 33539 root    3rR  REG    8,3 75472896 1048584
>> /var/lib/rpm/Packages
>>
>> rpm       33555 root    3uR  REG    8,3 75472896 1048584
>> /var/lib/rpm/Packages
>>
>>  
>>
>> Is there some design document that explains how openscap spawns the CPE
>> platform check? I am trying to understand if the probe can relinquish
>> the lock by freeing the RPMTs structure before executing the remediation
>> script.
>>
>>  
>>
>> Thank you.
>>
>>  
>>
>> Regards,
>>
>> Gautam.
>>
>>
>>
>> _______________________________________________
>> Open-scap-list mailing list
>> Open-scap-list at redhat.com
>> https://www.redhat.com/mailman/listinfo/open-scap-list
>>
> 
> 
> ~š.
> 
> _______________________________________________
> Open-scap-list mailing list
> Open-scap-list at redhat.com
> https://www.redhat.com/mailman/listinfo/open-scap-list
> 


~š.




More information about the Open-scap-list mailing list