Error: setroubleshootd dead but subsys locked

Tue Sep 18 16:36:27 UTC 2007

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Steven Stromer wrote:
>>> Had a strange, and as yet unexplained, 'event' (I wasn't in front of the
>>> machine when things went weird) that took place while a system was left
>>> running a large rsync over ssh. On returning, I found that a majority of
>>> the directories under /var had vanished, and a number of services refused to
>>> start after a reboot, including auditd, nfsd, system message bus, hpiod,
>>> hpssd, mysql, syslogd, httpd, sm-client, and setroubleshootd.
>>>
>>> In the cases of most of these services, there seemed to be problems
>>> either with orphaned /var/run/*.pid files, or with orphaned
>>> /var/lock/subsys/* lock files. Also, many services were reporting
>>> 'subsys locked'. Deleting orphaned files, followed by relabeling the
>>> filesystem selinux permissions did the trick, with relabeling being the
>>> key to getting things going again. Debugging was made more challenging
>>> by the fact that I had no logs to refer to.
>>>
>>> Now, almost all seems well, but I can't get setroubleshootd to start
>>> unless I select 'setroubleshootd_disable_trans'. Without this checked,
>>> setroubleshootd seems to start, but then fails:
>>>
>>> [root@file1 subsys]# rm setroubleshootd
>>> rm: remove regular empty file `setroubleshootd'? y
>>> [root@file1 subsys]# service setroubleshoot status
>>> setroubleshootd is stopped
>>> [root@file1 subsys]# service setroubleshoot start
>>> Starting setroubleshootd:                                  [  OK  ]
>>> [root@file1 subsys]# service setroubleshoot status
>>> setroubleshootd dead but subsys locked
>>>
>>>
>>> Attempting to run setroubleshoot generates the error:
>>>
>>> attempt to open server connection failed: (2, 'No such file or
>>> directory')
>>>
>>>
>>> Since someone might ask about permissions:
>>>
>>> [root@file1 subsys]# ls -laRZ /var/log | grep setroubleshoot
>>> drwxr-xr-x  root  root  system_u:object_r:setroubleshoot_var_log_t setroubleshoot
>>> /var/log/setroubleshoot:
>>> drwxr-xr-x  root root system_u:object_r:setroubleshoot_var_log_t .
>>> -rw-r--r--  root root system_u:object_r:setroubleshoot_var_log_t setroubleshootd.log
>>> -rw-r--r--  root root system_u:object_r:setroubleshoot_var_log_t setroubleshootd.log.1
>>> -rw-r--r--  root root system_u:object_r:setroubleshoot_var_log_t setroubleshootd.log.2
>>>
>>>
>>> Can anyone explain why setroubleshootd_disable_trans should need to be
>>> selected? Also, since this entire event seems to have close ties to
>>> selinux, would anyone have an idea what might have happened to this
>>> system?
>>>
>>>
>>> Thanks for any ideas; it's been a long day...
>>>
>>> Steven Stromer
>>
>> You didn't say what OS version you're running :-) This looks a lot like
>> known problems in rawhide (Fedora development). If you are running
>> rawhide, do you have the latest selinux-policy rpm installed? The
>> latest audit?
>>
> 
> Thanks for the reply. I am running FC6, 2.6.22.4-45.fc6. I'm on the
> standard FC6 path, not development, though I'd be really interested to
> see any documentation regarding the 'known problems in rawhide'. Unless
> the system faulted and restarted, activating package updates that had
> not yet witnessed a reboot, I can't see how any updates were applied. As
> far as policy and audit packages, I have:
> 
> selinux-policy.noarch                    2.4.6-80.fc6           installed
> selinux-policy-targeted.noarch           2.4.6-80.fc6           installed
> audit.i386                               1.4.2-5.fc6            installed
> audit-libs.i386                          1.4.2-5.fc6            installed
> audit-libs-python.i386                   1.4.2-5.fc6            installed
> 
>> If setroubleshoot still does not start please look for errors
>> in /var/log/setroubleshoot/setroubleshootd.log
> 
> At present the setroubleshootd logs are entirely empty. /var/log was wiped
> during the 'event' and my backups of these files were also empty.
> 
>>
>> BTW, setroubleshoot failing to start will not harm your system in any
>> manner, nor is it likely to have been the cause of any of your
>> previous problems.
> 
> This I know. It is the last little consequence of a much larger issue. I
> am honestly more concerned with why so many directories and files
> disappeared from /var (despite the fact that I have no disk errors) and
> why selinux permissions had to be changed to get things that were
> working previously to be able to work again. Any further leads would be
> VERY much appreciated!
> 
>> -- 
>> John Dennis <jdennis at redhat.com>
> 
This sounds a lot like a labeling problem. Since you recreated all the
directories under /var, they might not be labeled correctly. You can
relabel them by executing 'restorecon -R -v /var', or you can relabel
the entire system by executing 'touch /.autorelabel; reboot'.
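
A minimal sketch, in shell, of how the state of /var could be checked
before relabeling. The helper names missing_dirs and preview_relabel are
hypothetical, and the setroubleshoot paths in the usage note are
assumptions about where the daemon keeps its socket and database; adjust
them to what your package actually installs.

```shell
#!/bin/sh

# Hypothetical helper: print every argument that is not an existing
# directory, one path per line. Prints nothing if all of them exist.
missing_dirs() {
    for d in "$@"; do
        [ -d "$d" ] || printf '%s\n' "$d"
    done
}

# Hypothetical helper: dry-run relabel of a tree.
#   -R  recurse into subdirectories
#   -n  do not change any labels, only report
#   -v  show each file whose label would change
preview_relabel() {
    if command -v restorecon >/dev/null 2>&1; then
        restorecon -R -n -v "$1"
    else
        echo "restorecon not found; is policycoreutils installed?" >&2
        return 1
    fi
}
```

For example, 'missing_dirs /var/log/setroubleshoot /var/run/setroubleshoot
/var/lib/setroubleshoot' prints any of those directories that were not
recreated, and 'preview_relabel /var' shows what 'restorecon -R -v /var'
would change without touching anything. A missing runtime directory would
be one possible explanation for the 'No such file or directory' error,
though that is only a guess.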
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.7 (GNU/Linux)
Comment: Using GnuPG with Fedora - http://enigmail.mozdev.org

iD8DBQFG7/6LrlYvE4MpobMRAlkfAJ9/kOFoCJrHIQY8q01wecpunX2IOACdFbmc
65rle/j9PUryAIIHVe0Lgxs=
=ChHO
-----END PGP SIGNATURE-----