[redhat-lspp] LSPP Development Telecon 04/24/2006 Minutes
George C. Wilson
ltcgcw at us.ibm.com
Tue Apr 25 00:46:39 UTC 2006
These notes were taken by Loulwa Salem and contain minor edits by
George Wilson.
-----------------------
LSPP Meeting 04/24/2006
-----------------------
Known Attendees:
Matt Anderson (HP) - MA
Irina Boverman (RH) - IB
Russell Coker (Red Hat) - RC
Amy Griffis (HP) - AG
Steve Grubb (Red Hat) - SG
Chad Hanson (TCS) - CH
Linda Knippers (HP) - LK
Joy Latten (IBM) - JL
Paul Moore (HP) - PM
Robin Redden (IBM) - RR
Loulwa Salem (IBM) - LS
Lisa Smith (HP) - LMS
Michael Thompson (IBM) - MT
Al Viro (Red Hat) - AV
Dan Walsh (Red Hat) - DW
George Wilson (IBM) - GW
SELinux base update
-------------------
Dan was not on the call yet.
SG: George, he expects to be in the middle of the call.
GW: OK. Was trying to accommodate his leaving on time. Will move
back to middle.
Dan (Later in the call): We have three audit roles now. Everything
seems to be working fairly well, polyinstantiation
and such.
Kernel update
-------------
AV: Not much new. There have been several issues in Amy's patch that
still have to be fixed. That should be about it.
GW: What are the logistics of reworking the patch? Are you and Amy
reworking it, or just you?
AV: I've been working on this stuff . . . but there have been unpleasant
upstream issues. There are some holes in the ext3 file system, and there
is a pile of interesting races in the code.
MLS policy and LSPP kernel issues
---------------------------------
GW: Anything else beyond performance and deadlock?
Everyone: Nothing.
Audit performance and stability issues
---------------------------------------
GW: There are three classes of issues: performance, deadlock, and
miscellaneous bugs (e.g., the first add of a watch fails and subsequent
adds work).
>>> Performance:
GW: Fix by splitting into three lists and turning it into a hash table.
How is that going?
LK: No one is working on that now. There is interesting dialogue on the
list about it, but no work is getting done.
GW: Can we get someone to help with this?
AG: I think I have time to look at it this week if that is soon enough.
GW: I tried some things over the weekend and I was concerned about
scalability vs. number of processors on the system. Turning off
syscall entry rules took my performance from 1 to 8 CPUs. We need to
look at scalability across CPUs. This is a general performance issue,
not watch specific. We have the option of creating a syscall table to
resolve this issue.
SG: I don't like that option. We have almost 300 syscalls, and to have a
table is not good. There may be several rules that involve more than
one syscall, and that seems it would use too much memory.
GW: I thought that's what LAUS did.
SG: But we are not doing LAUS.
GW: A syscall table seems like a solution, but a hash table works as
well. No one is working on that right now.
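The hash-table option raised above can be sketched in a few lines. This is
a minimal userspace illustration, not the kernel's audit code: `struct rule`,
`struct rule_table`, `BUCKETS`, and the function names are all hypothetical.
Rules are hashed by syscall number into a small fixed number of buckets, so
a syscall-entry check walks one short chain instead of every rule, and the
table does not need one slot per syscall.

```c
#include <assert.h>
#include <stddef.h>

#define BUCKETS 32            /* hypothetical size, far fewer than ~300 syscalls */

/* Hypothetical rule entry: one (syscall, action) pair chained per bucket. */
struct rule {
    int syscall_nr;           /* syscall this entry applies to */
    int action;               /* illustrative action code */
    struct rule *next;        /* next entry in the same bucket */
};

struct rule_table {
    struct rule *bucket[BUCKETS];
};

/* Map a syscall number to a bucket index. */
static unsigned bucket_of(int syscall_nr)
{
    return (unsigned)syscall_nr % BUCKETS;
}

/* Link a caller-owned rule at the head of its bucket. */
void rule_add(struct rule_table *t, struct rule *r)
{
    assert(t != NULL && r != NULL);
    unsigned b = bucket_of(r->syscall_nr);
    r->next = t->bucket[b];
    t->bucket[b] = r;
}

/* Return the most recently added rule for syscall_nr, or NULL.
 * Only one bucket is walked. */
const struct rule *rule_lookup(const struct rule_table *t, int syscall_nr)
{
    assert(t != NULL);
    for (const struct rule *r = t->bucket[bucket_of(syscall_nr)]; r; r = r->next)
        if (r->syscall_nr == syscall_nr)
            return r;
    return NULL;
}
```

In this sketch a rule that covers several syscalls would have to be added
once per syscall, which is an assumption of the illustration and one place
where the memory cost SG mentions would show up.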
>>> Deadlock:
GW: This happens on -mm; Serge created patch on Friday; I fixed it on
Saturday, it fixes deadlock but exposes a problem with buffer filling
up.
SG: Userspace requests a list and gets an ACK from the kernel; the kernel
spawns a thread and gives it to userspace.
GW: There is a comment in there saying better not fill the buffer or we
get a deadlock, so this seems like it was always there, but it seems
only now it is triggered and the buffer is getting filled. The real
question is why the buffer is getting filled up. We performed stress
tests in the past . . . it didn't happen.
SG: We did a lot of stress tests.
GW: I can do the add, but can't do the delete. I need to understand
kernel side better. I don't think the receive routine is changed, but
it seems to get more data in it than it expects.
SG: We switched from semaphores to mutex.
RC: Might need a spin lock.
AG: In the patch that I worked on (with Darrel) for the SELinux rules
update, the netlink mutex is moved to a different place in the code, so
maybe that's what is causing us to see this now. We might have to come
up with another way.
GW: Serge's patch queues up semaphores in a list. For a delete case, I
get ENOBUFS. I tried increasing the buffer size to no good effect, but I
shouldn't be doing this blindly anyway. I need to understand what's
going on in the kernel.
SG: Need to go back and see what changed.
GW: A diff between now and a few months ago might be instructive.
AG: Why is there no netlink request that tells kernel to delete everything
it has ?
GW: I agree.
SG: We have a lack of cycles from kernel development.
LK: I think doing that would be an optimization.
GW: It's transmitting the buffer to userspace, and with that it's
exposing a hidden cause that we need to look at in detail--why is the
buffer being maxed out? Serge will be in town tomorrow, and I'll talk
with him face to face. He's dedicated to another project and he is
just helping out. Serge's patch fixes deadlock, but exposes the hidden
problem. I'll ask if he wants me to post the patch or not . . .
LK: Posting it will be helpful. Not for inclusion, but at least for us
to see what you are testing.
>>> Miscellaneous:
GW: First auditctl command fails, and appears to do matching on adds by
strlen rather than strcmp.
AG: The file system audit patch didn't make it into the LSPP.18 kernel; once
it is included, the problem with the strlen should go away.
SG: That's another bug--it accepts things that it doesn't understand.
AG: Is the audit patch going to be in the LSPP.19 kernel?
SG: Yes, I'm building one now and it should have the patches. I am
testing the build right now before I push it into Red Hat's build
system. It should be out late tonight.
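The strlen-versus-strcmp problem GW describes in this section is easy to
illustrate. The sketch below is not the actual audit code, which the minutes
do not show; `match_by_length` and `match_by_content` are hypothetical names
used only to contrast the two kinds of comparison.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Buggy matcher (hypothetical): treats two paths as equal when only
 * their lengths agree. */
bool match_by_length(const char *a, const char *b)
{
    assert(a != NULL && b != NULL);
    return strlen(a) == strlen(b);
}

/* Correct matcher: compares the actual bytes of both strings. */
bool match_by_content(const char *a, const char *b)
{
    assert(a != NULL && b != NULL);
    return strcmp(a, b) == 0;
}
```

With a length-only comparison, two different paths of the same length, such
as /etc/passwd and /etc/shadow, are treated as the same watch, while a
content comparison tells them apart.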
AuditFS/inotify completion
--------------------------
GW: How is that going?
AG: Not too bad, a bit of hold up. They asked for stress testing, I
still need to finish that up. I am hoping results are good--so
inotify work will be accepted in mm kernel. Have an idea of how
to approach the performance issue and plan to work it when I get
time. However, I have a few things that I'm working on this week
to get base function working first. Here is my ToDo list:
- Remove watched file or dir--no audit records for that; issue
w/implementation.
- Content of audit records--for syscalls that involve adding or
removing things from fs; 2 entities of interest--directory
(container) and entity itself; need to do work to make sure that
both container and entity are audited.
- When there is a path being watched, we can add the path to the audit
record; when the watch is through a hard link, it is not apparent why
you have that audit record.
- Cleanup involving use of inotify API; when parent of path we're
watching gets removed, that path is no longer interesting; need to
remove all child watches; watches still hanging around.
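The last item, dropping child watches once a parent path is removed, can be
sketched as a prefix filter over the watched paths. This is only an
illustration under stated assumptions: it keeps watches as a flat array of
non-empty, normalized absolute path strings, which is not how the kernel or
the inotify API stores them, and `is_under` and `drop_watches_under` are
hypothetical helpers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* True if path is parent itself or lies beneath it ("/a/b" is under "/a"). */
bool is_under(const char *parent, const char *path)
{
    assert(parent != NULL && path != NULL);
    size_t n = strlen(parent);
    if (strncmp(parent, path, n) != 0)
        return false;
    /* A parent ending in '/' (including "/") already ends on a boundary. */
    if (n > 0 && parent[n - 1] == '/')
        return true;
    /* Require a component boundary so "/ab" is not treated as under "/a". */
    return path[n] == '\0' || path[n] == '/';
}

/* Compact watches[] in place, dropping every entry under removed_parent.
 * Returns the number of watches that remain. */
size_t drop_watches_under(const char **watches, size_t count,
                          const char *removed_parent)
{
    assert(watches != NULL || count == 0);
    size_t kept = 0;
    for (size_t i = 0; i < count; i++)
        if (!is_under(removed_parent, watches[i]))
            watches[kept++] = watches[i];
    return kept;
}
```

The component-boundary check matters: removing /var/log should drop a watch
on /var/log/a but leave one on /var/logs alone.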
SG: On the hardlink issue: Why are we not getting records--that's why we
had keys I believe.
AG: If filterkeys are still useful, we can always include that.
SG: Think we removed them to save space.
MA: Instead of the path, or in addition to path?
SG: You should put a key to the file, and when it triggers, you see the
watch.
AG: Seems the path is more meaningful than the key. Unless there is a
reason why we can't put name of file, that should be the way to go.
GW: Sounds like a straightforward approach.
LK: We'll see how it goes. Just to recap, Amy says her priorities are
inotify patch to submit upstream, the race conditions, and implementing
Al's suggestions for performance problems.
Audit of POSIX message queues
-----------------------------
GW: Didn't do much, been looking at deadlock issue. Will get back to
this but it should not be very difficult work.
Audit API
---------
SG: Still back burner issue.
Audit failure action inquiry function
-------------------------------------
GW: Has Lisa made any progress in audit failure inq?
LMS: I am currently learning about audit. I'm writing a proposal and
will send something this week to the mailing list. About 5% done.
Audit of service discontinuity
------------------------------
GW: No one actively working on that. Anyone want to volunteer?
Fail to secure state
--------------------
GW: No one actively working on that. Anyone want to volunteer?
Print
-----
MA: I've been able to get filters in place that do the trusted portion
and glue services together. Currently spending a lot of time on
packaging, and working on packages to add the trusted printing portion.
To separate the changes needed for CUPS from those required for trusted
CUPS, there will be a t-CUPS rpm that adds the functionality, keeping
the mainline CUPS changes minimal. Once rpm interdependencies and patch
issues are resolved, I'll send them around.
CIPSO
-----
GW: Paul, would you like to talk about CIPSO? Meant to add it to the
agenda.
PM: Not a whole lot since last week. Now getting the userspace
configuration tool going so we can push label mappings into the kernel.
I broke it and now I'm trying to fix it. The userspace tool should be
done in the next couple of days. There are 1 or 2 more things to do in
the kernel: port the patch to the latest LSPP kernel at the time, do
quick unit testing and documentation, then send to the list.
GW: Joe and Lenny seem to be interested in it.
IPsec labeling, xinetd, secpeer
------------------------------
GW: Trent's patch does work. Since it is Trent's student who wrote it,
Trent has to post that out to list. Joy provided feedback.
JL: Dan said he included IPsec patch in latest rawhide. I re-sent the
patch this afternoon to ipsec-tools mailing list. I only sent the
portion to do manual keying, which is very short, and hopefully they
review it. If we get that in, maybe next is the dynamic portion.
Nethooks stuff still not sent to community, sorry about that, will work
on that soon. The stress test for nethooks worked very well. I am
currently redoing snapshot of performance to give everyone an idea of
IPsec+nethooks vs without nethooks. I will send the results to the
list.
DW: The patch will be in current rawhide.
GW: Thanks, we will verify it is there and test it.
ipsec-tools patches: Base, SPD dump, and racoon MLS
-----------------------------------------------------
CH: We are cleaning up patches to test on rawhide. Once it's done,
we'll pass it on.
JL: What about the MLS portions for association object class?
CH: We are working on that at same time.
Device allocation
-----------------
CH: Waiting to hear from Dan on that.
DW: It's not used by anything but MLS. We have to figure out what
happens when you install the package and run on system with udev.
Don't want to put in rawhide yet.
CH: We didn't test with hotplugging.
Label translation daemon
------------------------
CH: Very close right now. We don't have policy for it yet; working on it.
DW: I'll write policy for it--just give me the source code.
Self tests
----------
GW: Didn't get back to that. Will do that when I have more bandwidth,
and audit is working a bit better.
VFS polyinstantiation, cron, tmpwatch, etc.
-------------------------------------------
GW: Janak is not on today, but wanted me to bring these issues up:
Please ask Steve Grubb to nudge a couple of people for me.
pam_namespace - The patch is complete, Tomas has made some additional
changes. He is reviewing the final version. Once he is
done, he will put it in rawhide and push it upstream to PAM
maintainers.
multi-context cron - Still no word from cron maintainers. Steve, please
ask Jason Dias to review it and provide feedback. Steve
had also mentioned last week that he knows Paul Vixie
. . .
SG: We can put the patch in. I'll check with Tomas.
DW: Talked to Tomas on Thursday and he said they were reviewing the
patch.
SG: Will ping Jason. I have no special line on Paul. Please send to Paul
directly. He is busy.
Remaining tasks, target date, etc.
----------------------------------
There was a discussion at the beginning of the call about whether to move
Dan's section (SELinux base update) to the middle of the meeting. It was
decided to move it to mid-meeting since Dan joins the call a bit later.
GW: There are still items that need owners. We want to wrap up. We need to
move on to test and documentation.
SG: Anyone that has kernel pieces that are still not merged, we really need
to get them into 2.6.18. We need to get them in -mm, and get some run
time. It is very important! Userspace has a little more slack. But
for the kernel patches, it is important they get into that kernel.
GW: We are behind on schedule, and this is causing a sort of a ripple
effect. We still need to shut down development. We need to complete
the work so people can start transitioning. If you have free time, you
should feel guilty and need to help out.
Joy volunteered to do some work, maybe on the performance issue. She still
needs to dig in and understand the code, but is willing to work on it.
--
George Wilson <ltcgcw at us.ibm.com>
IBM Linux Technology Center