Networkmanager service is shutdown too early
Simo Sorce
ssorce at redhat.com
Sun Jun 1 23:07:32 UTC 2008
On Sun, 2008-06-01 at 11:49 -0400, Colin Walters wrote:
> On Sun, Jun 1, 2008 at 11:02 AM, Simo Sorce <ssorce at redhat.com> wrote:
>
> So far we can only consider DBUS as a sort of local UDP
> transport: if all goes well messages get to their destination,
> but delivery is not guaranteed.
>
> The argument here is that presently what we tell application authors
> is much more like TCP than UDP; if we allowed distributors to restart
> it in %post or the like automatically on upgrades, then we either have
> to change our guarantee, or try to "hide" the fact that the bus gets
> restarted under the covers.
>
> I think the only sensible solution is the latter. Which is certainly
> *possible*, just like how everything short of the halting problem is
> possible; but it would not be trivial.
Yes I agree, handling the restart inside dbus libraries is the best
choice.
> For many likely classes of DBus flaws, porting the Ksplice
> (http://web.mit.edu/ksplice/) style approach would be easiest
> probably.
For that to work you would have to separate functions from data
structures very clearly, standardize the latter and hang them off the
main code, and put the former into a loadable library. Then you might be
able to get to a state where you can dlclose() and dlopen() the library
again and it all just works.
That is the theory; I have never seen code clean enough to work this
way, but usually this is just because it does not need to. :-)
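A rough sketch of that separation, written in Python rather than C to keep it short. The state dict, load_handlers() and the V1/V2 sources are all made up for illustration; building a module from source text only stands in for dlclose()/dlopen() and is not how dbus works today.

```python
import types

# Data lives in the host process in a fixed, agreed-upon layout.
# (A plain dict here; a standardized struct in the C case.)
state = {"connections": {}, "delivered": 0}


def load_handlers(source, name="handlers"):
    """Build a fresh module from source text (stand-in for dlopen())."""
    module = types.ModuleType(name)
    exec(source, module.__dict__)
    return module


# Two hypothetical versions of the "loadable library".
V1 = '''
def deliver(state, sender, message):
    state["delivered"] += 1
    return "v1:" + message
'''

V2 = '''
def deliver(state, sender, message):
    state["delivered"] += 1
    state["connections"][sender] = state["connections"].get(sender, 0) + 1
    return "v2:" + message
'''

handlers = load_handlers(V1)
first = handlers.deliver(state, ":1.7", "hello")

# "dlclose() and dlopen() again": drop the old code, load the new code.
# The state object is untouched because the code never owned it.
handlers = load_handlers(V2)
second = handlers.deliver(state, ":1.7", "hello")
```

The counter in state keeps counting across the swap, which is the whole point: only code that never owns data can be replaced like this.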
> But to handle the general case, I can imagine a system where we send
> a special message to all clients like org.freedesktop.Local.Restart
> and this causes them to enter a mode where they queue pending
> messages, waiting via inotify for the socket to reappear.
Yes, this would be a decent solution too, although it would require
some handling in the client apps.
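To give an idea of the client-side handling, here is a minimal sketch in Python. QueueingClient, its method names and wait_for_socket() are hypothetical, not part of any dbus library, and the polling loop is only a stand-in for the inotify watch mentioned above.

```python
import os
import time
from collections import deque


class QueueingClient:
    """Hypothetical client wrapper that queues messages while the bus restarts."""

    def __init__(self, send_func):
        self._send = send_func      # assumed transport callback
        self._pending = deque()
        self._restarting = False

    def send(self, message):
        if self._restarting:
            self._pending.append(message)   # hold until the bus is back
        else:
            self._send(message)

    def on_restart_signal(self):
        # Would be called on receipt of the proposed Restart message.
        self._restarting = True

    def on_bus_back(self):
        # Flush in the original order once the socket has reappeared.
        self._restarting = False
        while self._pending:
            self._send(self._pending.popleft())


def wait_for_socket(path, timeout=5.0, interval=0.05):
    """Poll for the socket path; a real client would use inotify instead."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if os.path.exists(path):
            return True
        time.sleep(interval)
    return os.path.exists(path)


sent = []
client = QueueingClient(sent.append)
client.send("a")
client.on_restart_signal()
client.send("b")
client.send("c")
sent_before_flush = list(sent)
client.on_bus_back()
```

This ignores replies that were in flight when the bus went away, which is probably where most of the real work would be.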
> The bus itself would try to flush all pending messages and save the
> current map of connections->service names and other state I'm not
> thinking of right now to JSON/XML/whatever.
>
> Then on startup you'd need to wait for all of the previous clients to
> connect, probably with some timeout; I can't think offhand how to
> make this non-racy. After that we need to handle anything that
> changed in the meantime like clients having exited and thus lost their
> service name (this will happen for sure if we make other software
> restart on upgrade like setroubleshoot does). So we compute that
> delta and then send the relevant signals off to clients.
yup
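The save/restore/delta part could look roughly like the Python sketch below. The JSON layout and the function names are invented for illustration, and it assumes connection identifiers stay comparable across the restart, which the real bus does not promise. Waiting for clients with a timeout is left out; the set of reconnected clients is simply passed in.

```python
import json
import os
import tempfile


def save_state(path, owners):
    """Write the service-name -> connection map before the bus shuts down."""
    with open(path, "w", encoding="utf-8") as fh:
        json.dump({"owners": owners}, fh)


def load_state(path):
    """Read the map back on startup."""
    with open(path, encoding="utf-8") as fh:
        return json.load(fh)["owners"]


def lost_names(saved_owners, reconnected):
    """Names whose previous owner did not reconnect before the timeout."""
    return sorted(
        name for name, conn in saved_owners.items() if conn not in reconnected
    )


# Hypothetical names and connection ids, for illustration only.
saved = {"org.example.Net": ":1.4", "org.example.Trouble": ":1.9"}
path = os.path.join(tempfile.mkdtemp(), "bus-state.json")
save_state(path, saved)

restored = load_state(path)
gone = lost_names(restored, reconnected={":1.4"})
```

Each name in gone is one for which the bus would then have to tell the remaining clients that the owner went away (presumably via something like NameOwnerChanged, though that is my assumption).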
> For someone who knew the code and was an A+ hacker it might only be a
> two week or so job, though to actually know this worked you'd have to
> spend a lot of time creating test cases.
It may be tricky, but with a good test suite that ensures this works it
would be really worth it.
> What was the cost/benefit analysis in this case?
>
> The original cost/benefit was "Absolutely nothing happens when I put
> my USB key into a Linux desktop" and "The networking system is a
> static mess of shell script that we edit via UIs run as root" =)
eh.. :-)
> Given that some people are thinking of using NM by default on
> servers too, this issue becomes more critical; servers do serve
> clients,
>
> Let's back up a second; if our overall goal is to make applying
> security/important-reliability updates happen more transparently, I
> think the best bang for the buck is going to be Linux. For example,
> we could spend the engineering time figuring out how to get Ksplice
> (http://web.mit.edu/ksplice/) to work under the RPM hood.
>
> DBus has so far had a pretty good security and reliability track
> record; while it's not simple software, it has simple goals and this
> has limited complexity. Something like the Linux kernel clearly has a
> much bigger goal and so is order(s) of magnitude more complex and with
> this complexity has come the concomitant security/reliability issues.
>
> And if I had the ability to herd security/reliability cats, I'd have
> them spend time on Firefox and try to take what Dan Walsh has been
> doing even farther - break it up into multiple processes with locked
> down security contexts and evaluate changes to the desktop to better
> handle the concept of processes with different privilege for example.
Security updates are just one aspect imo; reliability and self-repair
with minimal service disruption is another very important goal.
Simo.
--
Simo Sorce * Red Hat, Inc * New York