[Pulp-list] Code coupling, concurreny, and task scheduling

Jason L Connor jconnor at redhat.com
Fri Mar 25 16:30:24 UTC 2011


All Developers,

One problem that keeps recurring in our code base is tight coupling
between objects. We have had this problem before in enough places to
know that it almost always causes problems. Coupling make coding faster
at the time, but it makes testing and future modifications both
nightmares.

The latest problem is in the pulp server's async tasking subsystem: I've
been adding scheduling to tasks and the task queue. However, folks have
implemented derived classes of Task and have coupled them to the global
task queue instance. This means that there is little or no opportunity
to add scheduling to these tasks. I'll show you what I mean:

The AgentTask class:

class AgentTask(AsyncTask):
    ....
    def enqueue(self, unique=False):
        ....
        if _queue.enqueue(self, unique):
            return self

A task should not know how to enqueue itself in a task queue instance. A
task queue should know how to enqueue a task. This isn't disastrous in
itself, but when combined with the InstallPackages class:

class InstallPackages(AgentTask):
    ....
    def __init__(self, consumerid, packages, errata=(), reboot_suggested=False, assumeyes=False):
        ....
        self.enqueue()

We end up with a constructor that has side-effects based on the tight
coupling with the task queue. THERE IS NO OPPORTUNITY TO SCHEDULE THIS
TASK! The task will get enqueued and possibly even run before we can add
a scheduler to it.

Even the existing scheduling is not thread safe. The task is enqueued
long before its scheduled_time field gets set. Meaning the task queue is
sorted with a scheduled_time value of None and the task will potentially
get run immediately if the dispatcher wakes up and runs between the
enqueueing and setting of the scheduled_time field. This is simply
broken.

Two things central to OO mantra are being disregarded here:
     1. No tight coupling. The AgentTask class knows about a specific
        instance of the TaskQueue class. If it's not a field, it
        shouldn't know about it.
     2. No side-effects in constructors. Constructors are for memory
        allocation and initialization, not executing code.

I will have to refactor all of this in order to (1) fix the current
scheduling and (2) port to the new scheduling.

Lessons to take away:
      * Coupling is bad: This OO 101, and we should all know better.
        It's bitten us several times now.
      * Concurrency is hard: Any time you're touching code that does
        concurrency, make sure to get it reviewed by another pair of
        eyes.

-- 
Jason L Connor
linear on freenode #pulp
http://pulpproject.org/
RHCE: 805010912355231
GPG Fingerprint: 2048R/CC4ED7C1
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 490 bytes
Desc: This is a digitally signed message part
URL: <http://listman.redhat.com/archives/pulp-list/attachments/20110325/282757be/attachment.sig>


More information about the Pulp-list mailing list