[Tendrl-devel] Parallel processing of tasks and events

Anmol Babu anbabu at redhat.com
Mon Jun 5 17:08:13 UTC 2017


Hi Filip,

The way task processing works is as follows:

Summary:
1. Every task (for which I am somehow accustomed to using the term "job" :) ) in tendrl starts with an entry in /queue in etcd.
   Note:
     a. Currently, this queue is one single/global queue across all integrations of tendrl.
     b. Entries in /queue corresponding to each job (failed, successfully executed, or new/not-yet-picked) exist for a definite interval of time, after which they get erased automatically by etcd.
        This makes use of the TTL feature provided by etcd.
        Note:
            @rohan will be better able to add more details and/or correct me if I am wrong.
   Personally, from the perspective of performance, I feel it might be better to:
     i. have a per-component/integration-specific queue with TTL, and
    ii. move the completed and/or failed jobs to some other global queue.
   A combination of these, or at least one of them, could prove really useful for bigger deployments.
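To make point 1 a bit more concrete, here is a minimal Python sketch of what creating such a /queue entry with a TTL could look like. The field names, flow name, tag values and TTL below are my own illustrative assumptions, not the exact tendrl schema:

```python
import json
import uuid

QUEUE_TTL = 3600  # hypothetical expiry, in seconds; etcd erases the key after this


def make_job(run, tags, payload=None):
    """Build a job entry destined for /queue.

    All field names here are illustrative, not the real tendrl schema.
    """
    job_id = str(uuid.uuid4())
    value = json.dumps({
        "job_id": job_id,
        "status": "new",    # integrations only pick jobs while status == "new"
        "run": run,         # which flow / set of atoms to execute
        "tags": tags,       # used to target a specific integration and/or node
        "payload": payload or {},
    })
    return "/queue/%s" % job_id, value, QUEUE_TTL


# Writing it to etcd would need a running etcd and a client library,
# e.g. python-etcd (hedged - not shown running here):
#   import etcd
#   client = etcd.Client(host="127.0.0.1", port=2379)
#   key, value, ttl = make_job("tendrl.flows.ImportCluster", ["tendrl/node_abc"])
#   client.write(key, value, ttl=ttl)  # entry auto-expires after `ttl` seconds
```

The TTL is what gives the auto-erase behaviour described in note b above.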
2. Now, execution of any and every job is essentially the execution of a set of atomic operations ("atoms" in tendrl terminology), preceded (pre) and followed (post) by atoms that validate, respectively, the applicability/suitability/validity of the job on the tendrl-managed resource, and the success or failure of the job execution.
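The pre/atoms/post structure just described could be sketched roughly like this (the atom interface and status strings are my own simplification, not tendrl's actual classes):

```python
def run_job(pre_atoms, atoms, post_atoms):
    """Hypothetical sketch of a job run: pre-validation atoms, then the
    actual atomic operations, then post-validation atoms.

    Each atom is modelled as a callable returning True on success.
    """
    for atom in pre_atoms:    # validate the job is applicable/valid at all
        if not atom():
            return "failed (pre-validation)"
    for atom in atoms:        # the actual atomic operations of the job
        if not atom():
            return "failed"
    for atom in post_atoms:   # validate the outcome of the execution
        if not atom():
            return "failed (post-validation)"
    return "finished"
```

The point of the sketch is simply that a job never runs its atoms unless every pre atom passes, and is not marked finished unless every post atom passes.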
3. Any job in tendrl is meant to be executed by a specific tendrl module, and can even be targeted at a specific node or a collection of nodes, depending on:
   i. the type of job,
  ii. the purpose of the job, and
 iii. user input/choice,
   as applicable, with suitable validations and best/recommended practices in view.
4. So, given 1, 2 and 3 above, tendrl integrations scan through every entry in /queue in etcd and pick the jobs that are
   1. marked with status "new" (as opposed to "finished", which indicates the job has already been processed and executed, or "failed", which indicates that the job execution failed), and
   2. relevant to/targeted at them,
   and then they validate (pre and post atoms) and execute them. (This is the reason for my thoughts in point 1 above.)
   The targeting is in accordance with a match between
   i. tags attached to the integration module (some via entries in the integration's conf file and some based on what the integration module is meant for), and
  ii. tags contained in the job-specific details in /queue in etcd.
   Note:
       a. The mapping from a job to its constituent validation + execution atoms is maintained as part of the integration-module-specific definitions (which are also pushed to etcd, under '_NS/<integration_name>/definitions').
       b. The tendrl object definitions are also maintained in the above-mentioned location as part of the integration-specific definitions.
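One plausible reading of the tag match in point 4 is a simple subset check: a module picks up a job only if it carries every tag the job is targeted with. The tag names below are made up for illustration:

```python
def job_matches(module_tags, job_tags):
    """Hedged sketch: a job is picked up by a scanning module only if the
    module carries every tag the job is targeted with.

    Tag names are illustrative, not actual tendrl tags.
    """
    return set(job_tags).issubset(module_tags)


# e.g. a gluster integration running on node "abc" might carry:
module_tags = ["tendrl/integration/gluster", "tendrl/node_abc"]
```

Under this reading, a job tagged only with "tendrl/node_abc" would match the module above, while a job tagged with "tendrl/node_xyz" would be skipped during the scan.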
So, given 1, 2, 3 and 4, job execution runs in parallel across the integration modules, unless explicitly serialized by the job itself where required/applicable (as per the job's implementation).
But the scan of /queue in etcd is essentially a walk through a possibly huge list.
At least from the perspective of monitoring, what currently happens is that per-node, per-collectd-plugin jobs are loaded into /queue to generate a collectd configuration file.
To give an idea of what this means, let's see an example:

i. For every node that is not part of any cluster, we have memory, cpu, swap, mount-point, disk, dbpush (push stats to graphite) and latency plugins, which means 7 jobs.
ii. In addition, every gluster node will have 4 more plugins: brick_utilization, cluster_utilization, peer_network_throughput and volume_utilization.
iii. In addition, every ceph mon node will have 6 more plugins: cluster_iops, cluster_utilization, node_network_throughput, osd_utilization, pool_utilization and rbd_utilization.

which makes me arrive at a maximum of 
i. 7 + 4 = 11 monitoring-specific jobs for a gluster peer.
ii. 7 + 6 = 13 monitoring-specific jobs for a ceph mon node.
iii. 7 + 1 = 8 monitoring-specific jobs for a ceph osd node.

Note:
i. These figures are deterministic and are not bound to change with the number of resources on the node (e.g. the bricks or volumes a gluster node takes part in, or the osds a ceph node contributes).
ii. These numbers only increase with operations that alter/override the default threshold values (currently this capability is not exposed).
iii. The current way of loading monitoring jobs, if felt appropriate, could be optimized to 1 job per node (for all resources) or 2 jobs per node (1 for all physical resources + 1 for all logical resources).
   But somehow, this idea fails to convince me, and I am in favour of the approach above in point 1.
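The per-node job counts above can be tallied directly from the plugin lists (I have left out the ceph osd case, 7 + 1, since the single extra osd plugin isn't named above):

```python
# The 7 base plugins loaded for every node, per the example above.
BASE_PLUGINS = ["memory", "cpu", "swap", "mount_point", "disk",
                "dbpush", "latency"]

# 4 extra plugins per gluster node.
GLUSTER_EXTRA = ["brick_utilization", "cluster_utilization",
                 "peer_network_throughput", "volume_utilization"]

# 6 extra plugins per ceph mon node.
CEPH_MON_EXTRA = ["cluster_iops", "cluster_utilization",
                  "node_network_throughput", "osd_utilization",
                  "pool_utilization", "rbd_utilization"]


def monitoring_jobs(extra_plugins):
    """Total monitoring-specific jobs loaded to /queue for one node."""
    return len(BASE_PLUGINS) + len(extra_plugins)


print(monitoring_jobs(GLUSTER_EXTRA))   # gluster peer: 7 + 4 = 11
print(monitoring_jobs(CEPH_MON_EXTRA))  # ceph mon node: 7 + 6 = 13
```

Multiplied across a large deployment, these per-node counts are what make the single global /queue scan expensive, which is the motivation for the per-integration queue idea in point 1.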


Now, regarding the CephClusterCreate job in particular, I am probably not the right person to comment here; @shubhendu can comment/detail it better.
The ceph installation + cluster creation goes through ceph-installer, and as per my observation, from the point the job reaches ceph-installer until the ceph bits are installed and the actual cluster is created,
the job is probably serial and not parallel (and this is probably not in tendrl's control). I feel this based on my observation of the task detail page of the ceph cluster create task as it progresses.
Note:
My comments about CephClusterCreate are just my own observations and may well be incorrect, as I have no concrete experience in that area.
@shubhendu, @nishanth and @rohan would probably be better placed to comment more on this.

Anmol Babu
Software Engineer 
Red Hat India Pvt Ltd
M: 8884245644

----- Original Message -----
From: "Filip Balak" <fbalak at redhat.com>
To: "Anmol Babu" <anbabu at redhat.com>
Cc: "Mailing list for the contributors to the Tendrl project" <tendrl-devel at redhat.com>
Sent: Monday, June 5, 2017 8:13:01 PM
Subject: Parallel processing of tasks and events

Hi Anmol,

does Tendrl support parallel processing of tasks and events? For example: are the job events that install ceph on nodes during a CephClusterCreate job executed in parallel? Does importing of two clusters run simultaneously, or do they wait on each other?

Best Regards,

Filip Balak



