<div dir="auto">The performance is not great because of "rdo-ci-slave01", the node Ansible runs from.<div dir="auto"><br></div><div dir="auto">We all know that node has performance problems (especially I/O).</div><div dir="auto">For example, a promote job [1] takes 1 hour and 4 minutes, while the equivalent generic job [2] (run on a cloudslave) finishes in about 35 minutes.</div><div dir="auto"><br></div><div dir="auto">I mean, it takes rdo-ci-slave01 more than five (5!) minutes just to bootstrap the job (clone weirdo, create a virtualenv with ara, ansible and shade, and initialize ara).</div><div dir="auto">The same thing takes less than 30 seconds on a cloudslave.</div><div dir="auto"><br></div><div dir="auto">[1]: <a href="https://ci.centos.org/job/weirdo-master-promote-packstack-scenario001/1080/">https://ci.centos.org/job/weirdo-master-promote-packstack-scenario001/1080/</a></div><div dir="auto">[2]: <a href="https://ci.centos.org/view/rdo/view/weirdo/job/weirdo-generic-packstack-scenario001/515/">https://ci.centos.org/view/rdo/view/weirdo/job/weirdo-generic-packstack-scenario001/515/</a></div><div dir="auto"><br><div data-smartmail="gmail_signature" dir="auto">David Moreau Simard<br>Senior Software Engineer | Openstack RDO<br><br>dmsimard = [irc, github, twitter]</div></div></div><div class="gmail_extra"><br><div class="gmail_quote">On Apr 21, 2017 4:22 AM, "Alfredo Moralejo Alonso" <<a href="mailto:amoralej@redhat.com">amoralej@redhat.com</a>> wrote:<br type="attribution"><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">On Fri, Apr 21, 2017 at 2:40 AM, David Moreau Simard <<a href="mailto:dms@redhat.com">dms@redhat.com</a>> wrote:<br> > WeIRDO jobs were tested manually on the rdo-ci-slave01 (promote slave)<br> > on which the jobs would not run successfully yesterday.<br> ><br> > Everything now looks good after untangling the update issue from<br> > yesterday and WeIRDO promote jobs have been switched to 
rdo-cloud.<br> ><br> <br> Nice! I've seen weirdo jobs in<br> <a href="https://ci.centos.org/view/rdo/view/promotion-pipeline/job/rdo_trunk-promote-master-current-tripleo/44/" rel="noreferrer" target="_blank">https://ci.centos.org/view/<wbr>rdo/view/promotion-pipeline/<wbr>job/rdo_trunk-promote-master-<wbr>current-tripleo/44/</a><br> run in RDO Cloud with pretty good performance. They seem to run<br> slower than jobs running on dusty servers in ci.centos but faster than<br> the rest of the servers.<br> <br> I'll keep an eye on it too to find out if there is any abnormal behavior.<br> <br> <br> > I'll be monitoring this closely but let me know if you see any problems.<br> ><br> > David Moreau Simard<br> > Senior Software Engineer | Openstack RDO<br> ><br> > dmsimard = [irc, github, twitter]<br> ><br> ><br> > On Thu, Apr 20, 2017 at 12:26 AM, David Moreau Simard <<a href="mailto:dms@redhat.com">dms@redhat.com</a>> wrote:<br> >> Hi,<br> >><br> >> There have been a few updates worth mentioning and explaining to a wider<br> >> audience as far as RDO is concerned on the <a href="http://ci.centos.org" rel="noreferrer" target="_blank">ci.centos.org</a> environment.<br> >><br> >> First, please note that all packages on the five RDO slaves have been<br> >> updated to the latest version.<br> >> We had not yet updated to 7.3.<br> >><br> >> The rdo-ci-slave01 node (the "promotion" slave) ran into some issues<br> >> that took some time to fix: EPEL was enabled and it picked up python<br> >> packages it shouldn't have.<br> >> Things seem to be back in order now but some jobs might have failed in<br> >> a weird way; triggering them again should be fine.<br> >><br> >> Otherwise, all generic WeIRDO jobs are now running on OpenStack<br> >> virtual machines provided by the RDO Cloud.<br> >> This is provided by using the "rdo-virtualized" slave tags.<br> >> The "rdo-promote-virtualized" tag will be used for the weirdo promote<br> >> jobs once we're sure there are no more issues running them on 
the<br> >> promotion slave.<br> >><br> >> These tags are designed to work with WeIRDO jobs only for the time<br> >> being; please contact me if you'd like to run virtualized workloads<br> >> from <a href="http://ci.centos.org" rel="noreferrer" target="_blank">ci.centos.org</a>.<br> >><br> >> This amounts to around 35 fewer jobs per day running on Duffy<br> >> <a href="http://ci.centos.org" rel="noreferrer" target="_blank">ci.centos.org</a> hardware in total on a typical day (including generic<br> >> weirdo jobs and promote weirdo jobs).<br> >><br> >> I've re-shuffled the capacity around a bit, considering we've now<br> >> freed significant capacity for bare-metal based TripleO jobs.<br> >> The slave threads are now as follows:<br> >> - rdo-ci-slave01: 12 threads (up from 11), tagged with "rdo-promote"<br> >> and "rdo-promote-virtualized"<br> >> - rdo-ci-cloudslave01: 6 threads (up from 4), tagged with "rdo"<br> >> - rdo-ci-cloudslave02: 6 threads (up from 4), tagged with "rdo"<br> >> - rdo-ci-cloudslave03: 8 threads (up from 4), tagged with "rdo-virtualized"<br> >> - rdo-ci-cloudslave04: 8 threads (down from 15), tagged with "rdo-virtualized"<br> >><br> >> There is a specific reason why cloudslave03 and cloudslave04 amount to<br> >> 16 threads between the two: it is to match the quota we have been<br> >> given in terms of capacity at RDO Cloud.<br> >> The threads will be used to artificially limit the number of jobs run<br> >> against the cloud concurrently without needing to implement queueing<br> >> on our end.<br> >><br> >> You'll otherwise notice the net effect for the "rdo" and "rdo-promote"<br> >> tags isn't much, at least for the time being; it's very much the same<br> >> since I've re-allocated cloudslave03 to load balance virtualized jobs.<br> >> However, jobs are likely to be more reliable and faster now that they<br> >> won't have to retry for nodes because we're less likely to hit<br> >> rate-limiting.<br> >><br> >> I'll monitor the situation over the 
next few days and bump the numbers<br> >> if everything is looking good.<br> >> That said, I'd like to hear your feedback on whether things are<br> >> looking better and whether we are running into "out of inventory" errors<br> >> less often.<br> >><br> >> Let me know if you have any questions,<br> >><br> >> David Moreau Simard<br> >> Senior Software Engineer | Openstack RDO<br> >><br> >> dmsimard = [irc, github, twitter]<br> </blockquote></div></div>