
Re: [rdo-list] [ci] FYI all rdo promotion jobs have been disabled





On Mon, Sep 12, 2016 at 5:54 PM, Wesley Hayutin <whayutin redhat com> wrote:


On Mon, Sep 12, 2016 at 5:52 PM, David Moreau Simard <dms redhat com> wrote:
Can we keep the promotion jobs enabled and just disable the jobs that
actually upload the images?

We need visibility on ongoing issues, if there are any, and the jobs
have already been disabled since test day last week.

Ya.. that is a better idea.. will adjust them.
Thanks
 

David Moreau Simard
Senior Software Engineer | Openstack RDO

dmsimard = [irc, github, twitter]


On Mon, Sep 12, 2016 at 5:37 PM, Wesley Hayutin <whayutin redhat com> wrote:
> Greetings,
>
> I have disabled all the RDO promotion jobs until such time we have confirmed
> that images are published directly from the virthost to the ci.centos
> artifacts server.
>
> This work is being led by Matt Young and tested and refined by John
> Trowbridge and myself. If there are any requirements for promotion while
> this work is done we will utilize the internal pipeline.
>
> Thank you
>
> _______________________________________________
> rdo-list mailing list
> rdo-list redhat com
> https://www.redhat.com/mailman/listinfo/rdo-list
>
> To unsubscribe: rdo-list-unsubscribe redhat com


Greetings, 

TLDR:
The infra at ci.centos has unique properties that make it difficult to work with qcow2 images.  Transferring images across the ci.centos infra can cause instability in the ci.centos infrastructure itself.  The process is slow and difficult to test; however, we now believe we have resolved the issue.

Details:

I wanted to send an update to the community regarding the promotion status of RDO via CI.
The issue at hand was syncing and promoting TripleO undercloud and overcloud images to the ci.centos artifacts server.
Originally these images were synced in two steps: first to the Jenkins slave and then to the artifacts server.

Syncing the images to the slave was causing instability in the ci.centos infra and also causing network and filesystem issues on the ci.centos slave.
Quite simply, image syncs were disrupting the infrastructure and causing failures throughout ci.centos.

Through a series of patches [1-8] we have streamlined the image creation and promotion process to sync the image only once.  The artifacts server *only* has
rsync available; there is no ssh service.  It was a complicated problem to solve, but we believe the required code is merged, and tests are running.
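
For readers unfamiliar with rsync-only servers, here is a minimal sketch of what a single-step push to an rsync daemon might look like. It is not taken from the patches above; the host name, module name, and destination path are hypothetical placeholders, and the flags are just one reasonable choice.

```python
import subprocess


def build_rsync_push(image_path, host, module, dest_dir):
    """Build an rsync command that pushes one image to an rsync daemon.

    The double colon (host::module) selects rsync's daemon protocol,
    so no ssh service is needed on the receiving side.
    All argument values used with this function are hypothetical.
    """
    target = f"{host}::{module}/{dest_dir.strip('/')}/"
    return ["rsync", "--archive", "--verbose", "--partial", image_path, target]


def push_image(image_path, host, module, dest_dir, dry_run=True):
    """Return the command; actually run it only when dry_run is False."""
    cmd = build_rsync_push(image_path, host, module, dest_dir)
    if not dry_run:
        # check=True raises CalledProcessError if rsync exits non-zero.
        subprocess.run(cmd, check=True)
    return cmd


if __name__ == "__main__":
    # Hypothetical host, module, and path, shown as a dry run only.
    print(" ".join(push_image("undercloud.qcow2",
                              "artifacts.example.org", "rdo",
                              "images/newton")))
```

Running this from the virthost, rather than relaying through the Jenkins slave, is the idea behind syncing the image only once.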

Apologies for the outage; however, we had to choose between an outage and bringing down the ci.centos infra.  We chose to take the outage immediately after newton milestone 3 was promoted and the internal beta was imported.

Thank you and we appreciate your patience.  

[1] https://review.gerrithub.io/#/c/290337/
[2] https://review.gerrithub.io/#/c/290344/
[3] https://review.gerrithub.io/#/c/290432/
[4] https://review.gerrithub.io/#/c/290433/
[5] https://review.gerrithub.io/#/c/294590/
[6] https://review.gerrithub.io/#/c/294663/
[7] https://review.gerrithub.io/#/c/294672/
[8] https://review.gerrithub.io/#/c/294694/

