[Platformone] [External] Re: cluster issue submitted - pod will not deploy - taint errors
Starling, Jennifer [MERIDIAN TECHNOLOGIES I, LLC]
jennifer.starling at accenturefederal.com
Tue Dec 17 15:29:36 UTC 2019
Any updates on the cluster? We are blocked.
From: Jonathan Rickard <jrickard at redhat.com>
Date: Monday, December 16, 2019 at 11:47 AM
To: "Starling, Jennifer [MERIDIAN TECHNOLOGIES I, LLC]" <jennifer.starling at accenturefederal.com>
Cc: "platformONE at redhat.com" <platformONE at redhat.com>, "Jared Crace [Mantech]" <Jared.Crace at mantech.com>, "Joseph Middleton (Confluence)" <confluence-noreply at di2e.net>, "Mark Sanchez [Gov]" <mark.sanchez.8 at us.af.mil>
Subject: [External] Re: [Platformone] cluster issue submitted - pod will not deploy - taint errors
Jennifer,
Thank you for providing this information. We are actively engaged to restore service to the cluster. FYI, the issue is related to the number of devices attached to each of the instances.
jonny
Jonathan Rickard, RHCE, RHCA
Consulting Architect
Red Hat Public Sector <https://www.redhat.com/>
jonny at redhat.com
M: 210.862.9739
On Mon, Dec 16, 2019 at 10:39 AM Starling, Jennifer [MERIDIAN TECHNOLOGIES I, LLC] <jennifer.starling at accenturefederal.com> wrote:
I submitted an issue. https://dccscr.dsop.io/unified-platform-node-aam/misp-images/issues/1
I do not know how to make it a blocker, but it is a blocker. When this happens, we cannot create new pods or re-deploy in any projects.
Thank you for your assistance. Please let me know if you require further information.
Jen…
Our pod is getting errors about taint. https://cluster.unified-platform.io/console/project/saml-auth-proto-misp-app/browse/events
cluster: https://cluster.unified-platform.io
project: saml-auth-proto-misp-app
I verified the image exists. It was running, and then we tried to re-deploy it. Once one of the pods gets into this state, we cannot start up any new pods, even in other projects.
10:28:11 AM saml-auth-proto-misp-app-web-2-deploy Pod Warning Failed Scheduling 0/9 nodes are available: 3 node(s) had taints that the pod didn't tolerate, 6 node(s) didn't match node selector.
6021 times in the last 2 hours
10:26:15 AM rf-domain-sync-1576224000-jxqvt Pod Normal Pulling pulling image "docker-registry.default.svc:5000/saml-auth-proto-misp-app/saml-auth-proto-misp-app-rf-feed-image:latest"
4132 times in the last 3 days
10:26:04 AM rf-domain-sync-1576224000-jxqvt Pod Warning Failed Error: ErrImagePull
106 times in the last 2 days
10:26:04 AM rf-domain-sync-1576224000-jxqvt Pod Warning Failed Failed to pull image "docker-registry.default.svc:5000/saml-auth-proto-misp-app/saml-auth-proto-misp-app-rf-feed-image:latest": rpc error: code = Unknown desc = Get https://docker-registry.default.svc:5000/v2/saml-auth-proto-misp-app/saml-auth-proto-misp-app-rf-feed-image/manifests/latest: Get https://docker-registry.default.svc:5000/openshift/token?account=serviceaccount&scope=repository%3Asaml-auth-proto-misp-app%2Fsaml-auth-proto-misp-app-rf-feed-image%3Apull: net/http: request canceled (Client.Timeout exceeded while awaiting headers)
41 times in the last 15 hours
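For anyone reading the scheduling warning above, here is a minimal Python sketch that splits that message into per-reason node counts. It is not part of the original thread: the function name is hypothetical, and it assumes the message always has the shape quoted in the event log ("N/M nodes are available: ..."). It says nothing about why the nodes carry taints.

```python
import re


def parse_scheduling_message(message):
    """Split a scheduler warning into (available, total, reasons).

    Hypothetical helper; assumes the message shape seen in the event log.
    """
    head = re.match(r"(\d+)/(\d+) nodes are available: (.*)", message)
    if head is None:
        raise ValueError("unrecognized scheduler message")
    available, total = int(head.group(1)), int(head.group(2))

    # Each comma-separated part looks like "<count> node(s) <reason>".
    reasons = {}
    for part in head.group(3).rstrip(".").split(", "):
        m = re.match(r"(\d+) node\(s\) (.*)", part)
        if m:
            reasons[m.group(2)] = int(m.group(1))
    return available, total, reasons


# Message text copied from the 10:28:11 AM event above.
message = ("0/9 nodes are available: 3 node(s) had taints that the pod "
           "didn't tolerate, 6 node(s) didn't match node selector.")
available, total, reasons = parse_scheduling_message(message)
for reason, count in reasons.items():
    print(f"{count} of {total} nodes: {reason}")
```

Under that assumption, the two counts (3 tainted, 6 not matching the node selector) add up to all 9 nodes, which is why the scheduler reports 0 available.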
_______________________________________________
platformONE mailing list
platformONE at redhat.com
https://www.redhat.com/mailman/listinfo/platformone