[Platformone] [External] Re: cluster issue submitted - pod will not deploy - taint errors

Starling, Jennifer [MERIDIAN TECHNOLOGIES I, LLC] jennifer.starling at accenturefederal.com
Tue Dec 17 15:29:36 UTC 2019


Any updates on the cluster?  We are blocked.

From: Jonathan Rickard <jrickard at redhat.com>
Date: Monday, December 16, 2019 at 11:47 AM
To: "Starling, Jennifer [MERIDIAN TECHNOLOGIES I, LLC]" <jennifer.starling at accenturefederal.com>
Cc: "platformONE at redhat.com" <platformONE at redhat.com>, "Jared Crace [Mantech]" <Jared.Crace at mantech.com>, "Joseph Middleton (Confluence)" <confluence-noreply at di2e.net>, "Mark Sanchez [Gov]" <mark.sanchez.8 at us.af.mil>
Subject: [External] Re: [Platformone] cluster issue submitted - pod will not deploy - taint errors



Jennifer,

Thank you for providing this information. We are actively engaged to restore service to the cluster. FYI, the issue is related to the number of devices attached to each of the instances.

jonny

Jonathan Rickard, RHCE, RHCA

Consulting Architect

Red Hat Public Sector <https://www.redhat.com/>

jonny at redhat.com
M: 210.862.9739
@redhatjobs <https://twitter.com/redhatjobs>   redhatjobs <https://www.facebook.com/redhatjobs>   @redhatjobs <https://instagram.com/redhatjobs>


On Mon, Dec 16, 2019 at 10:39 AM Starling, Jennifer [MERIDIAN TECHNOLOGIES I, LLC] <jennifer.starling at accenturefederal.com> wrote:
I submitted an issue: https://dccscr.dsop.io/unified-platform-node-aam/misp-images/issues/1
I do not know how to make it a blocker, but it is a blocker.  When this happens, we cannot create new pods or re-deploy in any projects.

Thank you for your assistance. Please let me know if you require further information.

Jen…

Our pod is getting errors about taints. Events: https://cluster.unified-platform.io/console/project/saml-auth-proto-misp-app/browse/events

cluster: https://cluster.unified-platform.io
project: saml-auth-proto-misp-app

I verified that the image exists. It was running, and then we tried to re-deploy it. Once one of the pods gets into this state, we cannot start any new pods, even in other projects.

10:28:11 AM    saml-auth-proto-misp-app-web-2-deploy  Pod     Warning        Failed Scheduling     0/9 nodes are available: 3 node(s) had taints that the pod didn't tolerate, 6 node(s) didn't match node selector.

6021 times in the last 2 hours

10:26:15 AM    rf-domain-sync-1576224000-jxqvt       Pod     Normal  Pulling          pulling image "docker-registry.default.svc:5000/saml-auth-proto-misp-app/saml-auth-proto-misp-app-rf-feed-image:latest"

4132 times in the last 3 days

10:26:04 AM    rf-domain-sync-1576224000-jxqvt       Pod     Warning        Failed          Error: ErrImagePull

106 times in the last 2 days

10:26:04 AM    rf-domain-sync-1576224000-jxqvt       Pod     Warning        Failed          Failed to pull image "docker-registry.default.svc:5000/saml-auth-proto-misp-app/saml-auth-proto-misp-app-rf-feed-image:latest": rpc error: code = Unknown desc = Get https://docker-registry.default.svc:5000/v2/saml-auth-proto-misp-app/saml-auth-proto-misp-app-rf-feed-image/manifests/latest: Get https://docker-registry.default.svc:5000/openshift/token?account=serviceaccount&scope=repository%3Asaml-auth-proto-misp-app%2Fsaml-auth-proto-misp-app-rf-feed-image%3Apull: net/http: request canceled (Client.Timeout exceeded while awaiting headers)

41 times in the last 15 hours
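
The Failed Scheduling message above accounts for all nine nodes: three rejected because of taints and six because of the node selector. As a rough illustration only, the Python sketch below splits that one message into per-reason counts. It assumes the exact message format shown in the event log; the helper name `parse_failed_scheduling` is hypothetical and not part of any OpenShift or Kubernetes tooling.

```python
import re

# Scheduler message copied verbatim from the event log above.
MESSAGE = ("0/9 nodes are available: 3 node(s) had taints that the pod "
           "didn't tolerate, 6 node(s) didn't match node selector.")


def parse_failed_scheduling(message):
    """Split a scheduling-failure message into (available, total, reasons).

    Hypothetical helper: it only understands the format of MESSAGE above.
    """
    head, _, tail = message.partition(": ")
    match = re.fullmatch(r"(\d+)/(\d+) nodes are available", head)
    if match is None:
        raise ValueError("unexpected message format: %r" % message)
    available, total = int(match.group(1)), int(match.group(2))

    reasons = {}
    for part in tail.rstrip(".").split(", "):
        part_match = re.fullmatch(r"(\d+) node\(s\) (.+)", part)
        if part_match is None:
            raise ValueError("unexpected reason format: %r" % part)
        reasons[part_match.group(2)] = int(part_match.group(1))
    return available, total, reasons


available, total, reasons = parse_failed_scheduling(MESSAGE)
for reason, count in reasons.items():
    print(count, "of", total, "nodes:", reason)
```

Under those assumptions the per-reason counts sum to the node total, so every node was excluded for one of the two listed reasons and none was left schedulable.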

_______________________________________________
platformONE mailing list
platformONE at redhat.com
https://www.redhat.com/mailman/listinfo/platformone