[rdo-list] HA overcloud-deploy.sh crashes again ( ControllerOvercloudServicesDeployment_Step4 )

Dan Sneddon dsneddon at redhat.com
Wed Jun 29 17:46:19 UTC 2016


On 06/29/2016 10:42 AM, Dan Sneddon wrote:
> On 06/29/2016 07:03 AM, Boris Derzhavets wrote:
>> HeatCrash2.txt 1.gz <https://1drv.ms/u/s!AqjiDzRpwaKogSHAekH8ZluOaclk>
>>
>> Reattaching the gzip archive via OneDrive.
>>
>>
>>
>> -----------------------------------------------------------------------
>> *From:* rdo-list-bounces at redhat.com <rdo-list-bounces at redhat.com> on
>> behalf of Boris Derzhavets <bderzhavets at hotmail.com>
>> *Sent:* Wednesday, June 29, 2016 9:36 AM
>> *To:* John Trowbridge; shardy at redhat.com
>> *Cc:* rdo-list at redhat.com
>> *Subject:* [rdo-list] HA overcloud-deploy.sh crashes again (
>> ControllerOvercloudServicesDeployment_Step4 )
>>  
>>
>> I attempted to follow the steps suggested
>> in http://hardysteven.blogspot.ru/2016/06/tripleo-partial-stack-updates.html
>>
>>
>> ./deploy-overstack crashes:
>>
>>
>> 2016-06-29 12:42:41
>> [overcloud-ControllerNodesPostDeployment-2r4tlv5icaxk-ControllerOvercloudServicesDeployment_Step4-nzdoizlgrmx2]:
>> CREATE_FAILED Resource CREATE failed: Error: resources[0]: Deployment
>> to server failed: deploy_status_code : Deployment exited with non-zero
>> status code: 6
>> 2016-06-29 12:42:42 [ControllerOvercloudServicesDeployment_Step4]:
>> CREATE_FAILED Error:
>> resources.ControllerOvercloudServicesDeployment_Step4.resources[0]:
>> Deployment to server failed: deploy_status_code: Deployment exited with
>> non-zero status code: 6
>> 2016-06-29 12:42:43
>> [overcloud-ControllerNodesPostDeployment-2r4tlv5icaxk]: CREATE_FAILED
>> Resource CREATE failed: Error:
>> resources.ControllerOvercloudServicesDeployment_Step4.resources[0]:
>> Deployment to server failed: deploy_status_code: Deployment exited with
>> non-zero status code: 6
>> 2016-06-29 12:42:44 [ControllerNodesPostDeployment]: CREATE_FAILED
>> Error:
>> resources.ControllerNodesPostDeployment.resources.ControllerOvercloudServicesDeployment_Step4.resources[0]:
>> Deployment to server failed: deploy_status_code: Deployment exited with
>> non-zero status code: 6
>> 2016-06-29 12:42:44 [2]: SIGNAL_COMPLETE Unknown
>> 2016-06-29 12:42:45 [2]: SIGNAL_COMPLETE Unknown
>> 2016-06-29 12:42:45 [2]: SIGNAL_COMPLETE Unknown
>> 2016-06-29 12:42:46 [overcloud]: CREATE_FAILED Resource CREATE failed:
>> Error:
>> resources.ControllerNodesPostDeployment.resources.ControllerOvercloudServicesDeployment_Step4.resources[0]:
>> Deployment to server failed: deploy_status_code: Deployment exited with
>> non-zero status code: 6
>> 2016-06-29 12:42:46 [2]: SIGNAL_COMPLETE Unknown
>> 2016-06-29 12:42:47 [2]: SIGNAL_COMPLETE Unknown
>> 2016-06-29 12:42:47 [ControllerDeployment]: SIGNAL_COMPLETE Unknown
>> 2016-06-29 12:42:48 [NetworkDeployment]: SIGNAL_COMPLETE Unknown
>> 2016-06-29 12:42:48 [2]: SIGNAL_COMPLETE Unknown
>> Stack overcloud CREATE_FAILED
>> Deployment failed:  Heat Stack create failed.
>> + heat stack-list
>> + grep -q CREATE_FAILED
>> + deploy_status=1
>> ++ heat resource-list --nested-depth 5 overcloud
>> ++ grep FAILED
>> ++ grep 'StructuredDeployment '
>> ++ cut -d '|' -f3
>> + for failed in '$(heat resource-list         --nested-depth 5
>> overcloud | grep FAILED |
>>         grep '\''StructuredDeployment '\'' | cut -d '\''|'\'' -f3)'
>> + heat deployment-show 655c77fc-6a78-4cca-b4b7-a153a3f4ad52
>> + for failed in '$(heat resource-list         --nested-depth 5
>> overcloud | grep FAILED |
>>         grep '\''StructuredDeployment '\'' | cut -d '\''|'\'' -f3)'
>> + heat deployment-show 1fe5153c-e017-4ee5-823a-3d1524430c1d
>> + for failed in '$(heat resource-list         --nested-depth 5
>> overcloud | grep FAILED |
>>         grep '\''StructuredDeployment '\'' | cut -d '\''|'\'' -f3)'
>> + heat deployment-show bf6f25f4-d812-41e9-a7a8-122de619a624
>> + exit 1
>>
>> *****************************
>> Troubleshooting steps :-
>> *****************************
>>
>> [stack at undercloud ~]$ . stackrc
>> [stack at undercloud ~]$ heat resource-list overcloud | grep ControllerNodesPost
>> | ControllerNodesPostDeployment | f1d6a474-c946-46bf-ab0c-2fdaeb55d0b3 | OS::TripleO::ControllerPostDeployment | CREATE_FAILED | 2016-06-29T12:11:21 |
>>
>>
>> [stack at undercloud ~]$ heat stack-list -n | grep "^| f1d6a474-c946-46bf-ab0c-2fdaeb55d0b3"
>> | f1d6a474-c946-46bf-ab0c-2fdaeb55d0b3 | overcloud-ControllerNodesPostDeployment-2r4tlv5icaxk | CREATE_FAILED | 2016-06-29T12:31:11 | None | 17f82f6e-e0ca-44c6-9058-de82c00d4f79 |
>>
>>
>>
>> [stack at undercloud ~]$ heat event-list -m f1d6a474-c946-46bf-ab0c-2fdaeb55d0b3 overcloud-ControllerNodesPostDeployment-2r4tlv5icaxk
>>
>> | resource_name | id | resource_status_reason | resource_status | event_time |
>> | overcloud-ControllerNodesPostDeployment-2r4tlv5icaxk | 10ec0cf9-b3c9-4191-9966-3f4d47f27e2a | Stack CREATE started | CREATE_IN_PROGRESS | 2016-06-29T12:31:14 |
>> . . . . . . . . . . . . . . . . .
>> Step1,2,3 succeeded
>> . . . . . . . . . . . . . . . . .
>> | ControllerPuppetConfig | a2a1df33-5106-425c-b16d-8d2df709b19f | state changed | CREATE_COMPLETE | 2016-06-29T12:35:02 |
>> | ControllerOvercloudServicesDeployment_Step4 | 1e151333-4de5-4e7b-907c-ea0f42d31a47 | state changed | CREATE_IN_PROGRESS | 2016-06-29T12:35:03 |
>> | ControllerOvercloudServicesDeployment_Step4 | 7bf36334-3d92-4554-b6c0-41294a072ab6 | Error: resources.ControllerOvercloudServicesDeployment_Step4.resources[0]: Deployment to server failed: deploy_status_code: Deployment exited with non-zero status code: 6 | CREATE_FAILED | 2016-06-29T12:42:42 |
>> | overcloud-ControllerNodesPostDeployment-2r4tlv5icaxk | e72fb6f4-c2aa-4fe8-9bd1-5f5ad152685c | Resource CREATE failed: Error: resources.ControllerOvercloudServicesDeployment_Step4.resources[0]: Deployment to server failed: deploy_status_code: Deployment exited with non-zero status code: 6 | CREATE_FAILED | 2016-06-29T12:42:43 |
>>
>> [stack at undercloud ~]$ heat stack-show overcloud-ControllerNodesPostDeployment-2r4tlv5icaxk | grep NodeConfigIdentifiers
>> |                       |   "NodeConfigIdentifiers":
>> "{u'deployment_identifier': 1467202276, u'controller_config': {u'1':
>> u'os-apply-config deployment 796df02a-7550-414b-a084-8b591a13e6db
>> completed,Root CA cert injection not enabled.,TLS not enabled.,None,',
>> u'0': u'os-apply-config deployment 613ec889-d852-470a-8e4c-6e243e1d2033
>> completed,Root CA cert injection not enabled.,TLS not enabled.,None,',
>> u'2': u'os-apply-config deployment c8b099d0-3af4-4ba0-a056-a0ce60f40e2d
>> completed,Root CA cert injection not enabled.,TLS not enabled.,None,'},
>> u'allnodes_extra': u'none'}" |
>>
>> However, since the stack creation itself failed, an update does not help:
>>
>> [stack at undercloud ~]$ heat stack-update -x
>> overcloud-ControllerNodesPostDeployment-2r4tlv5icaxk   -e update_env.yaml
>> ERROR: PATCH update to non-COMPLETE stack is not supported.
>>
>> This is due to the stack status:
>>
>> [stack at undercloud ~]$ heat stack-list
>> +--------------------------------------+------------+---------------+---------------------+--------------+
>> | id                                   | stack_name | stack_status  | creation_time       | updated_time |
>> +--------------------------------------+------------+---------------+---------------------+--------------+
>> | 17f82f6e-e0ca-44c6-9058-de82c00d4f79 | overcloud  | CREATE_FAILED | 2016-06-29T12:11:20 | None         |
>> +--------------------------------------+------------+---------------+---------------------+--------------+
>>
>>
>> The complete output of `heat deployment-show
>> 655c77fc-6a78-4cca-b4b7-a153a3f4ad52` is attached as a gzip archive.
>>
>>
>> Thanks.
>>
>> Boris.
>>
> 
> The failure occurred during the post-deployment, which means that the
> initial deployment succeeded, but then the steps that are done to the
> completed overcloud failed.
> 
> This is most commonly attributable to network problems between the
> Undercloud and the Overcloud Public API. The Undercloud needs to reach
> the Public API in order to do some of the post-configuration steps. If
> this API isn't reachable, you end up with the error you saw above.
> 
> You can test this connectivity by pinging the Public API VIP from the
> Undercloud. Starting with the failed deployment, run "neutron
> port-list" against the Undercloud and look for the IP on the port
> named "public_virtual_ip". You should be able to ping this address from
> the Undercloud. If you can't reach that IP, then you need to check the
> connectivity/routing between the Undercloud and the External network on
> the Overcloud.
> 
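
The check described above could be scripted roughly as follows. This is
only a sketch, not part of the deployment tooling: the function names
are made up for illustration, and it assumes the usual table layout of
"neutron port-list" output, where the fixed_ips column holds a JSON-like
dict with an "ip_address" key. Run it on the Undercloud after sourcing
stackrc.

```python
import re
import subprocess


def find_public_vip(port_list_output, port_name="public_virtual_ip"):
    """Return the IP of the named port from 'neutron port-list' text, or None.

    Assumes the table format printed by the neutron CLI: cells separated
    by '|', with the IP embedded as "ip_address": "x.x.x.x" in fixed_ips.
    """
    for line in port_list_output.splitlines():
        cells = [cell.strip() for cell in line.split("|")]
        if port_name in cells:
            match = re.search(r'"ip_address":\s*"([^"]+)"', line)
            if match:
                return match.group(1)
    return None


def can_ping(ip, count=3):
    """Return True if 'ping -c <count> <ip>' exits with status 0."""
    result = subprocess.run(
        ["ping", "-c", str(count), ip],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def check_public_api():
    """Look up the Public API VIP via the neutron CLI and ping it."""
    output = subprocess.run(
        ["neutron", "port-list"],
        stdout=subprocess.PIPE,
        universal_newlines=True,
        check=True,
    ).stdout
    vip = find_public_vip(output)
    if vip is None:
        print("No port named public_virtual_ip found")
        return False
    reachable = can_ping(vip)
    print("Public API VIP %s reachable: %s" % (vip, reachable))
    return reachable
```

Calling check_public_api() returns False when the VIP cannot be found
or does not answer pings, which points at the connectivity/routing
between the Undercloud and the External network.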

I should also mention common causes of this problem:

* Incorrect value for ExternalInterfaceDefaultRoute in the network
environment file.
* Controllers do not have the default route on the External network in
the NIC config templates (required for reachability from remote subnets).
* Incorrect subnet mask on the ExternalNetCidr in the network environment.
* Incorrect ExternalAllocationPools values in the network environment.
* Incorrect Ethernet switch config for the Controllers.
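
Several of the causes above are plain inconsistencies between values in
the network environment file, and those can be caught before deploying.
The following is a minimal sketch using only the Python standard
library. The function name is hypothetical, the example addresses are
made up, and it assumes ExternalAllocationPools is a list of dicts with
'start' and 'end' keys. It cannot detect a missing default route in the
NIC config templates or a switch misconfiguration.

```python
import ipaddress


def check_external_network(cidr, allocation_pools, default_route):
    """Return a list of problems found in the External network settings.

    cidr             -- value of ExternalNetCidr, e.g. "10.0.0.0/24"
    allocation_pools -- value of ExternalAllocationPools, assumed to be
                        a list of {"start": ..., "end": ...} dicts
    default_route    -- value of ExternalInterfaceDefaultRoute
    """
    problems = []
    # strict=False tolerates host bits in the CIDR (e.g. 10.0.0.1/24).
    net = ipaddress.ip_network(cidr, strict=False)
    gateway = ipaddress.ip_address(default_route)

    if gateway not in net:
        problems.append(
            "default route %s is not inside %s" % (gateway, net))

    for pool in allocation_pools:
        start = ipaddress.ip_address(pool["start"])
        end = ipaddress.ip_address(pool["end"])
        if start not in net or end not in net:
            problems.append(
                "pool %s-%s is not inside %s" % (start, end, net))
        elif start > end:
            problems.append(
                "pool start %s is after pool end %s" % (start, end))
        elif gateway in net and start <= gateway <= end:
            # Assumption: the gateway should not be handed out to a node.
            problems.append(
                "default route %s falls inside pool %s-%s"
                % (gateway, start, end))
    return problems


if __name__ == "__main__":
    # Hypothetical values; substitute those from your environment file.
    for problem in check_external_network(
            "10.0.0.0/24",
            [{"start": "10.0.0.10", "end": "10.0.0.50"}],
            "10.0.0.1"):
        print(problem)
```

An empty list means the three values are at least consistent with each
other; it does not prove they match the physical network.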

-- 
Dan Sneddon         |  Principal OpenStack Engineer
dsneddon at redhat.com |  redhat.com/openstack
650.254.4025        |  dsneddon:irc   @dxs:twitter



