@@ -24,6 +24,9 @@ This guide covers the following types of hosts:
2424- Compute hosts
2525- Storage hosts
2626- Seed
27+
28+ The following types of hosts will be covered in the future:
29+
2730- Seed hypervisor
2831- Ansible control host
2932- Wazuh manager
@@ -61,8 +64,9 @@ Configuration
6164
6265Make the following changes to your Kayobe configuration:
6366
64- - Set ``os_distribution`` to ``rocky`` in ``etc/kayobe/globals.yml``
65- - Set ``os_release`` to ``"9"`` in ``etc/kayobe/globals.yml``
67+ - Merge in the latest ``stackhpc-kayobe-config`` ``stackhpc/yoga`` branch.
68+ - Set ``os_distribution`` to ``rocky`` in ``etc/kayobe/globals.yml``.
69+ - Set ``os_release`` to ``"9"`` in ``etc/kayobe/globals.yml``.
6670- If you are using multiple Kayobe environments, add the following into
6771 ``kayobe-config/etc/kayobe/environments/<env>/kolla/config/nova.conf``
6872 (as Kolla custom service config environment merging is not supported in
@@ -166,16 +170,11 @@ Deploy latest CentOS Stream 8 images
166170------------------------------------
167171
168172Make sure you deploy the latest CentOS Stream 8 containers prior to
169- this migration.
170-
171- The usual steps apply:
172-
173- - Merge in the latest changes from the ``stackhpc-kayobe-config`` ``stackhpc/yoga`` branch
174- - Upgrade services
173+ this migration:
175174
176- .. code-block:: console
175+ .. code-block:: console
177176
178- kayobe overcloud service deploy
177+ kayobe overcloud service deploy
179178
180179 Controllers
181180===========
@@ -220,43 +219,82 @@ Full procedure for one host
220219
221220 kayobe overcloud host command run --command 'docker exec -it ovn_sb_db ovs-appctl -t /run/ovn/ovnsb_db.ctl cluster/status OVN_Southbound' --show-output -l controllers
222221
223- 4. Deprovision the controller:
222+ 4. If the controller is running Ceph services:
223+
224+ 1. Set host in maintenance mode:
225+
226+ .. code-block:: console
227+
228+ ceph orch host maintenance enter <hostname>
229+
230+ 2. Check there's nothing remaining on the host:
231+
232+ .. code-block:: console
233+
234+ ceph orch ps <hostname>
235+
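Before deprovisioning, the ``ceph orch ps`` check above can be made programmatic. A minimal sketch (the helper name is hypothetical, and matching on a ``running`` status string in the plain-text output is an assumption; a host in maintenance may still list daemons, but none should be running):

```shell
# Sketch: fail if `ceph orch ps` still reports any running daemon on the
# host. After `ceph orch host maintenance enter`, daemons may still be
# listed for the host, but they should no longer be in the running state.
none_running_on_host() {
    hostname="$1"
    if ceph orch ps "$hostname" 2>/dev/null | grep -q "running"; then
        echo "warning: daemons still running on $hostname" >&2
        return 1
    fi
    echo "ok: no running daemons on $hostname"
}
```

If daemons are still reported running, wait and re-check before moving on to the deprovision step.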
236+ 5. Deprovision the controller:
224237
225238 .. code:: console
226239
227240 kayobe overcloud deprovision -l <hostname>
228241
229- 5. Reprovision the controller:
242+ 6. Reprovision the controller:
230243
231244 .. code:: console
232245
233246 kayobe overcloud provision -l <hostname>
234247
235- 6. Host configure:
248+ 7. Host configure:
236249
237250 .. code:: console
238251
239252 kayobe overcloud host configure -l <hostname> -kl <hostname>
240253
241- 7. Service deploy on all controllers:
254+ 8. If the controller is running Ceph OSD services:
255+
256+ 1. Make sure the cephadm public key is in ``authorized_keys`` for the stack
257+ or root user, depending on your setup. For example, your SSH key may
258+ already be defined in ``users.yml``. If in doubt, run the cephadm
259+ deploy playbook to copy the SSH key and install the cephadm binary.
260+
261+ .. code-block:: console
262+
263+ kayobe playbook run $KAYOBE_CONFIG_PATH/ansible/cephadm-deploy.yml
264+
265+ 2. Take the host out of maintenance mode:
266+
267+ .. code-block:: console
268+
269+ ceph orch host maintenance exit <hostname>
270+
271+ 3. Make sure that everything is back in working condition before moving
272+ on to the next host:
273+
274+ .. code-block:: console
275+
276+ ceph -s
277+ ceph -w
278+
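Rather than watching ``ceph -s`` and ``ceph -w`` by hand, the wait can be scripted. A sketch, assuming ``HEALTH_OK`` is the desired gate (the function name is hypothetical; relax the match if transient ``HEALTH_WARN`` states such as rebalancing are acceptable in your environment):

```shell
# Sketch: poll `ceph -s` until the cluster reports HEALTH_OK, with a
# bounded number of retries so the loop cannot hang forever.
wait_for_ceph_health() {
    retries="${1:-60}"   # number of attempts, 10 seconds apart
    while [ "$retries" -gt 0 ]; do
        if ceph -s | grep -q "HEALTH_OK"; then
            echo "cluster healthy"
            return 0
        fi
        retries=$((retries - 1))
        sleep 10
    done
    echo "cluster did not reach HEALTH_OK" >&2
    return 1
}
```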
279+ 9. Service deploy on all controllers:
242280
243281 .. code:: console
244282
245283 kayobe overcloud service deploy -kl controllers
246284
247- 8. If using OVN, check OVN northbound DB cluster state on all controllers to see if the new host has joined:
285+ 10. If using OVN, check OVN northbound DB cluster state on all controllers to see if the new host has joined:
248286
249- .. code:: console
287+ .. code:: console
250288
251- kayobe overcloud host command run --command 'docker exec -it ovn_nb_db ovs-appctl -t /run/ovn/ovnnb_db.ctl cluster/status OVN_Northbound' --show-output -l controllers
289+ kayobe overcloud host command run --command 'docker exec -it ovn_nb_db ovs-appctl -t /run/ovn/ovnnb_db.ctl cluster/status OVN_Northbound' --show-output -l controllers
252290
253- 9. If using OVN, check OVN southbound DB cluster state on all controllers to see if the new host has joined:
291+ 11. If using OVN, check OVN southbound DB cluster state on all controllers to see if the new host has joined:
254292
255- .. code:: console
293+ .. code:: console
256294
257- kayobe overcloud host command run --command 'docker exec -it ovn_sb_db ovs-appctl -t /run/ovn/ovnsb_db.ctl cluster/status OVN_Southbound' --show-output -l controllers
295+ kayobe overcloud host command run --command 'docker exec -it ovn_sb_db ovs-appctl -t /run/ovn/ovnsb_db.ctl cluster/status OVN_Southbound' --show-output -l controllers
258296
259- 10. Some MariaDB instability has been observed. The exact cause is unknown but
297+ 12. Some MariaDB instability has been observed. The exact cause is unknown but
260298 the simplest fix seems to be to run the Kayobe database recovery tool
261299 between migrations.
262300
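The database recovery tool referred to here is Kayobe's wrapper around the Kolla Ansible MariaDB recovery; running it between migrations would look like:

```console
kayobe overcloud database recover
```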
@@ -291,25 +329,64 @@ Full procedure for one batch of hosts
291329
292330 kayobe playbook run $KAYOBE_CONFIG_PATH/ansible/nova-compute-{disable,drain}.yml --limit <host>
293331
294- 2. Deprovision the compute node:
332+ 2. If the compute node is running Ceph OSD services:
333+
334+ 1. Set host in maintenance mode:
335+
336+ .. code-block:: console
337+
338+ ceph orch host maintenance enter <hostname>
339+
340+ 2. Check there's nothing remaining on the host:
341+
342+ .. code-block:: console
343+
344+ ceph orch ps <hostname>
345+
346+ 3. Deprovision the compute node:
295347
296348 .. code:: console
297349
298350 kayobe overcloud deprovision -l <hostname>
299351
300- 3. Reprovision the compute node:
352+ 4. Reprovision the compute node:
301353
302354 .. code:: console
303355
304356 kayobe overcloud provision -l <hostname>
305357
306- 4. Host configure:
358+ 5. Host configure:
307359
308360 .. code:: console
309361
310362 kayobe overcloud host configure -l <hostname> -kl <hostname>
311363
312- 5. Service deploy:
364+ 6. If the compute node is running Ceph OSD services:
365+
366+ 1. Make sure the cephadm public key is in ``authorized_keys`` for the stack
367+ or root user, depending on your setup. For example, your SSH key may
368+ already be defined in ``users.yml``. If in doubt, run the cephadm
369+ deploy playbook to copy the SSH key and install the cephadm binary.
370+
371+ .. code-block:: console
372+
373+ kayobe playbook run $KAYOBE_CONFIG_PATH/ansible/cephadm-deploy.yml
374+
375+ 2. Take the host out of maintenance mode:
376+
377+ .. code-block:: console
378+
379+ ceph orch host maintenance exit <hostname>
380+
381+ 3. Make sure that everything is back in working condition before moving
382+ on to the next host:
383+
384+ .. code-block:: console
385+
386+ ceph -s
387+ ceph -w
388+
389+ 7. Service deploy:
313390
314391 .. code:: console
315392
@@ -320,8 +397,6 @@ If any VMs were powered off, they may now be powered back on.
320397Wait for Prometheus alerts and errors in OpenSearch Dashboard to resolve, or
321398address them.
322399
323- After updating controllers or network hosts, run any appropriate smoke tests.
324-
325400Once happy that the system has been restored to full health, move on to the
326401next host or batch of hosts.
327402
@@ -380,13 +455,13 @@ Full procedure for any storage host
380455
381456 kayobe playbook run $KAYOBE_CONFIG_PATH/ansible/cephadm-deploy.yml
382457
383- 6. Take the host out of maintenance mode:
458+ 7. Take the host out of maintenance mode:
384459
385460 .. code-block:: console
386461
387462 ceph orch host maintenance exit <hostname>
388463
389- 7. Make sure that everything is back in working condition before moving
464+ 8. Make sure that everything is back in working condition before moving
390465 on to the next host:
391466
392467 .. code-block:: console
@@ -426,75 +501,81 @@ Full procedure
426501
427502 lsblk
428503
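The mount check in the next step can be made explicit rather than read off the ``lsblk`` output. A sketch, assuming ``mountpoint`` from util-linux is available on the seed (the helper name is hypothetical):

```shell
# Sketch: succeed only when the given path is itself a mount point,
# i.e. the data volume is mounted there rather than living on the
# root filesystem.
is_dedicated_mount() {
    mountpoint -q "$1"
}

# Example decision, mirroring the data-volume step below:
if is_dedicated_mount /var/lib/docker || is_dedicated_mount /var/lib/docker/volumes; then
    echo "data volume found; no external copy needed"
else
    echo "no dedicated data volume; copy the data to the seed hypervisor first"
fi
```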
429- 2. If the data volume is not mounted at either ``/var/lib/docker`` or
504+ 2. Use `mysqldump
505+ <https://docs.openstack.org/kayobe/yoga/administration/seed.html#database-backup-restore>`_
506+ to take a backup of the MariaDB database. Copy the backup file to one of
507+ the Bifrost container's persistent volumes, such as ``/var/lib/ironic/`` in
508+ the ``bifrost_deploy`` container.
509+
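The backup described above might look roughly like the following. This is a sketch only: the flags and backup filename are illustrative, so follow the linked seed administration documentation for the exact procedure.

```console
sudo docker exec bifrost_deploy \
    mysqldump --all-databases --single-transaction \
    --result-file=/var/lib/ironic/seed-mariadb-backup.sql
```

Writing the dump under ``/var/lib/ironic/`` keeps it on a persistent volume that survives the reprovision.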
510+ 3. If the data volume is not mounted at either ``/var/lib/docker`` or
430511 ``/var/lib/docker/volumes``, make an external copy of the data
431512 somewhere on the seed hypervisor.
432513
433- 3. On the seed, stop the MariaDB process within the bifrost_deploy
514+ 4. On the seed, stop the MariaDB process within the bifrost_deploy
434515 container:
435516
436517 .. code:: console
437518
438519 sudo docker exec bifrost_deploy systemctl stop mariadb
439520
440- 4. On the seed, stop docker:
521+ 5. On the seed, stop docker:
441522
442523 .. code:: console
443524
444525 sudo systemctl stop docker
445526
446- 5. On the seed, shut down the host:
527+ 6. On the seed, shut down the host:
447528
448529 .. code:: console
449530
450531 sudo systemctl poweroff
451532
452- 6. Wait for the VM to shut down:
533+ 7. Wait for the VM to shut down:
453534
454535 .. code:: console
455536
456537 watch sudo virsh list --all
457538
458- 7. Back up the VM volumes on the seed hypervisor
539+ 8. Back up the VM volumes on the seed hypervisor
459540
460541 .. code:: console
461542
462543 sudo mkdir /var/lib/libvirt/images-backup
463544 sudo cp -r /var/lib/libvirt/images/. /var/lib/libvirt/images-backup
464545
465- 8. Delete the seed root volume (check the structure & naming
546+ 9. Delete the seed root volume (check the structure & naming
466547 conventions first)
467548
468549 .. code:: console
469550
470551 sudo virsh vol-delete seed-root --pool default
471552
472- 9. Reprovision the seed
553+ 10. Reprovision the seed
473554
474- .. code:: console
555+ .. code:: console
475556
476- kayobe seed vm provision
557+ kayobe seed vm provision
477558
478- 10. Seed host configure
559+ 11. Seed host configure
479560
480561 .. code:: console
481562
482563 kayobe seed host configure
483564
484- 11. Rebuild seed container images (if using locally-built rather than
565+ 12. Rebuild seed container images (if using locally-built rather than
485566 release train images)
486567
487568 .. code:: console
488569
489570 kayobe seed container image build --push
490571
491- 12. Service deploy
572+ 13. Service deploy
492573
493574 .. code :: console
494575
495576 kayobe seed service deploy
496577
497- 13. Verify that Bifrost/Ironic is healthy.
578+ 14. Verify that Bifrost/Ironic is healthy.
498579
499580Seed hypervisor
500581===============