@@ -233,11 +233,14 @@ Potential issues
233233 .. code-block:: yaml
234234
235235 mariabackup_image_full: "{{ docker_registry }}/stackhpc/rocky-source-mariadb-server:yoga-20230310T170929"
236- - When using Octavia load balancers, restarting Neutron causes load balancers
237- with floating IPs to stop processing traffic. See `LP#2042938
238- <https://bugs.launchpad.net/neutron/+bug/2042938>`__ for details. The issue
239- may be worked around after Neutron has been restarted by detaching then
240- reattaching the floating IP to the load balancer's virtual IP.
236+ - When using Octavia load balancers, restarting Neutron causes load balancers
237+ with floating IPs to stop processing traffic. See `LP#2042938
238+ <https://bugs.launchpad.net/neutron/+bug/2042938>`__ for details. The issue
239+ may be worked around after Neutron has been restarted by detaching then
240+ reattaching the floating IP to the load balancer's virtual IP.
241+
242+ - If you are using hyper-converged Ceph, please also note the potential issues
243+ in the Storage section below.
241244
242245Full procedure for one host
243246---------------------------
@@ -466,6 +469,44 @@ Potential issues
466469 be identical, now that the "maintenance mode approach" is being used.
467470 It is still recommended to do the bootstrap host last.
468471
472+ - Prior to reprovisioning the bootstrap host, it can be beneficial to back up
473+ ``/etc/ceph`` and ``/var/lib/ceph``, as the keys, config, etc. stored there
474+ are sometimes not moved or recreated correctly.
475+
476+ - When a host is taken out of maintenance, you may see errors relating to
477+ the permissions of ``/tmp/etc`` and ``/tmp/var``. These issues should be
478+ resolved in Ceph version 17.2.7. See pull request: https://github.com/ceph/ceph/pull/50736. In
479+ the meantime, you can work around this by running the command below. You may
480+ need to omit one or the other of ``/tmp/etc`` and ``/tmp/var``. You will
481+ likely need to run this multiple times. Run ``ceph -W cephadm`` to monitor
482+ the logs and see when permission issues are hit.
483+
484+ .. code-block:: console
485+
486+ kayobe overcloud host command run --command "chown -R stack:stack /tmp/etc /tmp/var" -b -l storage
487+
488+ - Sometimes the Ceph containers do not come up after reprovisioning.
489+ This seems to be related to ``/var/lib/ceph`` persisting through
490+ the reprovision (e.g. seen at a customer where it was in a volume
491+ with software RAID). The root cause needs further investigation.
492+ When this occurs, you will need to redeploy the daemons:
493+
494+ List the daemons on the host:
495+
496+ .. code-block:: console
497+
498+ ceph orch ps <hostname>
499+
500+
501+ Redeploy the daemons, one at a time. It is recommended that you start with
502+ the crash daemon, as this will have the least impact if unexpected issues
503+ occur.
504+
505+ .. code-block:: console
506+
507+ ceph orch daemon redeploy <daemon name>
508+
509+
469510 - Commands starting with ``ceph`` are all run on the cephadm bootstrap
470511 host in a cephadm shell unless stated otherwise.
471512
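The redeploy steps added in the Storage hunk above (list the daemons on a host, then redeploy them one at a time, starting with the crash daemon) could be scripted roughly as follows. This is a minimal sketch, not part of the change itself. It assumes it runs inside a cephadm shell, that the first column of the ``ceph orch ps <hostname>`` table is the daemon name, and that crash daemons are named ``crash.<something>``. The function names ``order_daemons`` and ``redeploy_host`` and the ``PAUSE_SECONDS`` variable are hypothetical, introduced only for illustration.

```shell
#!/usr/bin/env bash
# Sketch only: redeploy all Ceph daemons on one host, crash daemon first.
# Assumes a cephadm shell; helper names and PAUSE_SECONDS are hypothetical.

# Read daemon names on stdin (one per line) and print them with any
# crash.* daemons first, keeping the other daemons in their listed order.
order_daemons() {
    awk 'NF { if ($0 ~ /^crash\./) first[++f] = $0; else rest[++r] = $0 }
         END {
             for (i = 1; i <= f; i++) print first[i]
             for (i = 1; i <= r; i++) print rest[i]
         }'
}

# List the daemons on the given host and redeploy them one at a time.
redeploy_host() {
    local host="$1"
    # Skip the table header row; the first column is assumed to be the name.
    ceph orch ps "$host" | awk 'NR > 1 { print $1 }' | order_daemons |
    while read -r daemon; do
        echo "Redeploying ${daemon}"
        ceph orch daemon redeploy "$daemon"
        # Leave time to check cluster health before the next daemon.
        sleep "${PAUSE_SECONDS:-60}"
    done
}

# Only act when a hostname is passed, so the file can be sourced safely.
if [ "$#" -ge 1 ]; then
    redeploy_host "$1"
fi
```

The fixed pause is a simplification. In practice you would probably check ``ceph -s`` or watch ``ceph -W cephadm`` between daemons, and stop if the crash daemon does not come back cleanly.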