|
| 1 | +:_mod-docs-content-type: ASSEMBLY |
| 2 | +[id="das-about-dynamic-accelerator-slicer-operator"] |
| 3 | += Dynamic Accelerator Slicer (DAS) Operator |
| 4 | +include::_attributes/common-attributes.adoc[] |
| 5 | +:context: das-about-dynamic-accelerator-slicer-operator |
| 6 | + |
| 7 | +toc::[] |
| 8 | + |
| 9 | +:FeatureName: Dynamic Accelerator Slicer Operator |
| 10 | + |
| 11 | +include::snippets/technology-preview.adoc[] |
| 12 | + |
| 13 | +The Dynamic Accelerator Slicer (DAS) Operator dynamically slices GPU accelerators in {product-title}, instead of relying on static slices that are defined when the node boots. GPUs are partitioned according to the demands of each workload, which improves resource utilization. |
| 14 | + |
| 15 | +Dynamic slicing is useful if you do not know in advance which accelerator partitions are needed on each node in the cluster. |
| 16 | + |
| 17 | +The DAS Operator currently includes a reference implementation for NVIDIA Multi-Instance GPU (MIG) and is designed to support additional technologies in the future, such as NVIDIA Multi-Process Service (MPS) or GPUs from other vendors. |
| 18 | + |
| 19 | +.Limitations |
| 20 | + |
| 21 | +The following limitations apply when using the Dynamic Accelerator Slicer Operator: |
| 22 | + |
| 23 | + * You must identify potential incompatibilities and verify that the Operator works with the GPU drivers and operating systems in your environment. |
| 24 | + |
| 25 | + * The Operator works only with specific MIG-compatible NVIDIA GPUs, such as the H100 and A100, and their drivers. |
| 26 | + |
| 27 | + * The Operator manages all of the GPUs on a node. You cannot restrict it to a subset of the GPUs on a node. |
| 28 | + |
| 29 | + * The NVIDIA device plugin cannot be used together with the Dynamic Accelerator Slicer Operator to manage the GPU resources of a cluster. |
| 30 | + |
| 31 | +[NOTE] |
| 32 | +==== |
| 33 | +The DAS Operator is designed to work with MIG-enabled GPUs and allocates MIG slices instead of whole GPUs. After you install the DAS Operator, you cannot allocate an entire GPU with a standard NVIDIA device plugin resource request, such as `nvidia.com/gpu: "1"`. |
| 34 | +==== |
| 35 | + |
| 36 | +//Installing the Dynamic Accelerator Slicer Operator |
| 37 | +include::modules/das-operator-installing.adoc[leveloffset=+1] |
| 38 | + |
| 39 | +//Installing the Dynamic Accelerator Slicer Operator using the web console |
| 40 | +include::modules/das-operator-installing-web-console.adoc[leveloffset=+2] |
| 41 | +[role="_additional-resources"] |
| 42 | +.Additional resources |
| 43 | +* xref:../security/cert_manager_operator/cert-manager-operator-install.adoc#cert-manager-operator-install[{cert-manager-operator}] |
| 44 | +* xref:../hardware_enablement/psap-node-feature-discovery-operator.adoc#psap-node-feature-discovery-operator[Node Feature Discovery (NFD) Operator] |
| 45 | +* link:https://docs.nvidia.com/datacenter/cloud-native/openshift/latest/index.html[NVIDIA GPU Operator] |
| 46 | + |
| 47 | +* link:https://docs.redhat.com/en/documentation/openshift_container_platform/4.19/html/specialized_hardware_and_driver_enablement/psap-node-feature-discovery-operator#creating-nfd-cr-web-console_psap-node-feature-discovery-operator[NodeFeatureDiscovery CR] |
| 48 | + |
| 49 | +//Installing the Dynamic Accelerator Slicer Operator using the CLI |
| 50 | +include::modules/das-operator-installing-cli.adoc[leveloffset=+2] |
| 51 | +[role="_additional-resources"] |
| 52 | +.Additional resources |
| 53 | +* xref:../security/cert_manager_operator/cert-manager-operator-install.adoc#cert-manager-operator-install[{cert-manager-operator}] |
| 54 | +* xref:../hardware_enablement/psap-node-feature-discovery-operator.adoc#psap-node-feature-discovery-operator[Node Feature Discovery (NFD) Operator] |
| 55 | +* link:https://docs.nvidia.com/datacenter/cloud-native/openshift/latest/index.html[NVIDIA GPU Operator] |
| 56 | +* link:https://docs.redhat.com/en/documentation/openshift_container_platform/4.19/html/specialized_hardware_and_driver_enablement/psap-node-feature-discovery-operator#creating-nfd-cr-cli_psap-node-feature-discovery-operator[NodeFeatureDiscovery CR] |
| 57 | + |
| 58 | +//Uninstalling the Dynamic Accelerator Slicer Operator |
| 59 | +include::modules/das-operator-uninstalling.adoc[leveloffset=+1] |
| 60 | + |
| 61 | +//Uninstalling the Dynamic Accelerator Slicer Operator using the web console |
| 62 | +include::modules/das-operator-uninstalling-web-console.adoc[leveloffset=+2] |
| 63 | + |
| 64 | +//Uninstalling the Dynamic Accelerator Slicer Operator using the CLI |
| 65 | +include::modules/das-operator-uninstalling-cli.adoc[leveloffset=+2] |
| 66 | + |
| 67 | +//Deploying GPU workloads with the Dynamic Accelerator Slicer Operator |
| 68 | +include::modules/das-operator-deploying-workloads.adoc[leveloffset=+1] |
| 69 | + |
| 70 | +//Troubleshooting DAS Operator |
| 71 | +include::modules/das-operator-troubleshooting.adoc[leveloffset=+1] |
| 72 | + |
| 73 | +[role="_additional-resources"] |
| 74 | +.Additional resources |
| 75 | +* link:https://github.com/kubernetes/kubernetes/issues/128043[Kubernetes issue #128043] |
| 76 | +* xref:../hardware_enablement/psap-node-feature-discovery-operator.adoc#psap-node-feature-discovery-operator[Node Feature Discovery (NFD) Operator] |
| 77 | +* link:https://docs.nvidia.com/datacenter/cloud-native/gpu-operator/latest/troubleshooting.html[NVIDIA GPU Operator troubleshooting] |
| 78 | + |
| 79 | + |
| 80 | + |
| 81 | + |