@@ -159,13 +159,41 @@ The following plugins from CodePlay are supported:
159159.. _codeplay_nv_plugin: https://developer.codeplay.com/products/oneapi/nvidia/
160160.. _codeplay_amd_plugin: https://developer.codeplay.com/products/oneapi/amd/
161161
162- ``dpctl`` can be built for CUDA devices as follows:
162+ Builds for CUDA and AMD devices internally use SYCL alias targets that are passed to the compiler.
163+ A full list of SYCL alias targets is given in the
164+ `DPC++ Compiler User Manual <https://intel.github.io/llvm/UsersManual.html>`_.
165+
166+ CUDA build
167+ ~~~~~~~~~~
168+
169+ ``dpctl`` can be built for CUDA devices using the ``DPCTL_TARGET_CUDA`` CMake option,
170+ which accepts a specific compute architecture string:
171+
172+ .. code-block:: bash
173+
174+ python scripts/build_locally.py --verbose --cmake-opts="-DDPCTL_TARGET_CUDA=sm_80"
175+
176+ To use the default architecture (``sm_50``),
177+ set ``DPCTL_TARGET_CUDA`` to a value such as ``ON``, ``TRUE``, ``YES``, ``Y``, or ``1``:
163178
164179.. code-block:: bash
165180
166181 python scripts/build_locally.py --verbose --cmake-opts="-DDPCTL_TARGET_CUDA=ON"
167182
168- And for AMD devices
183+ With this setting, kernels are built for the default architecture (``sm_50``), which lets them
184+ run on a wider range of devices but prevents the use of more recent CUDA features.
185+
186+ For reference, compute architecture strings like ``sm_80`` correspond to specific
187+ CUDA Compute Capabilities (e.g., Compute Capability 8.0 corresponds to ``sm_80``).
188+ A complete mapping between NVIDIA GPU models and their respective
189+ Compute Capabilities can be found in the official
190+ `CUDA GPU Compute Capability <https://developer.nvidia.com/cuda-gpus>`_ documentation.
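
The correspondence above can be sketched in shell. This is only an illustration:
``cc_to_sm`` is a hypothetical helper, not part of ``dpctl``, and the example value
``8.0`` is assumed rather than queried from a device.

.. code-block:: bash

    # Hypothetical helper (not part of dpctl): convert a Compute Capability
    # such as "8.0" into the architecture string "sm_80" by dropping the dot.
    cc_to_sm() {
        printf 'sm_%s\n' "$(printf '%s' "$1" | tr -d '.')"
    }

    # Assumed example value; look up the real one for your GPU.
    arch="$(cc_to_sm 8.0)"

    # Print the CMake option that would be passed via --cmake-opts.
    echo "-DDPCTL_TARGET_CUDA=${arch}"

The printed option can then be supplied to ``scripts/build_locally.py`` through
``--cmake-opts``, as in the command above.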
191+
192+ AMD build
193+ ~~~~~~~~~
194+
195+ ``dpctl`` can be built for AMD devices using the ``DPCTL_TARGET_HIP`` CMake option,
196+ which requires specifying a compute architecture string:
169197
170198.. code-block:: bash
171199
@@ -174,8 +202,13 @@ And for AMD devices
174202 Note that the `oneAPI for AMD GPUs` plugin requires that the architecture be specified, and only
175203one architecture can be specified at a time.
176204
177- It is, however, possible to build for Intel devices, CUDA devices, and an AMD device
178- architecture all at once:
205+ Multi-target build
206+ ~~~~~~~~~~~~~~~~~~
207+
208+ By default, building ``dpctl`` from source enables support for Intel devices only.
209+ Specifying a custom SYCL target additionally enables support for CUDA or AMD
210+ devices in ``dpctl``. Support for both CUDA and AMD devices can also be enabled
211+ in the same build:
179212
180213.. code-block:: bash
181214